Re: [GIT PULL] overlay filesystem v25
On Tue, Oct 28, 2014 at 10:55:13PM +, Al Viro wrote: > On Mon, Oct 27, 2014 at 09:11:27PM -0700, Paul E. McKenney wrote: > > On Tue, Oct 28, 2014 at 01:12:14AM +, Al Viro wrote: > > > On Mon, Oct 27, 2014 at 10:36:21AM -0700, Paul E. McKenney wrote: > > > > Code making direct use of smp_read_barrier_depends() is harder to read, > > > > in my experience, but good point on the sparse noise. Maybe a new > > > > lockless_dereference() primitive? Maybe something like the following? > > > > (Untested, probably does not even build.) > > > > > > > > #define lockless_dereference(p) \ > > > > ({ \ > > > > typeof(*p) *_p1 = ACCESS_ONCE(p); \ > > > > smp_read_barrier_depends(); /* Dependency order vs. p above. */ > > > > \ > > > > _p1; \ > > > > }) > > > > > > Hmm... Where would you prefer to put it? rcupdate.h? > > > > Good a place as any, I guess. Please see patch below. Left to myself, > > I would send this along for the next merge window, but please let me > > know if you would like it sooner. > > It's needed sooner, unfortunately. Guys, could you take a look at > vfs.git#for-linus and comment? The version in your tree looks good. I will drop my commit in favor of yours. The use of lockless_dereference() and smp_mb__before_spinlock() in d45f00ae43 (overlayfs: barriers for opening upper-layer directory) looks fine to me. Only nit is lack of space between "=" and lockless_dereference(). Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 09:11:27PM -0700, Paul E. McKenney wrote: > On Tue, Oct 28, 2014 at 01:12:14AM +, Al Viro wrote: > > On Mon, Oct 27, 2014 at 10:36:21AM -0700, Paul E. McKenney wrote: > > > Code making direct use of smp_read_barrier_depends() is harder to read, > > > in my experience, but good point on the sparse noise. Maybe a new > > > lockless_dereference() primitive? Maybe something like the following? > > > (Untested, probably does not even build.) > > > > > > #define lockless_dereference(p) \ > > > ({ \ > > > typeof(*p) *_p1 = ACCESS_ONCE(p); \ > > > smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ > > > _p1; \ > > > }) > > > > Hmm... Where would you prefer to put it? rcupdate.h? > > Good a place as any, I guess. Please see patch below. Left to myself, > I would send this along for the next merge window, but please let me > know if you would like it sooner. It's needed sooner, unfortunately. Guys, could you take a look at vfs.git#for-linus and comment? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 09:11:27PM -0700, Paul E. McKenney wrote: On Tue, Oct 28, 2014 at 01:12:14AM +, Al Viro wrote: On Mon, Oct 27, 2014 at 10:36:21AM -0700, Paul E. McKenney wrote: Code making direct use of smp_read_barrier_depends() is harder to read, in my experience, but good point on the sparse noise. Maybe a new lockless_dereference() primitive? Maybe something like the following? (Untested, probably does not even build.) #define lockless_dereference(p) \ ({ \ typeof(*p) *_p1 = ACCESS_ONCE(p); \ smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ _p1; \ }) Hmm... Where would you prefer to put it? rcupdate.h? Good a place as any, I guess. Please see patch below. Left to myself, I would send this along for the next merge window, but please let me know if you would like it sooner. It's needed sooner, unfortunately. Guys, could you take a look at vfs.git#for-linus and comment? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Tue, Oct 28, 2014 at 10:55:13PM +, Al Viro wrote: On Mon, Oct 27, 2014 at 09:11:27PM -0700, Paul E. McKenney wrote: On Tue, Oct 28, 2014 at 01:12:14AM +, Al Viro wrote: On Mon, Oct 27, 2014 at 10:36:21AM -0700, Paul E. McKenney wrote: Code making direct use of smp_read_barrier_depends() is harder to read, in my experience, but good point on the sparse noise. Maybe a new lockless_dereference() primitive? Maybe something like the following? (Untested, probably does not even build.) #define lockless_dereference(p) \ ({ \ typeof(*p) *_p1 = ACCESS_ONCE(p); \ smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ _p1; \ }) Hmm... Where would you prefer to put it? rcupdate.h? Good a place as any, I guess. Please see patch below. Left to myself, I would send this along for the next merge window, but please let me know if you would like it sooner. It's needed sooner, unfortunately. Guys, could you take a look at vfs.git#for-linus and comment? The version in your tree looks good. I will drop my commit in favor of yours. The use of lockless_dereference() and smp_mb__before_spinlock() in d45f00ae43 (overlayfs: barriers for opening upper-layer directory) looks fine to me. Only nit is lack of space between = and lockless_dereference(). Thanx, Paul -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Tue, Oct 28, 2014 at 01:12:14AM +, Al Viro wrote: > On Mon, Oct 27, 2014 at 10:36:21AM -0700, Paul E. McKenney wrote: > > Code making direct use of smp_read_barrier_depends() is harder to read, > > in my experience, but good point on the sparse noise. Maybe a new > > lockless_dereference() primitive? Maybe something like the following? > > (Untested, probably does not even build.) > > > > #define lockless_dereference(p) \ > > ({ \ > > typeof(*p) *_p1 = ACCESS_ONCE(p); \ > > smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ > > _p1; \ > > }) > > Hmm... Where would you prefer to put it? rcupdate.h? Good a place as any, I guess. Please see patch below. Left to myself, I would send this along for the next merge window, but please let me know if you would like it sooner. Thanx, Paul rcu: Provide counterpart to rcu_dereference() for non-RCU situations Although rcu_dereference() and friends can be used in situations where object lifetimes are being managed by something other than RCU, the resulting sparse and lockdep-RCU noise can be annoying. This commit therefore supplies a lockless_dereference(), which provides the protection for dereferences without the RCU-related debugging noise. Reported-by: Al Viro Signed-off-by: Paul E. McKenney diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 9575c2d403b5..ed4f5939a452 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -617,6 +617,21 @@ static inline void rcu_preempt_sleep_check(void) #define RCU_INITIALIZER(v) (typeof(*(v)) __force __rcu *)(v) /** + * lockless_dereference() - safely load a pointer for later dereference + * @p: The pointer to load + * + * Similar to rcu_dereference(), but for situations where the pointed-to + * object's lifetime is managed by something other than RCU. That + * "something other" might be reference counting or simple immortality. + */ +#define lockless_dereference(p) \ +({ \ + typeof(p) _p1 = ACCESS_ONCE(p); \ + smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ + (_p1); \ +}) + +/** * rcu_assign_pointer() - assign to RCU-protected pointer * @p: pointer to assign to * @v: value to assign (publish) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 10:36:21AM -0700, Paul E. McKenney wrote: > Code making direct use of smp_read_barrier_depends() is harder to read, > in my experience, but good point on the sparse noise. Maybe a new > lockless_dereference() primitive? Maybe something like the following? > (Untested, probably does not even build.) > > #define lockless_dereference(p) \ > ({ \ > typeof(*p) *_p1 = ACCESS_ONCE(p); \ > smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ > _p1; \ > }) Hmm... Where would you prefer to put it? rcupdate.h? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 05:28:01PM +, Al Viro wrote: > On Mon, Oct 27, 2014 at 08:59:01AM -0700, Paul E. McKenney wrote: > > > Indeed, life is hard here. Keep in mind that lock acquisition is not > > guaranteed to prevent prior operations from being reordered into the > > critical section, possibly as follows: > > > > CPU1: > > grab lock > > if (!global) > > global = p; > > /* Assume all of CPU2's accesses happen here. */ > > p->foo = 1; > > A bit of context: p is a local pointer to struct file here and alloc is > opening it... OK, good to know. ;-) > > This clearly allows CPU2 to execute as follows: > > > > CPU2: > > p = global; /* gets p */ > > if (p) /* non-NULL */ > > q = p->foo; /* might not be 1 */ > > > > Not only that, on DEC Alpha, even if CPU1's accesses are ordered, CPU2's > > accesses can be misordered. You need rcu_dereference() or the combination > > of ACCESS_ONCE() and smp_read_barrier_depends() to avoid this issue. > > As always, see http://www.openvms.compaq.com/wizard/wiz_2637.html for > > more info. > > > > So no, there is no guarantee. I am assuming that the lock grabbed by > > CPU1 guards all assignments to "global", otherwise the code needs further > > help. I am further assuming that the memory pointed to by CPU1's "p" > > is inaccessible to any other CPU, as in CPU1 just allocated the memory. > > Otherwise, the assignment "p->foo = 1" is questionable. And finally, > > I am assuming that p->foo stays constant once it has been made > > accessible to readers. > > > > But the following should work: > > > > CPU1: > > p->foo = 1; /* Assumes p is local. */ > > smp_mb__before_spinlock(); > > grab lock > > if (!global) /* Assumes lock protects all assignments to global. */ > > global = p; > > > > CPU2: > > p = rcu_dereference(global); > > if (p) > > q = p->foo; /* Assumes p->foo constant once visible to readers. */ > > /* Also assumes p and q are local. */ > > > > If the assumptions called out in the comments do not hold, you at least > > need ACCESS_ONCE(), and perhaps even more synchronization. For more info > > on ACCESS_ONCE(), Jon's LWN article is at http://lwn.net/Articles/508991/. > > First of all, this "p->foo = 1" is a shorthand for initialization of > struct file done by opening a file. What you are saying is that it > can leak past the point where we stick a pointer to freshly opened > file into a place where another CPU can see it, but not past the > barrier in dropping the lock, right? Exactly! I should also add that smp_mb__before_spinlock() implies smp_wmb(), but nothing more. But that is OK because smp_wmb() suffices in this case. > And you are suggesting rcu_dereference() as a way to bring the required > barriers in. Ouch... The names are really bad, but there's another > fun issue - rcu_dereference brings in sparse noise. Wouldn't direct use > of smp_read_barrier_depends() be cleaner, anyway? Code making direct use of smp_read_barrier_depends() is harder to read, in my experience, but good point on the sparse noise. Maybe a new lockless_dereference() primitive? Maybe something like the following? (Untested, probably does not even build.) #define lockless_dereference(p) \ ({ \ typeof(*p) *_p1 = ACCESS_ONCE(p); \ smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ _p1; \ }) Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 08:59:01AM -0700, Paul E. McKenney wrote: > Indeed, life is hard here. Keep in mind that lock acquisition is not > guaranteed to prevent prior operations from being reordered into the > critical section, possibly as follows: > > CPU1: > grab lock > if (!global) > global = p; > /* Assume all of CPU2's accesses happen here. */ > p->foo = 1; A bit of context: p is a local pointer to struct file here and alloc is opening it... > This clearly allows CPU2 to execute as follows: > > CPU2: > p = global; /* gets p */ > if (p) /* non-NULL */ > q = p->foo; /* might not be 1 */ > > Not only that, on DEC Alpha, even if CPU1's accesses are ordered, CPU2's > accesses can be misordered. You need rcu_dereference() or the combination > of ACCESS_ONCE() and smp_read_barrier_depends() to avoid this issue. > As always, see http://www.openvms.compaq.com/wizard/wiz_2637.html for > more info. > > So no, there is no guarantee. I am assuming that the lock grabbed by > CPU1 guards all assignments to "global", otherwise the code needs further > help. I am further assuming that the memory pointed to by CPU1's "p" > is inaccessible to any other CPU, as in CPU1 just allocated the memory. > Otherwise, the assignment "p->foo = 1" is questionable. And finally, > I am assuming that p->foo stays constant once it has been made > accessible to readers. > > But the following should work: > > CPU1: > p->foo = 1; /* Assumes p is local. */ > smp_mb__before_spinlock(); > grab lock > if (!global) /* Assumes lock protects all assignments to global. */ > global = p; > > CPU2: > p = rcu_dereference(global); > if (p) >q = p->foo; /* Assumes p->foo constant once visible to readers. */ >/* Also assumes p and q are local. */ > > If the assumptions called out in the comments do not hold, you at least > need ACCESS_ONCE(), and perhaps even more synchronization. For more info > on ACCESS_ONCE(), Jon's LWN article is at http://lwn.net/Articles/508991/. First of all, this "p->foo = 1" is a shorthand for initialization of struct file done by opening a file. What you are saying is that it can leak past the point where we stick a pointer to freshly opened file into a place where another CPU can see it, but not past the barrier in dropping the lock, right? And you are suggesting rcu_dereference() as a way to bring the required barriers in. Ouch... The names are really bad, but there's another fun issue - rcu_dereference brings in sparse noise. Wouldn't direct use of smp_read_barrier_depends() be cleaner, anyway? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 09:06:54AM +0100, Miklos Szeredi wrote: > [Paul McKenney added to CC] > > On Sat, Oct 25, 2014 at 7:06 PM, Al Viro wrote: > > On Sat, Oct 25, 2014 at 11:53:52AM +0200, Miklos Szeredi wrote: > > > >> Yes, but it's not about race with copy-up (which the ovl_path_upper() > >> protects against), but race of two fsync calls with each other. If > >> there's no synchronization between them, then that od->upperfile does > >> indeed count as lockless access, no matter that the assignment was > >> done under lock. > > > > p = global; > > if (!p) { // outside of lock > > p = alloc(); > > grab lock > > if (!global) { > > global = p; > > } else { > > destroy(p); > > p = global; > > } > > drop lock > > } > > is a very common pattern, especially if you look for cases when lock is > > a spinlock and allocation is blocking (in those cases you'll often see > > destroy() part done after dropping the lock; that's where what I fucked up > > in > > what I'd originally pushed. And it wasn't even needed - fput() under > > ->i_mutex is OK...) > > Being a very common pattern does not automatically make it correct... > > My understanding of these issues is very limited, but it's not clear > to me what will order initialization of members of p with the storing > of p into global. E.g. we start out with global == NULL and p->foo == > 0. > > CPU1: > p->foo = 1 > grab lock > if (!global) > global = p > > CPU1: If it is all the same to you, I will call this CPU2 to distinguish it from the first CPU1. ;-) > p = global > if (p) > q = p->foo > > Is it guaranteed that the above sequence (as is, without any barriers > or ACCESS_ONCE() other than the lock acquisition) will result in q == > 1 if p != NULL? Indeed, life is hard here. Keep in mind that lock acquisition is not guaranteed to prevent prior operations from being reordered into the critical section, possibly as follows: CPU1: grab lock if (!global) global = p; /* Assume all of CPU2's accesses happen here. */ p->foo = 1; This clearly allows CPU2 to execute as follows: CPU2: p = global; /* gets p */ if (p) /* non-NULL */ q = p->foo; /* might not be 1 */ Not only that, on DEC Alpha, even if CPU1's accesses are ordered, CPU2's accesses can be misordered. You need rcu_dereference() or the combination of ACCESS_ONCE() and smp_read_barrier_depends() to avoid this issue. As always, see http://www.openvms.compaq.com/wizard/wiz_2637.html for more info. So no, there is no guarantee. I am assuming that the lock grabbed by CPU1 guards all assignments to "global", otherwise the code needs further help. I am further assuming that the memory pointed to by CPU1's "p" is inaccessible to any other CPU, as in CPU1 just allocated the memory. Otherwise, the assignment "p->foo = 1" is questionable. And finally, I am assuming that p->foo stays constant once it has been made accessible to readers. But the following should work: CPU1: p->foo = 1; /* Assumes p is local. */ smp_mb__before_spinlock(); grab lock if (!global) /* Assumes lock protects all assignments to global. */ global = p; CPU2: p = rcu_dereference(global); if (p) q = p->foo; /* Assumes p->foo constant once visible to readers. */ /* Also assumes p and q are local. */ If the assumptions called out in the comments do not hold, you at least need ACCESS_ONCE(), and perhaps even more synchronization. For more info on ACCESS_ONCE(), Jon's LWN article is at http://lwn.net/Articles/508991/. Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
[Paul McKenney added to CC] On Sat, Oct 25, 2014 at 7:06 PM, Al Viro wrote: > On Sat, Oct 25, 2014 at 11:53:52AM +0200, Miklos Szeredi wrote: > >> Yes, but it's not about race with copy-up (which the ovl_path_upper() >> protects against), but race of two fsync calls with each other. If >> there's no synchronization between them, then that od->upperfile does >> indeed count as lockless access, no matter that the assignment was >> done under lock. > > p = global; > if (!p) { // outside of lock > p = alloc(); > grab lock > if (!global) { > global = p; > } else { > destroy(p); > p = global; > } > drop lock > } > is a very common pattern, especially if you look for cases when lock is > a spinlock and allocation is blocking (in those cases you'll often see > destroy() part done after dropping the lock; that's where what I fucked up in > what I'd originally pushed. And it wasn't even needed - fput() under > ->i_mutex is OK...) Being a very common pattern does not automatically make it correct... My understanding of these issues is very limited, but it's not clear to me what will order initialization of members of p with the storing of p into global. E.g. we start out with global == NULL and p->foo == 0. CPU1: p->foo = 1 grab lock if (!global) global = p CPU1: p = global if (p) q = p->foo Is it guaranteed that the above sequence (as is, without any barriers or ACCESS_ONCE() other than the lock acquisition) will result in q == 1 if p != NULL? Thanks, Miklos -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
[Paul McKenney added to CC] On Sat, Oct 25, 2014 at 7:06 PM, Al Viro v...@zeniv.linux.org.uk wrote: On Sat, Oct 25, 2014 at 11:53:52AM +0200, Miklos Szeredi wrote: Yes, but it's not about race with copy-up (which the ovl_path_upper() protects against), but race of two fsync calls with each other. If there's no synchronization between them, then that od-upperfile does indeed count as lockless access, no matter that the assignment was done under lock. p = global; if (!p) { // outside of lock p = alloc(); grab lock if (!global) { global = p; } else { destroy(p); p = global; } drop lock } is a very common pattern, especially if you look for cases when lock is a spinlock and allocation is blocking (in those cases you'll often see destroy() part done after dropping the lock; that's where what I fucked up in what I'd originally pushed. And it wasn't even needed - fput() under -i_mutex is OK...) Being a very common pattern does not automatically make it correct... My understanding of these issues is very limited, but it's not clear to me what will order initialization of members of p with the storing of p into global. E.g. we start out with global == NULL and p-foo == 0. CPU1: p-foo = 1 grab lock if (!global) global = p CPU1: p = global if (p) q = p-foo Is it guaranteed that the above sequence (as is, without any barriers or ACCESS_ONCE() other than the lock acquisition) will result in q == 1 if p != NULL? Thanks, Miklos -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 09:06:54AM +0100, Miklos Szeredi wrote: [Paul McKenney added to CC] On Sat, Oct 25, 2014 at 7:06 PM, Al Viro v...@zeniv.linux.org.uk wrote: On Sat, Oct 25, 2014 at 11:53:52AM +0200, Miklos Szeredi wrote: Yes, but it's not about race with copy-up (which the ovl_path_upper() protects against), but race of two fsync calls with each other. If there's no synchronization between them, then that od-upperfile does indeed count as lockless access, no matter that the assignment was done under lock. p = global; if (!p) { // outside of lock p = alloc(); grab lock if (!global) { global = p; } else { destroy(p); p = global; } drop lock } is a very common pattern, especially if you look for cases when lock is a spinlock and allocation is blocking (in those cases you'll often see destroy() part done after dropping the lock; that's where what I fucked up in what I'd originally pushed. And it wasn't even needed - fput() under -i_mutex is OK...) Being a very common pattern does not automatically make it correct... My understanding of these issues is very limited, but it's not clear to me what will order initialization of members of p with the storing of p into global. E.g. we start out with global == NULL and p-foo == 0. CPU1: p-foo = 1 grab lock if (!global) global = p CPU1: If it is all the same to you, I will call this CPU2 to distinguish it from the first CPU1. ;-) p = global if (p) q = p-foo Is it guaranteed that the above sequence (as is, without any barriers or ACCESS_ONCE() other than the lock acquisition) will result in q == 1 if p != NULL? Indeed, life is hard here. Keep in mind that lock acquisition is not guaranteed to prevent prior operations from being reordered into the critical section, possibly as follows: CPU1: grab lock if (!global) global = p; /* Assume all of CPU2's accesses happen here. */ p-foo = 1; This clearly allows CPU2 to execute as follows: CPU2: p = global; /* gets p */ if (p) /* non-NULL */ q = p-foo; /* might not be 1 */ Not only that, on DEC Alpha, even if CPU1's accesses are ordered, CPU2's accesses can be misordered. You need rcu_dereference() or the combination of ACCESS_ONCE() and smp_read_barrier_depends() to avoid this issue. As always, see http://www.openvms.compaq.com/wizard/wiz_2637.html for more info. So no, there is no guarantee. I am assuming that the lock grabbed by CPU1 guards all assignments to global, otherwise the code needs further help. I am further assuming that the memory pointed to by CPU1's p is inaccessible to any other CPU, as in CPU1 just allocated the memory. Otherwise, the assignment p-foo = 1 is questionable. And finally, I am assuming that p-foo stays constant once it has been made accessible to readers. But the following should work: CPU1: p-foo = 1; /* Assumes p is local. */ smp_mb__before_spinlock(); grab lock if (!global) /* Assumes lock protects all assignments to global. */ global = p; CPU2: p = rcu_dereference(global); if (p) q = p-foo; /* Assumes p-foo constant once visible to readers. */ /* Also assumes p and q are local. */ If the assumptions called out in the comments do not hold, you at least need ACCESS_ONCE(), and perhaps even more synchronization. For more info on ACCESS_ONCE(), Jon's LWN article is at http://lwn.net/Articles/508991/. Thanx, Paul -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 08:59:01AM -0700, Paul E. McKenney wrote: Indeed, life is hard here. Keep in mind that lock acquisition is not guaranteed to prevent prior operations from being reordered into the critical section, possibly as follows: CPU1: grab lock if (!global) global = p; /* Assume all of CPU2's accesses happen here. */ p-foo = 1; A bit of context: p is a local pointer to struct file here and alloc is opening it... This clearly allows CPU2 to execute as follows: CPU2: p = global; /* gets p */ if (p) /* non-NULL */ q = p-foo; /* might not be 1 */ Not only that, on DEC Alpha, even if CPU1's accesses are ordered, CPU2's accesses can be misordered. You need rcu_dereference() or the combination of ACCESS_ONCE() and smp_read_barrier_depends() to avoid this issue. As always, see http://www.openvms.compaq.com/wizard/wiz_2637.html for more info. So no, there is no guarantee. I am assuming that the lock grabbed by CPU1 guards all assignments to global, otherwise the code needs further help. I am further assuming that the memory pointed to by CPU1's p is inaccessible to any other CPU, as in CPU1 just allocated the memory. Otherwise, the assignment p-foo = 1 is questionable. And finally, I am assuming that p-foo stays constant once it has been made accessible to readers. But the following should work: CPU1: p-foo = 1; /* Assumes p is local. */ smp_mb__before_spinlock(); grab lock if (!global) /* Assumes lock protects all assignments to global. */ global = p; CPU2: p = rcu_dereference(global); if (p) q = p-foo; /* Assumes p-foo constant once visible to readers. */ /* Also assumes p and q are local. */ If the assumptions called out in the comments do not hold, you at least need ACCESS_ONCE(), and perhaps even more synchronization. For more info on ACCESS_ONCE(), Jon's LWN article is at http://lwn.net/Articles/508991/. First of all, this p-foo = 1 is a shorthand for initialization of struct file done by opening a file. What you are saying is that it can leak past the point where we stick a pointer to freshly opened file into a place where another CPU can see it, but not past the barrier in dropping the lock, right? And you are suggesting rcu_dereference() as a way to bring the required barriers in. Ouch... The names are really bad, but there's another fun issue - rcu_dereference brings in sparse noise. Wouldn't direct use of smp_read_barrier_depends() be cleaner, anyway? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 05:28:01PM +, Al Viro wrote: On Mon, Oct 27, 2014 at 08:59:01AM -0700, Paul E. McKenney wrote: Indeed, life is hard here. Keep in mind that lock acquisition is not guaranteed to prevent prior operations from being reordered into the critical section, possibly as follows: CPU1: grab lock if (!global) global = p; /* Assume all of CPU2's accesses happen here. */ p-foo = 1; A bit of context: p is a local pointer to struct file here and alloc is opening it... OK, good to know. ;-) This clearly allows CPU2 to execute as follows: CPU2: p = global; /* gets p */ if (p) /* non-NULL */ q = p-foo; /* might not be 1 */ Not only that, on DEC Alpha, even if CPU1's accesses are ordered, CPU2's accesses can be misordered. You need rcu_dereference() or the combination of ACCESS_ONCE() and smp_read_barrier_depends() to avoid this issue. As always, see http://www.openvms.compaq.com/wizard/wiz_2637.html for more info. So no, there is no guarantee. I am assuming that the lock grabbed by CPU1 guards all assignments to global, otherwise the code needs further help. I am further assuming that the memory pointed to by CPU1's p is inaccessible to any other CPU, as in CPU1 just allocated the memory. Otherwise, the assignment p-foo = 1 is questionable. And finally, I am assuming that p-foo stays constant once it has been made accessible to readers. But the following should work: CPU1: p-foo = 1; /* Assumes p is local. */ smp_mb__before_spinlock(); grab lock if (!global) /* Assumes lock protects all assignments to global. */ global = p; CPU2: p = rcu_dereference(global); if (p) q = p-foo; /* Assumes p-foo constant once visible to readers. */ /* Also assumes p and q are local. */ If the assumptions called out in the comments do not hold, you at least need ACCESS_ONCE(), and perhaps even more synchronization. For more info on ACCESS_ONCE(), Jon's LWN article is at http://lwn.net/Articles/508991/. First of all, this p-foo = 1 is a shorthand for initialization of struct file done by opening a file. What you are saying is that it can leak past the point where we stick a pointer to freshly opened file into a place where another CPU can see it, but not past the barrier in dropping the lock, right? Exactly! I should also add that smp_mb__before_spinlock() implies smp_wmb(), but nothing more. But that is OK because smp_wmb() suffices in this case. And you are suggesting rcu_dereference() as a way to bring the required barriers in. Ouch... The names are really bad, but there's another fun issue - rcu_dereference brings in sparse noise. Wouldn't direct use of smp_read_barrier_depends() be cleaner, anyway? Code making direct use of smp_read_barrier_depends() is harder to read, in my experience, but good point on the sparse noise. Maybe a new lockless_dereference() primitive? Maybe something like the following? (Untested, probably does not even build.) #define lockless_dereference(p) \ ({ \ typeof(*p) *_p1 = ACCESS_ONCE(p); \ smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ _p1; \ }) Thanx, Paul -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Mon, Oct 27, 2014 at 10:36:21AM -0700, Paul E. McKenney wrote: Code making direct use of smp_read_barrier_depends() is harder to read, in my experience, but good point on the sparse noise. Maybe a new lockless_dereference() primitive? Maybe something like the following? (Untested, probably does not even build.) #define lockless_dereference(p) \ ({ \ typeof(*p) *_p1 = ACCESS_ONCE(p); \ smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ _p1; \ }) Hmm... Where would you prefer to put it? rcupdate.h? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Tue, Oct 28, 2014 at 01:12:14AM +, Al Viro wrote: On Mon, Oct 27, 2014 at 10:36:21AM -0700, Paul E. McKenney wrote: Code making direct use of smp_read_barrier_depends() is harder to read, in my experience, but good point on the sparse noise. Maybe a new lockless_dereference() primitive? Maybe something like the following? (Untested, probably does not even build.) #define lockless_dereference(p) \ ({ \ typeof(*p) *_p1 = ACCESS_ONCE(p); \ smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ _p1; \ }) Hmm... Where would you prefer to put it? rcupdate.h? Good a place as any, I guess. Please see patch below. Left to myself, I would send this along for the next merge window, but please let me know if you would like it sooner. Thanx, Paul rcu: Provide counterpart to rcu_dereference() for non-RCU situations Although rcu_dereference() and friends can be used in situations where object lifetimes are being managed by something other than RCU, the resulting sparse and lockdep-RCU noise can be annoying. This commit therefore supplies a lockless_dereference(), which provides the protection for dereferences without the RCU-related debugging noise. Reported-by: Al Viro v...@zeniv.linux.org.uk Signed-off-by: Paul E. McKenney paul...@linux.vnet.ibm.com diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 9575c2d403b5..ed4f5939a452 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -617,6 +617,21 @@ static inline void rcu_preempt_sleep_check(void) #define RCU_INITIALIZER(v) (typeof(*(v)) __force __rcu *)(v) /** + * lockless_dereference() - safely load a pointer for later dereference + * @p: The pointer to load + * + * Similar to rcu_dereference(), but for situations where the pointed-to + * object's lifetime is managed by something other than RCU. That + * something other might be reference counting or simple immortality. + */ +#define lockless_dereference(p) \ +({ \ + typeof(p) _p1 = ACCESS_ONCE(p); \ + smp_read_barrier_depends(); /* Dependency order vs. p above. */ \ + (_p1); \ +}) + +/** * rcu_assign_pointer() - assign to RCU-protected pointer * @p: pointer to assign to * @v: value to assign (publish) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Sat, Oct 25, 2014 at 11:53:52AM +0200, Miklos Szeredi wrote: > Yes, but it's not about race with copy-up (which the ovl_path_upper() > protects against), but race of two fsync calls with each other. If > there's no synchronization between them, then that od->upperfile does > indeed count as lockless access, no matter that the assignment was > done under lock. p = global; if (!p) { // outside of lock p = alloc(); grab lock if (!global) { global = p; } else { destroy(p); p = global; } drop lock } is a very common pattern, especially if you look for cases when lock is a spinlock and allocation is blocking (in those cases you'll often see destroy() part done after dropping the lock; that's where what I fucked up in what I'd originally pushed. And it wasn't even needed - fput() under ->i_mutex is OK...) -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Fri, Oct 24, 2014 at 09:24:45AM +0200, Miklos Szeredi wrote: > The reason I didn't do your "fix" is that it > > - adds more lines than it takes, > > - I wasn't sure at all if the lockless access is actually correct > without the ACCESS_ONCE and all the memory barrier magic that might be > necessary on weird architectures. _What_ lockless accesses? There is an extremely embarrassing bug in that commit, all right, but it has nothing to do with barriers... All barrier-related issues are taken care of by ovl_path_upper() (and without that you'd have tons of worse problems). Fetching ->upperfile outside of ->i_mutex is fine - in the worst case we'll fetch NULL, open the sucker grab ->i_mutex and find out that it has already been taken care of. In which case we fput() what we'd opened and move on (fput() under ->i_mutex is fine - it's going to be delayed until return from syscall anyway). There was a very dumb braino in there; fixed, force-pushed, passes unionmount tests, no regressions on LTP syscall ones and xfstests. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Sat, Oct 25, 2014 at 10:18 AM, Al Viro wrote: > On Fri, Oct 24, 2014 at 09:24:45AM +0200, Miklos Szeredi wrote: > >> The reason I didn't do your "fix" is that it >> >> - adds more lines than it takes, >> >> - I wasn't sure at all if the lockless access is actually correct >> without the ACCESS_ONCE and all the memory barrier magic that might be >> necessary on weird architectures. > > _What_ lockless accesses? There is an extremely embarrassing bug in that > commit, all right, but it has nothing to do with barriers... All > barrier-related issues are taken care of by ovl_path_upper() (and without > that you'd have tons of worse problems). Fetching ->upperfile outside of > ->i_mutex is fine - in the worst case we'll fetch NULL, open the sucker > grab ->i_mutex and find out that it has already been taken care of. > In which case we fput() what we'd opened and move on (fput() under > ->i_mutex is fine - it's going to be delayed until return from syscall > anyway). Yes, but it's not about race with copy-up (which the ovl_path_upper() protects against), but race of two fsync calls with each other. If there's no synchronization between them, then that od->upperfile does indeed count as lockless access, no matter that the assignment was done under lock. Thanks, Miklos -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Sat, Oct 25, 2014 at 10:18 AM, Al Viro v...@zeniv.linux.org.uk wrote: On Fri, Oct 24, 2014 at 09:24:45AM +0200, Miklos Szeredi wrote: The reason I didn't do your fix is that it - adds more lines than it takes, - I wasn't sure at all if the lockless access is actually correct without the ACCESS_ONCE and all the memory barrier magic that might be necessary on weird architectures. _What_ lockless accesses? There is an extremely embarrassing bug in that commit, all right, but it has nothing to do with barriers... All barrier-related issues are taken care of by ovl_path_upper() (and without that you'd have tons of worse problems). Fetching -upperfile outside of -i_mutex is fine - in the worst case we'll fetch NULL, open the sucker grab -i_mutex and find out that it has already been taken care of. In which case we fput() what we'd opened and move on (fput() under -i_mutex is fine - it's going to be delayed until return from syscall anyway). Yes, but it's not about race with copy-up (which the ovl_path_upper() protects against), but race of two fsync calls with each other. If there's no synchronization between them, then that od-upperfile does indeed count as lockless access, no matter that the assignment was done under lock. Thanks, Miklos -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Fri, Oct 24, 2014 at 09:24:45AM +0200, Miklos Szeredi wrote: The reason I didn't do your fix is that it - adds more lines than it takes, - I wasn't sure at all if the lockless access is actually correct without the ACCESS_ONCE and all the memory barrier magic that might be necessary on weird architectures. _What_ lockless accesses? There is an extremely embarrassing bug in that commit, all right, but it has nothing to do with barriers... All barrier-related issues are taken care of by ovl_path_upper() (and without that you'd have tons of worse problems). Fetching -upperfile outside of -i_mutex is fine - in the worst case we'll fetch NULL, open the sucker grab -i_mutex and find out that it has already been taken care of. In which case we fput() what we'd opened and move on (fput() under -i_mutex is fine - it's going to be delayed until return from syscall anyway). There was a very dumb braino in there; fixed, force-pushed, passes unionmount tests, no regressions on LTP syscall ones and xfstests. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Sat, Oct 25, 2014 at 11:53:52AM +0200, Miklos Szeredi wrote: Yes, but it's not about race with copy-up (which the ovl_path_upper() protects against), but race of two fsync calls with each other. If there's no synchronization between them, then that od-upperfile does indeed count as lockless access, no matter that the assignment was done under lock. p = global; if (!p) { // outside of lock p = alloc(); grab lock if (!global) { global = p; } else { destroy(p); p = global; } drop lock } is a very common pattern, especially if you look for cases when lock is a spinlock and allocation is blocking (in those cases you'll often see destroy() part done after dropping the lock; that's where what I fucked up in what I'd originally pushed. And it wasn't even needed - fput() under -i_mutex is OK...) -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
Hi Al, On Fri, Oct 24, 2014 at 5:24 AM, Al Viro wrote: > On Fri, Oct 24, 2014 at 03:20:55AM +0100, Al Viro wrote: >> Why the hell do you hold ->i_mutex across the entire opening of underlying >> directory? All you need is to serialize one assignment; the side that loses >> the race will simply fput() what it opened... >> >> Oh, well - that goes under "weird pessimisations, easy to fix in >> followups"... > > OK, pulled into vfs.git, followups in question added. Also there: fix for > a long-standing leak in d_splice_alias() failure exits. Guys, could you > check that current vfs.git#for-linus survives your local tests? Seems to > survive here; if I don't hear of any problems by tomorrow morning, to Linus > it goes... FWIW, for that pull request stats would be I think you forgot to merge your for-linus branch into for-next? Anyway, as Stephen announced there will be no linux-next release today, I'm afraid you'll have to delay your pull request to Tuesday. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Fri, Oct 24, 2014 at 5:24 AM, Al Viro wrote: > On Fri, Oct 24, 2014 at 03:20:55AM +0100, Al Viro wrote: >> Why the hell do you hold ->i_mutex across the entire opening of underlying >> directory? All you need is to serialize one assignment; the side that loses >> the race will simply fput() what it opened... The reason I didn't do your "fix" is that it - adds more lines than it takes, - I wasn't sure at all if the lockless access is actually correct without the ACCESS_ONCE and all the memory barrier magic that might be necessary on weird architectures. The rest of the changes look OK. Thanks, Miklos -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Fri, Oct 24, 2014 at 5:24 AM, Al Viro v...@zeniv.linux.org.uk wrote: On Fri, Oct 24, 2014 at 03:20:55AM +0100, Al Viro wrote: Why the hell do you hold -i_mutex across the entire opening of underlying directory? All you need is to serialize one assignment; the side that loses the race will simply fput() what it opened... The reason I didn't do your fix is that it - adds more lines than it takes, - I wasn't sure at all if the lockless access is actually correct without the ACCESS_ONCE and all the memory barrier magic that might be necessary on weird architectures. The rest of the changes look OK. Thanks, Miklos -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
Hi Al, On Fri, Oct 24, 2014 at 5:24 AM, Al Viro v...@zeniv.linux.org.uk wrote: On Fri, Oct 24, 2014 at 03:20:55AM +0100, Al Viro wrote: Why the hell do you hold -i_mutex across the entire opening of underlying directory? All you need is to serialize one assignment; the side that loses the race will simply fput() what it opened... Oh, well - that goes under weird pessimisations, easy to fix in followups... OK, pulled into vfs.git, followups in question added. Also there: fix for a long-standing leak in d_splice_alias() failure exits. Guys, could you check that current vfs.git#for-linus survives your local tests? Seems to survive here; if I don't hear of any problems by tomorrow morning, to Linus it goes... FWIW, for that pull request stats would be I think you forgot to merge your for-linus branch into for-next? Anyway, as Stephen announced there will be no linux-next release today, I'm afraid you'll have to delay your pull request to Tuesday. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say programmer or something like that. -- Linus Torvalds -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Fri, Oct 24, 2014 at 03:20:55AM +0100, Al Viro wrote: > Why the hell do you hold ->i_mutex across the entire opening of underlying > directory? All you need is to serialize one assignment; the side that loses > the race will simply fput() what it opened... > > Oh, well - that goes under "weird pessimisations, easy to fix in followups"... OK, pulled into vfs.git, followups in question added. Also there: fix for a long-standing leak in d_splice_alias() failure exits. Guys, could you check that current vfs.git#for-linus survives your local tests? Seems to survive here; if I don't hear of any problems by tomorrow morning, to Linus it goes... FWIW, for that pull request stats would be Shortlog: Al Viro (5): fix inode leaks on d_splice_alias() failure exits overlayfs: don't hold ->i_mutex over opening the real directory overlayfs: make ovl_cache_entry->name an array instead of pointer overlayfs: embed root into overlay_readdir_data overlayfs: embed middle into overlay_readdir_data Andy Whitcroft (1): overlayfs: add statfs support Erez Zadok (1): overlayfs: implement show_options Miklos Szeredi (11): vfs: add i_op->dentry_open() vfs: export do_splice_direct() to modules vfs: export __inode_permission() to modules vfs: introduce clone_private_mount() vfs: export check_sticky() vfs: add whiteout support vfs: add RENAME_WHITEOUT ext4: support RENAME_WHITEOUT shmem: support RENAME_WHITEOUT overlay filesystem fs: limit filesystem stacking depth Neil Brown (1): overlay: overlay filesystem documentation Diffstat: Documentation/filesystems/Locking |2 + Documentation/filesystems/overlayfs.txt | 198 +++ Documentation/filesystems/vfs.txt |7 + MAINTAINERS |7 + fs/Kconfig |1 + fs/Makefile |1 + fs/btrfs/ioctl.c| 20 +- fs/dcache.c |2 + fs/ecryptfs/main.c |7 + fs/ext4/namei.c | 95 +++- fs/internal.h |7 - fs/namei.c | 41 +- fs/namespace.c | 27 + fs/open.c | 23 +- fs/overlayfs/Kconfig| 10 + fs/overlayfs/Makefile |7 + fs/overlayfs/copy_up.c | 414 ++ fs/overlayfs/dir.c | 921 +++ fs/overlayfs/inode.c| 425 ++ fs/overlayfs/overlayfs.h| 191 +++ fs/overlayfs/readdir.c | 589 fs/overlayfs/super.c| 796 ++ fs/splice.c |1 + include/linux/fs.h | 39 ++ include/linux/mount.h |3 + include/uapi/linux/fs.h |1 + mm/shmem.c | 36 +- 27 files changed, 3813 insertions(+), 58 deletions(-) create mode 100644 Documentation/filesystems/overlayfs.txt create mode 100644 fs/overlayfs/Kconfig create mode 100644 fs/overlayfs/Makefile create mode 100644 fs/overlayfs/copy_up.c create mode 100644 fs/overlayfs/dir.c create mode 100644 fs/overlayfs/inode.c create mode 100644 fs/overlayfs/overlayfs.h create mode 100644 fs/overlayfs/readdir.c create mode 100644 fs/overlayfs/super.c -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Fri, Oct 24, 2014 at 01:25:39AM +0200, Miklos Szeredi wrote: > Linus, > > Please pull > > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs.v25 > > Plenty of bugs fixed relative to the previous version, thanks to Al Viro's > review and is now officially be BugFree(TM). Also passes the > unionmount-testsuite by David Howells. *blink* Why the hell do you hold ->i_mutex across the entire opening of underlying directory? All you need is to serialize one assignment; the side that loses the race will simply fput() what it opened... Oh, well - that goes under "weird pessimisations, easy to fix in followups"... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[GIT PULL] overlay filesystem v25
Linus, Please pull git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs.v25 Plenty of bugs fixed relative to the previous version, thanks to Al Viro's review and is now officially be BugFree(TM). Also passes the unionmount-testsuite by David Howells. Thanks, Miklos --- Andy Whitcroft (1): overlayfs: add statfs support Erez Zadok (1): overlayfs: implement show_options Miklos Szeredi (11): vfs: add i_op->dentry_open() vfs: export do_splice_direct() to modules vfs: export __inode_permission() to modules vfs: introduce clone_private_mount() vfs: export check_sticky() vfs: add whiteout support vfs: add RENAME_WHITEOUT ext4: support RENAME_WHITEOUT shmem: support RENAME_WHITEOUT overlay filesystem fs: limit filesystem stacking depth Neil Brown (1): overlay: overlay filesystem documentation --- Documentation/filesystems/Locking | 2 + Documentation/filesystems/overlayfs.txt | 198 +++ Documentation/filesystems/vfs.txt | 7 + MAINTAINERS | 7 + fs/Kconfig | 1 + fs/Makefile | 1 + fs/btrfs/ioctl.c| 20 +- fs/ecryptfs/main.c | 7 + fs/ext4/namei.c | 95 +++- fs/internal.h | 7 - fs/namei.c | 41 +- fs/namespace.c | 27 + fs/open.c | 23 +- fs/overlayfs/Kconfig| 10 + fs/overlayfs/Makefile | 7 + fs/overlayfs/copy_up.c | 414 ++ fs/overlayfs/dir.c | 921 fs/overlayfs/inode.c| 425 +++ fs/overlayfs/overlayfs.h| 191 +++ fs/overlayfs/readdir.c | 587 fs/overlayfs/super.c| 796 +++ fs/splice.c | 1 + include/linux/fs.h | 39 ++ include/linux/mount.h | 3 + include/uapi/linux/fs.h | 1 + mm/shmem.c | 36 +- 26 files changed, 3809 insertions(+), 58 deletions(-) create mode 100644 Documentation/filesystems/overlayfs.txt create mode 100644 fs/overlayfs/Kconfig create mode 100644 fs/overlayfs/Makefile create mode 100644 fs/overlayfs/copy_up.c create mode 100644 fs/overlayfs/dir.c create mode 100644 fs/overlayfs/inode.c create mode 100644 fs/overlayfs/overlayfs.h create mode 100644 fs/overlayfs/readdir.c create mode 100644 fs/overlayfs/super.c -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[GIT PULL] overlay filesystem v25
Linus, Please pull git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs.v25 Plenty of bugs fixed relative to the previous version, thanks to Al Viro's review and is now officially be BugFree(TM). Also passes the unionmount-testsuite by David Howells. Thanks, Miklos --- Andy Whitcroft (1): overlayfs: add statfs support Erez Zadok (1): overlayfs: implement show_options Miklos Szeredi (11): vfs: add i_op-dentry_open() vfs: export do_splice_direct() to modules vfs: export __inode_permission() to modules vfs: introduce clone_private_mount() vfs: export check_sticky() vfs: add whiteout support vfs: add RENAME_WHITEOUT ext4: support RENAME_WHITEOUT shmem: support RENAME_WHITEOUT overlay filesystem fs: limit filesystem stacking depth Neil Brown (1): overlay: overlay filesystem documentation --- Documentation/filesystems/Locking | 2 + Documentation/filesystems/overlayfs.txt | 198 +++ Documentation/filesystems/vfs.txt | 7 + MAINTAINERS | 7 + fs/Kconfig | 1 + fs/Makefile | 1 + fs/btrfs/ioctl.c| 20 +- fs/ecryptfs/main.c | 7 + fs/ext4/namei.c | 95 +++- fs/internal.h | 7 - fs/namei.c | 41 +- fs/namespace.c | 27 + fs/open.c | 23 +- fs/overlayfs/Kconfig| 10 + fs/overlayfs/Makefile | 7 + fs/overlayfs/copy_up.c | 414 ++ fs/overlayfs/dir.c | 921 fs/overlayfs/inode.c| 425 +++ fs/overlayfs/overlayfs.h| 191 +++ fs/overlayfs/readdir.c | 587 fs/overlayfs/super.c| 796 +++ fs/splice.c | 1 + include/linux/fs.h | 39 ++ include/linux/mount.h | 3 + include/uapi/linux/fs.h | 1 + mm/shmem.c | 36 +- 26 files changed, 3809 insertions(+), 58 deletions(-) create mode 100644 Documentation/filesystems/overlayfs.txt create mode 100644 fs/overlayfs/Kconfig create mode 100644 fs/overlayfs/Makefile create mode 100644 fs/overlayfs/copy_up.c create mode 100644 fs/overlayfs/dir.c create mode 100644 fs/overlayfs/inode.c create mode 100644 fs/overlayfs/overlayfs.h create mode 100644 fs/overlayfs/readdir.c create mode 100644 fs/overlayfs/super.c -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Fri, Oct 24, 2014 at 01:25:39AM +0200, Miklos Szeredi wrote: Linus, Please pull git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs.v25 Plenty of bugs fixed relative to the previous version, thanks to Al Viro's review and is now officially be BugFree(TM). Also passes the unionmount-testsuite by David Howells. *blink* Why the hell do you hold -i_mutex across the entire opening of underlying directory? All you need is to serialize one assignment; the side that loses the race will simply fput() what it opened... Oh, well - that goes under weird pessimisations, easy to fix in followups... -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] overlay filesystem v25
On Fri, Oct 24, 2014 at 03:20:55AM +0100, Al Viro wrote: Why the hell do you hold -i_mutex across the entire opening of underlying directory? All you need is to serialize one assignment; the side that loses the race will simply fput() what it opened... Oh, well - that goes under weird pessimisations, easy to fix in followups... OK, pulled into vfs.git, followups in question added. Also there: fix for a long-standing leak in d_splice_alias() failure exits. Guys, could you check that current vfs.git#for-linus survives your local tests? Seems to survive here; if I don't hear of any problems by tomorrow morning, to Linus it goes... FWIW, for that pull request stats would be Shortlog: Al Viro (5): fix inode leaks on d_splice_alias() failure exits overlayfs: don't hold -i_mutex over opening the real directory overlayfs: make ovl_cache_entry-name an array instead of pointer overlayfs: embed root into overlay_readdir_data overlayfs: embed middle into overlay_readdir_data Andy Whitcroft (1): overlayfs: add statfs support Erez Zadok (1): overlayfs: implement show_options Miklos Szeredi (11): vfs: add i_op-dentry_open() vfs: export do_splice_direct() to modules vfs: export __inode_permission() to modules vfs: introduce clone_private_mount() vfs: export check_sticky() vfs: add whiteout support vfs: add RENAME_WHITEOUT ext4: support RENAME_WHITEOUT shmem: support RENAME_WHITEOUT overlay filesystem fs: limit filesystem stacking depth Neil Brown (1): overlay: overlay filesystem documentation Diffstat: Documentation/filesystems/Locking |2 + Documentation/filesystems/overlayfs.txt | 198 +++ Documentation/filesystems/vfs.txt |7 + MAINTAINERS |7 + fs/Kconfig |1 + fs/Makefile |1 + fs/btrfs/ioctl.c| 20 +- fs/dcache.c |2 + fs/ecryptfs/main.c |7 + fs/ext4/namei.c | 95 +++- fs/internal.h |7 - fs/namei.c | 41 +- fs/namespace.c | 27 + fs/open.c | 23 +- fs/overlayfs/Kconfig| 10 + fs/overlayfs/Makefile |7 + fs/overlayfs/copy_up.c | 414 ++ fs/overlayfs/dir.c | 921 +++ fs/overlayfs/inode.c| 425 ++ fs/overlayfs/overlayfs.h| 191 +++ fs/overlayfs/readdir.c | 589 fs/overlayfs/super.c| 796 ++ fs/splice.c |1 + include/linux/fs.h | 39 ++ include/linux/mount.h |3 + include/uapi/linux/fs.h |1 + mm/shmem.c | 36 +- 27 files changed, 3813 insertions(+), 58 deletions(-) create mode 100644 Documentation/filesystems/overlayfs.txt create mode 100644 fs/overlayfs/Kconfig create mode 100644 fs/overlayfs/Makefile create mode 100644 fs/overlayfs/copy_up.c create mode 100644 fs/overlayfs/dir.c create mode 100644 fs/overlayfs/inode.c create mode 100644 fs/overlayfs/overlayfs.h create mode 100644 fs/overlayfs/readdir.c create mode 100644 fs/overlayfs/super.c -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/