Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
- Original Message -
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
>
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
>
> The second patch eliminates rhashtable_walk_peek in gfs2. In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero. This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
>
> Thanks,
> Andreas
>
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
>
>  fs/gfs2/glock.c         | 47 ---
>  include/linux/lockref.h |  1 +
>  lib/lockref.c           | 28
>  3 files changed, 57 insertions(+), 19 deletions(-)
>
> --
> 2.14.3

Hi,

Thanks. These two patches are now pushed to the for-next branch of the
linux-gfs2 tree:

https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/?h=for-next&id=450b1f6f56350c630e795f240dc5a77aa8aa2419
https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/?h=for-next&id=3fd5d3ad35dc44aaf0f28d60cc0eb75887bff54d

Regards,

Bob Peterson
Red Hat File Systems
Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On Wed, Apr 04 2018, Andreas Grünbacher wrote:

> Herbert Xu wrote on Wed, 4 Apr 2018 at 17:51:
>
>> On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
>> >
>> > The patches look good. The big question is whether to add them to this
>> > merge window while it's still open. Opinions?
>>
>> We're still hashing out the rhashtable interface so I don't think now is
>> the time to rush things.
>
> Fair enough. No matter how rhashtable_walk_peek changes, we'll still need
> these two patches to fix the glock dump though.

Those two patches look fine to me and don't depend on changes to
rhashtable, so it is up to you when they go upstream.
However, I think the code can be substantially simplified, particularly
once we make rhashtable a little cleverer.
So this is what I'll probably be doing for a similar situation in lustre.

Having examined seqfile closely, it is apparent that if ->start never
changes *ppos, and if ->next always increments it (except maybe on error)
then

1/ ->next is only ever given a 'pos' that was returned by the previous
   call to ->start or ->next. So it should *always* return the next
   object, after the one previously returned by ->start or ->next. It
   never needs to 'seek'. The 'traverse()' function in seq_file.c does
   any seeking needed. ->next needs to increase 'pos' and when walking a
   dynamic list, it is easiest if it just increments it.

2/ ->start is only called with a pos of:
     0 - in this case it should rewind to the start
     the last pos passed to ->start or ->next
        In this case it should return the same thing that was returned
        last time. If it no longer exists, then the following one should
        be returned.
     one more than the last pos passed to ->start or ->next
        In this case it should return the object after the last one
        returned.

The proposed enhancement to rhashtable_walk* is to add a
rhashtable_walk_prev() which returns the previously returned object,
if it is still in the table, or NULL.
It also enhances rhashtable_walk_start() so that if the previously
returned object is still in the table, it is preserved as the current
cursor.
This means that if you take some action to ensure that the previously
returned object remains in the table until the next ->start, then you
can reliably walk the table with no duplicates or omissions (unless a
concurrent rehash causes duplicates).
If you don't keep the object in the table and it gets removed, then the
'skip' counter is used to find your place, and you might get duplicates
or omissions.

NeilBrown
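The three ->start cases described above can be modelled in userspace. What follows is only a minimal sketch under stated assumptions: `struct walker`, `walker_start()` and `walker_next()` are hypothetical names, the "table" is a fixed array of ints, and nothing here is the rhashtable or seq_file API. It shows the bookkeeping with a remembered `last_pos` and the previously returned object, nothing more.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for a table walker; not a kernel API. */
struct walker {
    const int *items;   /* the "table": a fixed array of objects */
    size_t n;           /* number of objects */
    size_t next_idx;    /* index of the object walker_advance() hands out next */
    const int *prev;    /* object returned last, or NULL */
    long last_pos;      /* pos most recently passed to start/next */
};

/* Rewind to the beginning of the table. */
static void walker_reset(struct walker *w)
{
    w->next_idx = 0;
    w->prev = NULL;
    w->last_pos = 0;
}

static void walker_init(struct walker *w, const int *items, size_t n)
{
    w->items = items;
    w->n = n;
    walker_reset(w);
}

/* Return the object after the previously returned one, or NULL at the end. */
static const int *walker_advance(struct walker *w)
{
    w->prev = (w->next_idx < w->n) ? &w->items[w->next_idx++] : NULL;
    return w->prev;
}

/* Models ->start: never changes *ppos. */
static const int *walker_start(struct walker *w, const long *ppos)
{
    if (*ppos == 0 && w->last_pos != 0)
        walker_reset(w);                 /* pos 0: rewind to the start */
    if (*ppos == w->last_pos && w->prev)
        return w->prev;                  /* same pos: same object again */
    w->last_pos = *ppos;
    return walker_advance(w);            /* pos + 1: the following object */
}

/* Models ->next: always increments *ppos and never seeks. */
static const int *walker_next(struct walker *w, long *ppos)
{
    w->last_pos = *ppos;
    *ppos += 1;
    return walker_advance(w);
}
```

The model deliberately leaves out removal of objects between ->stop and ->start; that is the case the proposed rhashtable_walk_prev() and the 'skip' counter are meant to handle.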
Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
>
> The patches look good. The big question is whether to add them to this
> merge window while it's still open. Opinions?

We're still hashing out the rhashtable interface so I don't think now is
the time to rush things.

Thanks,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
- Original Message -
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
>
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
>
> The second patch eliminates rhashtable_walk_peek in gfs2. In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero. This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
>
> Thanks,
> Andreas
>
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
>
>  fs/gfs2/glock.c         | 47 ---
>  include/linux/lockref.h |  1 +
>  lib/lockref.c           | 28
>  3 files changed, 57 insertions(+), 19 deletions(-)
>
> --
> 2.14.3

Hi,

The patches look good. The big question is whether to add them to this
merge window while it's still open. Opinions?

Acked-by: Bob Peterson

Regards,

Bob Peterson
Re: [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On Tue, Apr 03, 2018 at 01:41:26PM +1000, NeilBrown wrote:
>
> Do we really need a rhashtable_walk_peek() interface?
> I imagine that a seqfile ->start function can do:
>
>   if (*ppos == 0 && last_pos != 0) {
>       rhashtable_walk_exit(&iter);
>       rhashtable_walk_enter(&table, &iter);
>       last_pos = 0;
>   }
>   rhashtable_walk_start(&iter);
>   if (*ppos == last_pos && iter.p)
>       return iter.p;
>   last_pos = *ppos;

We don't want users poking into the internals of iter. If you're
suggesting we could simplify rhashtable_walk_peek to just this
after your patch then yes possibly. You also need to ensure that
not only seqfs users continue to work but also netlink dump users
which are slightly different.

> It might be OK to have a function call instead of expecting people to
> use iter.p directly.

Yes that would definitely be the preferred option.

Thanks,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On Fri, Mar 30 2018, Herbert Xu wrote:

> On Thu, Mar 29, 2018 at 06:52:34PM +0200, Andreas Gruenbacher wrote:
>>
>> Should rhashtable_walk_peek be kept around even if there are no more
>> users? I have my doubts.
>
> Absolutely. All netlink dumps using rhashtable_walk_next are buggy
> and need to switch over to rhashtable_walk_peek. As otherwise
> the object that triggers the out-of-space condition will be skipped
> upon resumption.

Do we really need a rhashtable_walk_peek() interface?
I imagine that a seqfile ->start function can do:

  if (*ppos == 0 && last_pos != 0) {
      rhashtable_walk_exit(&iter);
      rhashtable_walk_enter(&table, &iter);
      last_pos = 0;
  }
  rhashtable_walk_start(&iter);
  if (*ppos == last_pos && iter.p)
      return iter.p;
  last_pos = *ppos;
  return rhashtable_walk_next(&iter);

and the ->next function just does

  last_pos = *ppos;
  *ppos += 1;
  do p = rhashtable_walk_next(&iter); while (IS_ERR(p));
  return p;

It might be OK to have a function call instead of expecting people to
use iter.p directly.

static inline void *rhashtable_walk_prev(struct rhashtable_iter *iter)
{
    return iter->p;
}

Thoughts?

Thanks,
NeilBrown
Re: [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On Thu, Mar 29, 2018 at 06:52:34PM +0200, Andreas Gruenbacher wrote:
>
> Should rhashtable_walk_peek be kept around even if there are no more
> users? I have my doubts.

Absolutely. All netlink dumps using rhashtable_walk_next are buggy
and need to switch over to rhashtable_walk_peek. As otherwise
the object that triggers the out-of-space condition will be skipped
upon resumption.

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
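The skipped-object problem described here can be illustrated with a small userspace model. This is a sketch only: `struct cursor`, `cursor_next()`, `cursor_peek()` and the two dump functions are hypothetical names over a plain array, not the rhashtable or netlink interfaces, and "out of space" is modelled as a fixed number of slots per dump pass.

```c
#include <stddef.h>

/* Hypothetical cursor over a fixed array; not the rhashtable iterator. */
struct cursor {
    const int *items;
    size_t n;
    size_t idx;     /* index of the next object to hand out */
};

/* Return the current object and move past it, or NULL at the end. */
static const int *cursor_next(struct cursor *c)
{
    return c->idx < c->n ? &c->items[c->idx++] : NULL;
}

/* Return the current object without moving, or NULL at the end. */
static const int *cursor_peek(const struct cursor *c)
{
    return c->idx < c->n ? &c->items[c->idx] : NULL;
}

/* One dump pass using only next(): returns the number of objects emitted. */
static size_t dump_with_next(struct cursor *c, int *out, size_t room)
{
    size_t used = 0;
    const int *p;

    while ((p = cursor_next(c)) != NULL) {
        if (used == room)
            break;              /* out of space: p was consumed but not emitted */
        out[used++] = *p;
    }
    return used;
}

/* One dump pass using peek(): the object that does not fit stays current. */
static size_t dump_with_peek(struct cursor *c, int *out, size_t room)
{
    size_t used = 0;
    const int *p;

    while ((p = cursor_peek(c)) != NULL) {
        if (used == room)
            break;              /* out of space: resume from p next time */
        out[used++] = *p;
        cursor_next(c);         /* consume only after emitting */
    }
    return used;
}
```

With five objects and room for two per pass, the next()-only variant emits 1, 2 and then 4, 5, losing 3; the peek() variant resumes at 3.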
Re: [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On 29 March 2018 at 17:41, Herbert Xu wrote:
> On Thu, Mar 29, 2018 at 03:15:54PM +0200, Andreas Gruenbacher wrote:
>>
>> For all I know, Neil's latest plan is to get rhashtable_walk_peek
>> replaced and removed because it is unfixable. This patch removes the
>> one and only user.
>
> His latest patch makes rhashtable_walk_peek stable in the face of
> removals.
>
> https://patchwork.ozlabs.org/patch/892534/

Ok, I can slightly update my patch description. The problem still
remains that glocks can be deleted from the rhashtable between
stop/start, and that needs to be fixed in gfs2. Once that's done,
keeping track of the current glock comes for free and we won't need
rhashtable_walk_peek anymore.

Should rhashtable_walk_peek be kept around even if there are no more
users? I have my doubts.

Thanks,
Andreas
Re: [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On Thu, Mar 29, 2018 at 03:15:54PM +0200, Andreas Gruenbacher wrote:
>
> For all I know, Neil's latest plan is to get rhashtable_walk_peek
> replaced and removed because it is unfixable. This patch removes the
> one and only user.

His latest patch makes rhashtable_walk_peek stable in the face of
removals.

https://patchwork.ozlabs.org/patch/892534/

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On 29 March 2018 at 14:35, Herbert Xu wrote:
> On Thu, Mar 29, 2018 at 02:06:10PM +0200, Andreas Gruenbacher wrote:
>> Here's a second version of the patch (now a patch set) to eliminate
>> rhashtable_walk_peek in gfs2.
>>
>> The first patch introduces lockref_put_not_zero, the inverse of
>> lockref_get_not_zero.
>>
>> The second patch eliminates rhashtable_walk_peek in gfs2. In
>> gfs2_glock_iter_next, the new lockref function from patch one is used to
>> drop a lockref count as long as the count doesn't drop to zero. This is
>> almost always the case; if there is a risk of dropping the last
>> reference, we must defer that to a work queue because dropping the last
>> reference may sleep.
>
> In light of Neil's latest patch, do we still need this?

For all I know, Neil's latest plan is to get rhashtable_walk_peek
replaced and removed because it is unfixable. This patch removes the
one and only user.

Thanks,
Andreas
Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On 29 March 2018 at 14:24, Steven Whitehouse wrote:
> Hi,
>
> Can we solve the problem another way, by not taking refs on the glocks when
> we are iterating over them for the debugfs files? I assume that is the main
> issue here.
>
> We didn't used to take refs since the rcu locking was enough during the walk
> itself. We used to only keep track of the hash bucket and offset within the
> bucket when we dropped the rcu lock between calls to the iterator. I may
> have lost track of why that approach did not work?

That doesn't work because when a glock doesn't fit into one read, we
need to make sure that the next read will continue with the same glock
or else we'll end up with a corrupted dump. And rhashtable_walk_peek
cannot guarantee that.

I've done some minimal performance testing and the additional ref
taking only impacted the performance in the 10% range or less, so it
doesn't really matter.

Andreas
Re: [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
On Thu, Mar 29, 2018 at 02:06:10PM +0200, Andreas Gruenbacher wrote:
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
>
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
>
> The second patch eliminates rhashtable_walk_peek in gfs2. In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero. This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.

In light of Neil's latest patch, do we still need this?

Thanks,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
Hi,

Can we solve the problem another way, by not taking refs on the glocks
when we are iterating over them for the debugfs files? I assume that is
the main issue here.

We didn't used to take refs since the rcu locking was enough during the
walk itself. We used to only keep track of the hash bucket and offset
within the bucket when we dropped the rcu lock between calls to the
iterator. I may have lost track of why that approach did not work?

Steve.

On 29/03/18 13:06, Andreas Gruenbacher wrote:
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
>
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
>
> The second patch eliminates rhashtable_walk_peek in gfs2. In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero. This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
>
> Thanks,
> Andreas
>
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
>
>  fs/gfs2/glock.c         | 47 ---
>  include/linux/lockref.h |  1 +
>  lib/lockref.c           | 28
>  3 files changed, 57 insertions(+), 19 deletions(-)
[PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek
Here's a second version of the patch (now a patch set) to eliminate
rhashtable_walk_peek in gfs2.

The first patch introduces lockref_put_not_zero, the inverse of
lockref_get_not_zero.

The second patch eliminates rhashtable_walk_peek in gfs2. In
gfs2_glock_iter_next, the new lockref function from patch one is used to
drop a lockref count as long as the count doesn't drop to zero. This is
almost always the case; if there is a risk of dropping the last
reference, we must defer that to a work queue because dropping the last
reference may sleep.

Thanks,
Andreas

Andreas Gruenbacher (2):
  lockref: Add lockref_put_not_zero
  gfs2: Stop using rhashtable_walk_peek

 fs/gfs2/glock.c         | 47 ---
 include/linux/lockref.h |  1 +
 lib/lockref.c           | 28
 3 files changed, 57 insertions(+), 19 deletions(-)

--
2.14.3
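The counting rule behind a "put unless that would drop the last reference" helper can be sketched in userspace with C11 atomics. This is not the code from the patch: the kernel's struct lockref pairs the count with a spinlock, while `struct ref`, `ref_get_not_zero()` and `ref_put_not_zero()` below are hypothetical names that model only the count.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter; models the count only, no lock. */
struct ref {
    atomic_int count;
};

/* Take a reference unless the count is already zero. */
static bool ref_get_not_zero(struct ref *r)
{
    int old = atomic_load(&r->count);

    while (old > 0) {
        /* On failure, old is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference unless that would bring the count to zero. */
static bool ref_put_not_zero(struct ref *r)
{
    int old = atomic_load(&r->count);

    while (old > 1) {
        if (atomic_compare_exchange_weak(&r->count, &old, old - 1))
            return true;
    }
    return false;   /* caller holds (or may hold) the last reference */
}
```

When the put helper returns false, the caller still holds its reference and, as the cover letter says, has to defer the final put to a context that is allowed to sleep, such as a work queue.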