Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-12 Thread Bob Peterson
----- Original Message -----
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
> 
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
> 
> The second patch eliminates rhashtable_walk_peek in gfs2.  In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero.  This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
> 
> Thanks,
> Andreas
> 
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
> 
>  fs/gfs2/glock.c | 47 ---
>  include/linux/lockref.h |  1 +
>  lib/lockref.c   | 28 
>  3 files changed, 57 insertions(+), 19 deletions(-)
> 
> --
> 2.14.3

Hi,

Thanks. These two patches are now pushed to the for-next branch of the 
linux-gfs2 tree:

https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/?h=for-next&id=450b1f6f56350c630e795f240dc5a77aa8aa2419
https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/?h=for-next&id=3fd5d3ad35dc44aaf0f28d60cc0eb75887bff54d

Regards,

Bob Peterson
Red Hat File Systems


Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-05 Thread NeilBrown
On Wed, Apr 04 2018, Andreas Grünbacher wrote:

> Herbert Xu wrote on Wed, 4 Apr 2018 at 17:51:
>
>> On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
>> >
>> > The patches look good. The big question is whether to add them to this
>> > merge window while it's still open. Opinions?
>>
>> We're still hashing out the rhashtable interface so I don't think now is
>> the time to rush things.
>
>
> Fair enough. No matter how rhashtable_walk_peek changes, we'll still need
> these two patches to fix the glock dump though.

Those two patches look fine to me and don't depend on changes to
rhashtable, so it is up to you when they go upstream.

However, I think the code can be substantially simplified, particularly
once we make rhashtable a little cleverer.
So this is what I'll probably be doing for a similar situation in
lustre.

Having examined seq_file closely, it is apparent that if ->start never
changes *ppos, and if ->next always increments it (except maybe on error),
then:

1/ ->next is only ever given a 'pos' that was returned by the previous
   call to ->start or ->next.  So it should *always* return the next
   object, after the one previously returned by ->start or ->next.  It
   never needs to 'seek'. The 'traverse()' function in seq_file.c does
   any seeking needed.  ->next needs to increase 'pos' and when walking
   a dynamic list, it is easiest if it just increments it.

2/ ->start is only called with a pos of:
     0 - in this case it should rewind to the start
     the last pos passed to ->start or ->next
        In this case it should return the same thing that was
        returned last time.  If it no longer exists, then
        the following one should be returned.
     one more than the last pos passed to ->start or ->next
        In this case it should return the object after the
        last one returned.
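
As a rough illustration of the contract in 1/ and 2/, here is a small userspace model in C. It is only a sketch: struct table, struct cursor, model_start() and model_next() are hypothetical names invented for this example, not seq_file or rhashtable interfaces, and a fixed array with "alive" flags stands in for a table whose objects can be removed between reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NITEMS 5

/* Hypothetical table; a slot is marked dead to model a removed object. */
struct table {
	int value[NITEMS];
	bool alive[NITEMS];
};

/* State remembered between reads. */
struct cursor {
	size_t slot;	/* slot of the object returned last, NITEMS at the end */
	long last_pos;	/* last pos passed to model_start() or model_next() */
};

/* First live slot at or after 'from', or NITEMS if there is none. */
static size_t first_live(const struct table *t, size_t from)
{
	while (from < NITEMS && !t->alive[from])
		from++;
	return from < NITEMS ? from : NITEMS;
}

static const int *object_at(const struct table *t, size_t slot)
{
	return slot < NITEMS ? &t->value[slot] : NULL;
}

/* Models ->start: it never changes pos. */
static const int *model_start(const struct table *t, struct cursor *c, long pos)
{
	if (pos == 0) {
		c->slot = first_live(t, 0);		/* rewind */
	} else if (pos == c->last_pos) {
		c->slot = first_live(t, c->slot);	/* same object, or the following one */
	} else {
		/* Assumed to be the only remaining case, per 2/ above. */
		assert(pos == c->last_pos + 1);
		c->slot = first_live(t, c->slot + 1);	/* object after the last one */
	}
	c->last_pos = pos;
	return object_at(t, c->slot);
}

/* Models ->next: it always increments pos and never seeks. */
static const int *model_next(const struct table *t, struct cursor *c, long *pos)
{
	++*pos;
	c->slot = first_live(t, c->slot + 1);
	c->last_pos = *pos;
	return object_at(t, c->slot);
}
```

In this model, if a read stops right after an object was returned and that object is removed before the next read, calling model_start() with the same pos yields the following live object instead.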

The proposed enhancement to rhashtable_walk* is to add a
rhashtable_walk_prev() which returns the previously returned object,
if it is still in the table, or NULL. It also enhances
rhashtable_walk_start() so that if the previously returned object is
still in the table, it is preserved as the current cursor.
This means that if you take some action to ensure that the
previously returned object remains in the table until the next ->start,
then you can reliably walk the table with no duplicates or omissions
(unless a concurrent rehash causes duplicates).
If you don't keep the object in the table and it gets removed, then
the 'skip' counter is used to find your place, and you might get
duplicates or omissions.

NeilBrown


Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-04 Thread Herbert Xu
On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
>
> The patches look good. The big question is whether to add them to this
> merge window while it's still open. Opinions?

We're still hashing out the rhashtable interface so I don't think
now is the time to rush things.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-04 Thread Bob Peterson
----- Original Message -----
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
> 
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
> 
> The second patch eliminates rhashtable_walk_peek in gfs2.  In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero.  This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
> 
> Thanks,
> Andreas
> 
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
> 
>  fs/gfs2/glock.c | 47 ---
>  include/linux/lockref.h |  1 +
>  lib/lockref.c   | 28 
>  3 files changed, 57 insertions(+), 19 deletions(-)
> 
> --
> 2.14.3

Hi,

The patches look good. The big question is whether to add them to this
merge window while it's still open. Opinions?

Acked-by: Bob Peterson 

Regards,

Bob Peterson


Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-03-29 Thread Andreas Gruenbacher
On 29 March 2018 at 14:24, Steven Whitehouse wrote:
> Hi,
>
> Can we solve the problem another way, by not taking refs on the glocks when
> we are iterating over them for the debugfs files? I assume that is the main
> issue here.
>
> We didn't use to take refs since the rcu locking was enough during the walk
> itself. We used to only keep track of the hash bucket and offset within the
> bucket when we dropped the rcu lock between calls to the iterator. I may
> have lost track of why that approach did not work?

That doesn't work because when a glock doesn't fit into one read, we
need to make sure that the next read will continue with the same glock
or else we'll end up with a corrupted dump. And rhashtable_walk_peek
cannot guarantee that.

I've done some minimal performance testing, and the additional ref
taking slowed things down by about 10% or less, so it doesn't really
matter.

Andreas


Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-03-29 Thread Steven Whitehouse

Hi,

Can we solve the problem another way, by not taking refs on the glocks 
when we are iterating over them for the debugfs files? I assume that is 
the main issue here.


We didn't use to take refs since the rcu locking was enough during the
walk itself. We used to only keep track of the hash bucket and offset 
within the bucket when we dropped the rcu lock between calls to the 
iterator. I may have lost track of why that approach did not work?


Steve.


On 29/03/18 13:06, Andreas Gruenbacher wrote:

> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
>
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
>
> The second patch eliminates rhashtable_walk_peek in gfs2.  In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero.  This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
>
> Thanks,
> Andreas
>
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
>
>  fs/gfs2/glock.c | 47 ---
>  include/linux/lockref.h |  1 +
>  lib/lockref.c   | 28 
>  3 files changed, 57 insertions(+), 19 deletions(-)
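
The "drop a reference unless it is the last one" behaviour described in the cover letter can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the code from the patch: struct refcount_model, put_not_zero_model() and put_or_defer_model() are hypothetical names, a C11 atomic stands in for the kernel's struct lockref (which pairs a spinlock with the count), and the deferral to a work queue is reduced to setting a flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a reference count (hypothetical). */
struct refcount_model {
	atomic_int count;
};

/*
 * Drop one reference unless that would bring the count to zero.
 * Returns true if the count was decremented, false if the caller
 * still holds what may be the last reference.
 */
static bool put_not_zero_model(struct refcount_model *ref)
{
	int old;

	assert(ref != NULL);
	old = atomic_load(&ref->count);
	while (old > 1) {
		/* On failure 'old' is refreshed with the current value; retry. */
		if (atomic_compare_exchange_weak(&ref->count, &old, old - 1))
			return true;
	}
	return false;
}

/*
 * Hypothetical caller for a context that must not sleep: take the
 * cheap path when possible, otherwise only record that the final put
 * has to be handed to deferred work.
 */
static void put_or_defer_model(struct refcount_model *ref, bool *deferred)
{
	*deferred = !put_not_zero_model(ref);
}
```

The point of the split is that the common case never touches the final-put path, so only the rare last reference needs to be handed off to a context that is allowed to sleep.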




