Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-12 Thread Bob Peterson
- Original Message -
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
> 
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
> 
> The second patch eliminates rhashtable_walk_peek in gfs2.  In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero.  This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
> 
> Thanks,
> Andreas
> 
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
> 
>  fs/gfs2/glock.c | 47 ---
>  include/linux/lockref.h |  1 +
>  lib/lockref.c   | 28 
>  3 files changed, 57 insertions(+), 19 deletions(-)
> 
> --
> 2.14.3

Hi,

Thanks. These two patches are now pushed to the for-next branch of the 
linux-gfs2 tree:

https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/?h=for-next=450b1f6f56350c630e795f240dc5a77aa8aa2419
https://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2.git/commit/?h=for-next=3fd5d3ad35dc44aaf0f28d60cc0eb75887bff54d

Regards,

Bob Peterson
Red Hat File Systems



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-05 Thread NeilBrown
On Wed, Apr 04 2018, Andreas Grünbacher wrote:

> Herbert Xu  schrieb am Mi. 4. Apr. 2018 um
> 17:51:
>
>> On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
>> >
>> > The patches look good. The big question is whether to add them to this
>> > merge window while it's still open. Opinions?
>>
>> We're still hashing out the rhashtable interface so I don't think now is
>> the time to rush things.
>
>
> Fair enough. No matter how rhashtable_walk_peek changes, we‘ll still need
> these two patches to fix the glock dump though.

Those two patches look fine to me and don't depend on changes to
rhashtable, so it is up to you when they go upstream.

However, I think the code can be substantially simplified, particularly
once we make rhashtable a little cleverer.
So this is what I'll probably be doing for a similar situation in
lustre

Having examined seqfile closely, it is apparent that if ->start never
changes *ppos, and if ->next always increments it (except maybe on error)
then

1/ ->next is only ever given a 'pos' that was returned by the previous
   call to ->start or ->next.  So it should *always* return the next
   object, after the one previously returned by ->start or ->next.  It
   never needs to 'seek'. The 'traverse()' function in seq_file.c does
   any seeking needed.  ->next needs to increase 'pos' and when walking
   a dynamic list, it is easiest if it just increments it.

2/ ->start is only called with a pos of:
0 - in this case it should rewind to the start
the last pos passed to ->start of ->next
 In this case it should return the same thing that was
 returned last time.  If it no longer exists, then
 the following one should be returned.
one more than the last pos passed to ->start or ->next
 In this case it should return the object after the
 last one returned.

The proposed enhancement to rhashtable_walk* is to add a
rhashtable_walk_prev() which returns the previously returned object,
if it is still in the table, or NULL. It also enhances
rhashtable_walk_start() so that if the previously returned object is
still in the table, it is preserved as the current cursor.
This means that if you take some action to ensure that the
previously returned object remains in the table until the next ->start,
then you can reliably walk the table with no duplicates or omissions
(unless a concurrent rehash causes duplicates)
If you don't keep the object in the table and it gets removed, then
the 'skip' counter is used to find your place, and you might get
duplicates or omissions.

NeilBrown


signature.asc
Description: PGP signature


Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-04 Thread Andreas Grünbacher
Herbert Xu  schrieb am Mi. 4. Apr. 2018 um
17:51:

> On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
> >
> > The patches look good. The big question is whether to add them to this
> > merge window while it's still open. Opinions?
>
> We're still hashing out the rhashtable interface so I don't think now is
> the time to rush things.


Fair enough. No matter how rhashtable_walk_peek changes, we‘ll still need
these two patches to fix the glock dump though.

Thanks,
Andreas


Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-04 Thread Herbert Xu
On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
>
> The patches look good. The big question is whether to add them to this
> merge window while it's still open. Opinions?

We're still hashing out the rhashtable interface so I don't think
now is the time to rush things.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-04 Thread Bob Peterson
- Original Message -
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
> 
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
> 
> The second patch eliminates rhashtable_walk_peek in gfs2.  In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero.  This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
> 
> Thanks,
> Andreas
> 
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
> 
>  fs/gfs2/glock.c | 47 ---
>  include/linux/lockref.h |  1 +
>  lib/lockref.c   | 28 
>  3 files changed, 57 insertions(+), 19 deletions(-)
> 
> --
> 2.14.3

Hi,

The patches look good. The big question is whether to add them to this
merge window while it's still open. Opinions?

Acked-by: Bob Peterson 

Regards,

Bob Peterson



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-02 Thread Herbert Xu
On Tue, Apr 03, 2018 at 01:41:26PM +1000, NeilBrown wrote:
>
> Do we really need a rhashtable_walk_peek() interface?
> I imagine that a seqfile ->start function can do:
> 
>   if (*ppos == 0 && last_pos != 0) {
>   rhashtable_walk_exit();
> rhashtable_walk_enter(, );
> last_pos = 0;
>   }
>   rhashtable_walk_start();
>   if (*ppos == last_pos && iter.p)
>   return iter.p;
>   last_pos = *ppos;

We don't want users poking into the internals of iter.  If you're
suggesting we could simplify rhashtable_walk_peek to just this after
your patch then yes possibly.  You also need to ensure that not only
seqfs users continue to work but also netlink dump users which are
slightly different.

> It might be OK to have a function call instead of expecting people to
> use iter.p directly.

Yes that would definitely be the preferred option.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-02 Thread NeilBrown
On Fri, Mar 30 2018, Herbert Xu wrote:

> On Thu, Mar 29, 2018 at 06:52:34PM +0200, Andreas Gruenbacher wrote:
>>
>> Should rhashtable_walk_peek be kept around even if there are no more
>> users? I have my doubts.
>
> Absolutely.  All netlink dumps using rhashtable_walk_next are buggy
> and need to switch over to rhashtable_walk_peek.  As otherwise
> the object that triggers the out-of-space condition will be skipped
> upon resumption.

Do we really need a rhashtable_walk_peek() interface?
I imagine that a seqfile ->start function can do:

  if (*ppos == 0 && last_pos != 0) {
rhashtable_walk_exit();
rhashtable_walk_enter(, );
last_pos = 0;
  }
  rhashtable_walk_start();
  if (*ppos == last_pos && iter.p)
return iter.p;
  last_pos = *ppos;
  return rhashtable_walk_next()


and the ->next function just does

  last_pos = *ppos;
  *ppos += 1;
  do p = rhashtable_walk_next(); while (IS_ERR(p));
  return p;

It might be OK to have a function call instead of expecting people to
use iter.p directly.

static inline void *rhashtable_walk_prev(struct rhashtable_iter *iter)
{
return iter->p;
}

Thoughts?

Thanks,
NeilBrown


signature.asc
Description: PGP signature


Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-03-29 Thread Herbert Xu
On Thu, Mar 29, 2018 at 06:52:34PM +0200, Andreas Gruenbacher wrote:
>
> Should rhashtable_walk_peek be kept around even if there are no more
> users? I have my doubts.

Absolutely.  All netlink dumps using rhashtable_walk_next are buggy
and need to switch over to rhashtable_walk_peek.  As otherwise
the object that triggers the out-of-space condition will be skipped
upon resumption.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-03-29 Thread Andreas Gruenbacher
On 29 March 2018 at 17:41, Herbert Xu  wrote:
> On Thu, Mar 29, 2018 at 03:15:54PM +0200, Andreas Gruenbacher wrote:
>>
>> For all I know, Neil's latest plan is to get rhashtable_walk_peek
>> replaced and removed because it is unfixable. This patch removes the
>> one and only user.
>
> His latest patch makes rhashtable_walk_peek stable in the face of
> removals.
>
> https://patchwork.ozlabs.org/patch/892534/

Ok, I can slightly update my patch description. The problem still
remains that glocks can be deleted from the rhashtable between
stop/start, and that needs to be fixed in gfs2. Once that's done,
keeping track of the current glock comes for free and we won't need
rhashtable_walk_peek anymore.

Should rhashtable_walk_peek be kept around even if there are no more
users? I have my doubts.

Thanks,
Andreas



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-03-29 Thread Herbert Xu
On Thu, Mar 29, 2018 at 03:15:54PM +0200, Andreas Gruenbacher wrote:
>
> For all I know, Neil's latest plan is to get rhashtable_walk_peek
> replaced and removed because it is unfixable. This patch removes the
> one and only user.

His latest patch makes rhashtable_walk_peek stable in the face of
removals.

https://patchwork.ozlabs.org/patch/892534/

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-03-29 Thread Andreas Gruenbacher
On 29 March 2018 at 14:35, Herbert Xu  wrote:
> On Thu, Mar 29, 2018 at 02:06:10PM +0200, Andreas Gruenbacher wrote:
>> Here's a second version of the patch (now a patch set) to eliminate
>> rhashtable_walk_peek in gfs2.
>>
>> The first patch introduces lockref_put_not_zero, the inverse of
>> lockref_get_not_zero.
>>
>> The second patch eliminates rhashtable_walk_peek in gfs2.  In
>> gfs2_glock_iter_next, the new lockref function from patch one is used to
>> drop a lockref count as long as the count doesn't drop to zero.  This is
>> almost always the case; if there is a risk of dropping the last
>> reference, we must defer that to a work queue because dropping the last
>> reference may sleep.
>
> In light of Neil's latest patch, do we still need this?

For all I know, Neil's latest plan is to get rhashtable_walk_peek
replaced and removed because it is unfixable. This patch removes the
one and only user.

Thanks,
Andreas



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-03-29 Thread Herbert Xu
On Thu, Mar 29, 2018 at 02:06:10PM +0200, Andreas Gruenbacher wrote:
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
> 
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
> 
> The second patch eliminates rhashtable_walk_peek in gfs2.  In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero.  This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.

In light of Neil's latest patch, do we still need this?

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-03-29 Thread Steven Whitehouse

Hi,

Can we solve the problem another way, by not taking refs on the glocks 
when we are iterating over them for the debugfs files? I assume that is 
the main issue here.


We didn't used to take refs since the rcu locking was enough during the 
walk itself. We used to only keep track of the hash bucket and offset 
within the bucket when we dropped the rcu lock between calls to the 
iterator. I may have lost track of why that approach did not work?


Steve.


On 29/03/18 13:06, Andreas Gruenbacher wrote:

Here's a second version of the patch (now a patch set) to eliminate
rhashtable_walk_peek in gfs2.

The first patch introduces lockref_put_not_zero, the inverse of
lockref_get_not_zero.

The second patch eliminates rhashtable_walk_peek in gfs2.  In
gfs2_glock_iter_next, the new lockref function from patch one is used to
drop a lockref count as long as the count doesn't drop to zero.  This is
almost always the case; if there is a risk of dropping the last
reference, we must defer that to a work queue because dropping the last
reference may sleep.

Thanks,
Andreas

Andreas Gruenbacher (2):
   lockref: Add lockref_put_not_zero
   gfs2: Stop using rhashtable_walk_peek

  fs/gfs2/glock.c | 47 ---
  include/linux/lockref.h |  1 +
  lib/lockref.c   | 28 
  3 files changed, 57 insertions(+), 19 deletions(-)