Re: [Cluster-devel] vmalloc with GFP_NOFS

2018-05-09 Thread Darrick J. Wong
On Wed, May 09, 2018 at 11:04:47PM +0200, Michal Hocko wrote:
> On Wed 09-05-18 08:13:51, Darrick J. Wong wrote:
> > On Wed, May 09, 2018 at 03:42:22PM +0200, Michal Hocko wrote:
> > > On Tue 24-04-18 13:25:42, Michal Hocko wrote:
> > > [...]
> > > > > As a suggestion, could you take
> > > > > documentation about how to convert to the memalloc_nofs_{save,restore}
> > > > > scope api (which I think you've written about e-mails at length
> > > > > before), and put that into a file in Documentation/core-api?
> > > > 
> > > > I can.
> > > 
> > > Does something like the below sound reasonable/helpful?
> > > ---
> > > =================================
> > > GFP masks used from FS/IO context
> > > =================================
> > > 
> > > :Date: May, 2018
> > > :Author: Michal Hocko 
> > > 
> > > Introduction
> > > ============
> > > 
> > > FS resp. IO submitting code paths have to be careful when allocating
> > 
> > Not sure what 'FS resp. IO' means here -- 'FS and IO' ?
> > 
> > (Or is this one of those things where this looks like plain English text
> > but in reality it's some sort of markup that I'm not so familiar with?)
> > 
> > Confused because I've seen 'resp.' used as shorthand for
> > 'responsible'...
> 
> Well, I've tried to cover both: filesystem and IO code paths which
> allocate while in a sensitive context. IO submission is kinda clear but I
> am not sure what a general term for filesystem code paths would be. I
> would be grateful for any hints here.

"Code paths in the filesystem and IO stacks must be careful when
allocating memory to prevent recursion deadlocks caused by direct memory
reclaim calling back into the FS or IO paths and blocking on already
held resources (e.g. locks)." ?

--D

> 
> > 
> > > memory to prevent from potential recursion deadlocks caused by direct
> > > memory reclaim calling back into the FS/IO path and block on already
> > > held resources (e.g. locks). Traditional way to avoid this problem
> > 
> > 'The traditional way to avoid this deadlock problem...'
> 
> Done
> 
> > > is to clear __GFP_FS resp. __GFP_IO (note the later implies clearing
> > > the first as well) in the gfp mask when calling an allocator. GFP_NOFS
> > > resp. GFP_NOIO can be used as shortcut.
> > > 
> > > This has been the traditional way to avoid deadlocks since ages. It
> > 
> > I think this sentence is a little redundant with the previous sentence,
> > you could chop it out and join this paragraph to the one before it.
> 
> OK
> 
> > 
> > > turned out though that above approach has led to abuses when the 
> > > restricted
> > > gfp mask is used "just in case" without a deeper consideration which leads
> > > to problems because an excessive use of GFP_NOFS/GFP_NOIO can lead to
> > > memory over-reclaim or other memory reclaim issues.
> > > 
> > > New API
> > > =======
> > > 
> > > Since 4.12 we do have a generic scope API for both NOFS and NOIO context
> > > ``memalloc_nofs_save``, ``memalloc_nofs_restore`` resp. 
> > > ``memalloc_noio_save``,
> > > ``memalloc_noio_restore`` which allow to mark a scope to be a critical
> > > section from the memory reclaim recursion into FS/IO POV. Any allocation
> > > from that scope will inherently drop __GFP_FS resp. __GFP_IO from the 
> > > given
> > > mask so no memory allocation can recurse back in the FS/IO.
> > > 
> > > FS/IO code then simply calls the appropriate save function right at
> > > the layer where a lock taken from the reclaim context (e.g. shrinker)
> > > is taken and the corresponding restore function when the lock is
> > > released. All that ideally along with an explanation what is the reclaim
> > > context for easier maintenance.
> > > 
> > > What about __vmalloc(GFP_NOFS)
> > > ==============================
> > > 
> > > vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
> > > GFP_KERNEL allocations deep inside the allocator which are quit 
> > > non-trivial
> > 
> > ...which are quite non-trivial...
> 
> fixed
> 
> > > to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
> > > almost always a bug. The good news is that the NOFS/NOIO semantic can be
> > > achieved by the scope api.
> > > 
> > > In the ideal world, upper layers should already mark dangerous contexts
> > > and so no special care is required and vmalloc should be called without
> > > any problems. Sometimes if the context is not really clear or there are
> > > layering violations then the recommended way around that is to wrap 
> > > ``vmalloc``
> > > by the scope API with a comment explaining the problem.
> > 
> > Otherwise looks ok to me based on my understanding of how all this is
> > supposed to work...
> > 
> > Reviewed-by: Darrick J. Wong 
> 
> Thanks for your review!
> 
> -- 
> Michal Hocko
> SUSE Labs



Re: [Cluster-devel] vmalloc with GFP_NOFS

2018-05-09 Thread Michal Hocko
On Wed 09-05-18 19:24:51, Mike Rapoport wrote:
> On Wed, May 09, 2018 at 08:13:51AM -0700, Darrick J. Wong wrote:
> > On Wed, May 09, 2018 at 03:42:22PM +0200, Michal Hocko wrote:
[...]
> > > FS/IO code then simply calls the appropriate save function right at
> > > the layer where a lock taken from the reclaim context (e.g. shrinker)
> > > is taken and the corresponding restore function when the lock is
> 
> Seems like the second "is taken" got there by mistake

yeah, fixed. Thanks!
-- 
Michal Hocko
SUSE Labs



Re: [Cluster-devel] vmalloc with GFP_NOFS

2018-05-09 Thread Michal Hocko
On Wed 09-05-18 08:13:51, Darrick J. Wong wrote:
> On Wed, May 09, 2018 at 03:42:22PM +0200, Michal Hocko wrote:
> > On Tue 24-04-18 13:25:42, Michal Hocko wrote:
> > [...]
> > > > As a suggestion, could you take
> > > > documentation about how to convert to the memalloc_nofs_{save,restore}
> > > > scope api (which I think you've written about e-mails at length
> > > > before), and put that into a file in Documentation/core-api?
> > > 
> > > I can.
> > 
> > Does something like the below sound reasonable/helpful?
> > ---
> > =================================
> > GFP masks used from FS/IO context
> > =================================
> > 
> > :Date: May, 2018
> > :Author: Michal Hocko 
> > 
> > Introduction
> > ============
> > 
> > FS resp. IO submitting code paths have to be careful when allocating
> 
> Not sure what 'FS resp. IO' means here -- 'FS and IO' ?
> 
> (Or is this one of those things where this looks like plain English text
> but in reality it's some sort of markup that I'm not so familiar with?)
> 
> Confused because I've seen 'resp.' used as shorthand for
> 'responsible'...

Well, I've tried to cover both: filesystem and IO code paths which
allocate while in a sensitive context. IO submission is kinda clear but I
am not sure what a general term for filesystem code paths would be. I
would be grateful for any hints here.

> 
> > memory to prevent from potential recursion deadlocks caused by direct
> > memory reclaim calling back into the FS/IO path and block on already
> > held resources (e.g. locks). Traditional way to avoid this problem
> 
> 'The traditional way to avoid this deadlock problem...'

Done

> > is to clear __GFP_FS resp. __GFP_IO (note the later implies clearing
> > the first as well) in the gfp mask when calling an allocator. GFP_NOFS
> > resp. GFP_NOIO can be used as shortcut.
> > 
> > This has been the traditional way to avoid deadlocks since ages. It
> 
> I think this sentence is a little redundant with the previous sentence,
> you could chop it out and join this paragraph to the one before it.

OK

> 
> > turned out though that above approach has led to abuses when the restricted
> > gfp mask is used "just in case" without a deeper consideration which leads
> > to problems because an excessive use of GFP_NOFS/GFP_NOIO can lead to
> > memory over-reclaim or other memory reclaim issues.
> > 
> > New API
> > =======
> > 
> > Since 4.12 we do have a generic scope API for both NOFS and NOIO context
> > ``memalloc_nofs_save``, ``memalloc_nofs_restore`` resp. 
> > ``memalloc_noio_save``,
> > ``memalloc_noio_restore`` which allow to mark a scope to be a critical
> > section from the memory reclaim recursion into FS/IO POV. Any allocation
> > from that scope will inherently drop __GFP_FS resp. __GFP_IO from the given
> > mask so no memory allocation can recurse back in the FS/IO.
> > 
> > FS/IO code then simply calls the appropriate save function right at
> > the layer where a lock taken from the reclaim context (e.g. shrinker)
> > is taken and the corresponding restore function when the lock is
> > released. All that ideally along with an explanation what is the reclaim
> > context for easier maintenance.
> > 
> > What about __vmalloc(GFP_NOFS)
> > ==============================
> > 
> > vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
> > GFP_KERNEL allocations deep inside the allocator which are quit non-trivial
> 
> ...which are quite non-trivial...

fixed

> > to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
> > almost always a bug. The good news is that the NOFS/NOIO semantic can be
> > achieved by the scope api.
> > 
> > In the ideal world, upper layers should already mark dangerous contexts
> > and so no special care is required and vmalloc should be called without
> > any problems. Sometimes if the context is not really clear or there are
> > layering violations then the recommended way around that is to wrap 
> > ``vmalloc``
> > by the scope API with a comment explaining the problem.
> 
> Otherwise looks ok to me based on my understanding of how all this is
> supposed to work...
> 
> Reviewed-by: Darrick J. Wong 

Thanks for your review!

-- 
Michal Hocko
SUSE Labs



Re: [Cluster-devel] [PATCH] gfs2-utils tests: Fix testsuite cleanup

2018-05-09 Thread Andrew Price

On 09/05/18 16:40, Valentin Vidic wrote:

Parallel distclean removes the testsuite before it has finished
running, causing a make error.

Signed-off-by: Valentin Vidic 
---
  tests/Makefile.am | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Applied - thanks!

Andy



diff --git a/tests/Makefile.am b/tests/Makefile.am
index 1dedc2b2..e52aab4c 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -6,8 +6,7 @@ EXTRA_DIST = \
  
  DISTCLEANFILES = \
atlocal \
-   atconfig \
-   $(TESTSUITE)
+   atconfig
  
  CLEANFILES = testvol
  
@@ -103,6 +102,7 @@ installcheck-local: atconfig atlocal $(TESTSUITE)
  
  clean-local:
test ! -f '$(TESTSUITE)' || $(SHELL) '$(TESTSUITE)' --clean
+   rm -f '$(TESTSUITE)'
  
  atconfig: $(top_builddir)/config.status
cd $(top_builddir) && ./config.status tests/$@





Re: [Cluster-devel] vmalloc with GFP_NOFS

2018-05-09 Thread Mike Rapoport
On Wed, May 09, 2018 at 08:13:51AM -0700, Darrick J. Wong wrote:
> On Wed, May 09, 2018 at 03:42:22PM +0200, Michal Hocko wrote:
> > On Tue 24-04-18 13:25:42, Michal Hocko wrote:
> > [...]
> > > > As a suggestion, could you take
> > > > documentation about how to convert to the memalloc_nofs_{save,restore}
> > > > scope api (which I think you've written about e-mails at length
> > > > before), and put that into a file in Documentation/core-api?
> > > 
> > > I can.
> > 
> > Does something like the below sound reasonable/helpful?
> > ---
> > =================================
> > GFP masks used from FS/IO context
> > =================================
> > 
> > :Date: May, 2018
> > :Author: Michal Hocko 
> > 
> > Introduction
> > ============
> > 
> > FS resp. IO submitting code paths have to be careful when allocating
> 
> Not sure what 'FS resp. IO' means here -- 'FS and IO' ?
> 
> (Or is this one of those things where this looks like plain English text
> but in reality it's some sort of markup that I'm not so familiar with?)
> 
> Confused because I've seen 'resp.' used as shorthand for
> 'responsible'...
> 
> > memory to prevent from potential recursion deadlocks caused by direct
> > memory reclaim calling back into the FS/IO path and block on already
> > held resources (e.g. locks). Traditional way to avoid this problem
> 
> 'The traditional way to avoid this deadlock problem...'
> 
> > is to clear __GFP_FS resp. __GFP_IO (note the later implies clearing
> > the first as well) in the gfp mask when calling an allocator. GFP_NOFS
> > resp. GFP_NOIO can be used as shortcut.
> > 
> > This has been the traditional way to avoid deadlocks since ages. It
> 
> I think this sentence is a little redundant with the previous sentence,
> you could chop it out and join this paragraph to the one before it.
> 
> > turned out though that above approach has led to abuses when the restricted
> > gfp mask is used "just in case" without a deeper consideration which leads
> > to problems because an excessive use of GFP_NOFS/GFP_NOIO can lead to
> > memory over-reclaim or other memory reclaim issues.
> > 
> > New API
> > =======
> > 
> > Since 4.12 we do have a generic scope API for both NOFS and NOIO context
> > ``memalloc_nofs_save``, ``memalloc_nofs_restore`` resp. 
> > ``memalloc_noio_save``,
> > ``memalloc_noio_restore`` which allow to mark a scope to be a critical
> > section from the memory reclaim recursion into FS/IO POV. Any allocation
> > from that scope will inherently drop __GFP_FS resp. __GFP_IO from the given
> > mask so no memory allocation can recurse back in the FS/IO.
> > 
> > FS/IO code then simply calls the appropriate save function right at
> > the layer where a lock taken from the reclaim context (e.g. shrinker)
> > is taken and the corresponding restore function when the lock is

Seems like the second "is taken" got there by mistake

> > released. All that ideally along with an explanation what is the reclaim
> > context for easier maintenance.
> > 
> > What about __vmalloc(GFP_NOFS)
> > ==============================
> > 
> > vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
> > GFP_KERNEL allocations deep inside the allocator which are quit non-trivial
> 
> ...which are quite non-trivial...
> 
> > to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
> > almost always a bug. The good news is that the NOFS/NOIO semantic can be
> > achieved by the scope api.
> > 
> > In the ideal world, upper layers should already mark dangerous contexts
> > and so no special care is required and vmalloc should be called without
> > any problems. Sometimes if the context is not really clear or there are
> > layering violations then the recommended way around that is to wrap 
> > ``vmalloc``
> > by the scope API with a comment explaining the problem.
> 
> Otherwise looks ok to me based on my understanding of how all this is
> supposed to work...
> 
> Reviewed-by: Darrick J. Wong 
> 
> --D
> 
> > -- 
> > Michal Hocko
> > SUSE Labs
> 

-- 
Sincerely yours,
Mike.



[Cluster-devel] [PATCH] gfs2-utils tests: Fix testsuite cleanup

2018-05-09 Thread Valentin Vidic
Parallel distclean removes the testsuite before it has finished
running, causing a make error.

Signed-off-by: Valentin Vidic 
---
 tests/Makefile.am | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 1dedc2b2..e52aab4c 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -6,8 +6,7 @@ EXTRA_DIST = \
 
 DISTCLEANFILES = \
atlocal \
-   atconfig \
-   $(TESTSUITE)
+   atconfig
 
 CLEANFILES = testvol
 
@@ -103,6 +102,7 @@ installcheck-local: atconfig atlocal $(TESTSUITE)
 
 clean-local:
test ! -f '$(TESTSUITE)' || $(SHELL) '$(TESTSUITE)' --clean
+   rm -f '$(TESTSUITE)'
 
 atconfig: $(top_builddir)/config.status
cd $(top_builddir) && ./config.status tests/$@
-- 
2.17.0



Re: [Cluster-devel] vmalloc with GFP_NOFS

2018-05-09 Thread Darrick J. Wong
On Wed, May 09, 2018 at 03:42:22PM +0200, Michal Hocko wrote:
> On Tue 24-04-18 13:25:42, Michal Hocko wrote:
> [...]
> > > As a suggestion, could you take
> > > documentation about how to convert to the memalloc_nofs_{save,restore}
> > > scope api (which I think you've written about e-mails at length
> > > before), and put that into a file in Documentation/core-api?
> > 
> > I can.
> 
> Does something like the below sound reasonable/helpful?
> ---
> =================================
> GFP masks used from FS/IO context
> =================================
> 
> :Date: May, 2018
> :Author: Michal Hocko 
> 
> Introduction
> ============
> 
> FS resp. IO submitting code paths have to be careful when allocating

Not sure what 'FS resp. IO' means here -- 'FS and IO' ?

(Or is this one of those things where this looks like plain English text
but in reality it's some sort of markup that I'm not so familiar with?)

Confused because I've seen 'resp.' used as shorthand for
'responsible'...

> memory to prevent from potential recursion deadlocks caused by direct
> memory reclaim calling back into the FS/IO path and block on already
> held resources (e.g. locks). Traditional way to avoid this problem

'The traditional way to avoid this deadlock problem...'

> is to clear __GFP_FS resp. __GFP_IO (note the later implies clearing
> the first as well) in the gfp mask when calling an allocator. GFP_NOFS
> resp. GFP_NOIO can be used as shortcut.
> 
> This has been the traditional way to avoid deadlocks since ages. It

I think this sentence is a little redundant with the previous sentence,
you could chop it out and join this paragraph to the one before it.

> turned out though that above approach has led to abuses when the restricted
> gfp mask is used "just in case" without a deeper consideration which leads
> to problems because an excessive use of GFP_NOFS/GFP_NOIO can lead to
> memory over-reclaim or other memory reclaim issues.
> 
> New API
> =======
> 
> Since 4.12 we do have a generic scope API for both NOFS and NOIO context
> ``memalloc_nofs_save``, ``memalloc_nofs_restore`` resp. 
> ``memalloc_noio_save``,
> ``memalloc_noio_restore`` which allow to mark a scope to be a critical
> section from the memory reclaim recursion into FS/IO POV. Any allocation
> from that scope will inherently drop __GFP_FS resp. __GFP_IO from the given
> mask so no memory allocation can recurse back in the FS/IO.
> 
> FS/IO code then simply calls the appropriate save function right at
> the layer where a lock taken from the reclaim context (e.g. shrinker)
> is taken and the corresponding restore function when the lock is
> released. All that ideally along with an explanation what is the reclaim
> context for easier maintenance.
> 
> What about __vmalloc(GFP_NOFS)
> ==============================
> 
> vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
> GFP_KERNEL allocations deep inside the allocator which are quit non-trivial

...which are quite non-trivial...

> to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
> almost always a bug. The good news is that the NOFS/NOIO semantic can be
> achieved by the scope api.
> 
> In the ideal world, upper layers should already mark dangerous contexts
> and so no special care is required and vmalloc should be called without
> any problems. Sometimes if the context is not really clear or there are
> layering violations then the recommended way around that is to wrap 
> ``vmalloc``
> by the scope API with a comment explaining the problem.

Otherwise looks ok to me based on my understanding of how all this is
supposed to work...

Reviewed-by: Darrick J. Wong 

--D

> -- 
> Michal Hocko
> SUSE Labs



Re: [Cluster-devel] vmalloc with GFP_NOFS

2018-05-09 Thread David Sterba
On Wed, May 09, 2018 at 03:42:22PM +0200, Michal Hocko wrote:
> On Tue 24-04-18 13:25:42, Michal Hocko wrote:
> [...]
> > > As a suggestion, could you take
> > > documentation about how to convert to the memalloc_nofs_{save,restore}
> > > scope api (which I think you've written about e-mails at length
> > > before), and put that into a file in Documentation/core-api?
> > 
> > I can.
> 
> Does something like the below sound reasonable/helpful?

Sounds good to me and matches how we've been using the vmalloc/nofs so
far.



Re: [Cluster-devel] vmalloc with GFP_NOFS

2018-05-09 Thread Michal Hocko
On Tue 24-04-18 13:25:42, Michal Hocko wrote:
[...]
> > As a suggestion, could you take
> > documentation about how to convert to the memalloc_nofs_{save,restore}
> > scope api (which I think you've written about e-mails at length
> > before), and put that into a file in Documentation/core-api?
> 
> I can.

Does something like the below sound reasonable/helpful?
---
=================================
GFP masks used from FS/IO context
=================================

:Date: May, 2018
:Author: Michal Hocko 

Introduction
============

FS resp. IO submitting code paths have to be careful when allocating
memory to prevent from potential recursion deadlocks caused by direct
memory reclaim calling back into the FS/IO path and block on already
held resources (e.g. locks). Traditional way to avoid this problem
is to clear __GFP_FS resp. __GFP_IO (note the later implies clearing
the first as well) in the gfp mask when calling an allocator. GFP_NOFS
resp. GFP_NOIO can be used as shortcut.
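For example, a call site restricting the mask directly might look like
this (illustrative only; ``struct my_record`` is made up)::

	/*
	 * GFP_NOFS is GFP_KERNEL without __GFP_FS; GFP_NOIO also drops
	 * __GFP_IO (and with it any FS recursion from reclaim).
	 */
	struct my_record *rec = kmalloc(sizeof(*rec), GFP_NOFS);

	if (!rec)
		return -ENOMEM;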

This has been the traditional way to avoid deadlocks since ages. It
turned out though that above approach has led to abuses when the restricted
gfp mask is used "just in case" without a deeper consideration which leads
to problems because an excessive use of GFP_NOFS/GFP_NOIO can lead to
memory over-reclaim or other memory reclaim issues.

New API
=======

Since 4.12 we do have a generic scope API for both NOFS and NOIO context
``memalloc_nofs_save``, ``memalloc_nofs_restore`` resp. ``memalloc_noio_save``,
``memalloc_noio_restore`` which allow to mark a scope to be a critical
section from the memory reclaim recursion into FS/IO POV. Any allocation
from that scope will inherently drop __GFP_FS resp. __GFP_IO from the given
mask so no memory allocation can recurse back in the FS/IO.

FS/IO code then simply calls the appropriate save function right at
the layer where a lock taken from the reclaim context (e.g. shrinker)
is taken and the corresponding restore function when the lock is
released. All that ideally along with an explanation what is the reclaim
context for easier maintenance.
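A minimal sketch of this pattern (everything here except the
``memalloc_nofs_save``/``memalloc_nofs_restore`` calls is made up for
illustration)::

	static int fs_do_work(struct my_fs_info *fi)
	{
		unsigned int flags;
		int ret;

		/*
		 * fi->lock is also taken from our shrinker, i.e. from the
		 * memory reclaim path, so allocations made while holding
		 * it must not recurse back into the filesystem.
		 */
		mutex_lock(&fi->lock);
		flags = memalloc_nofs_save();
		ret = do_allocating_work(fi);	/* GFP_KERNEL is safe here */
		memalloc_nofs_restore(flags);
		mutex_unlock(&fi->lock);
		return ret;
	}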

What about __vmalloc(GFP_NOFS)
==============================

vmalloc doesn't support GFP_NOFS semantic because there are hardcoded
GFP_KERNEL allocations deep inside the allocator which are quit non-trivial
to fix up. That means that calling ``vmalloc`` with GFP_NOFS/GFP_NOIO is
almost always a bug. The good news is that the NOFS/NOIO semantic can be
achieved by the scope api.

In the ideal world, upper layers should already mark dangerous contexts
and so no special care is required and vmalloc should be called without
any problems. Sometimes if the context is not really clear or there are
layering violations then the recommended way around that is to wrap ``vmalloc``
by the scope API with a comment explaining the problem.
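For instance (the variables and the surrounding context are invented;
only the scope API and ``vmalloc`` itself are real)::

	unsigned int flags;
	void *buf;

	/*
	 * We are in a context which must not recurse into the FS and
	 * the layering prevents marking the scope further up, so mark
	 * it locally; the hardcoded GFP_KERNEL allocations inside
	 * vmalloc will then have __GFP_FS stripped automatically.
	 */
	flags = memalloc_nofs_save();
	buf = vmalloc(size);
	memalloc_nofs_restore(flags);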
-- 
Michal Hocko
SUSE Labs



[Cluster-devel] [GFS2 PATCH] GFS2: Add to tail, not head, of transaction

2018-05-09 Thread Bob Peterson
Hi,

Before this patch, function gfs2_trans_add_meta called list_add
to add a buffer to the transaction list, tr_buf. Later, the
before_commit functions traversed the list in sequential order,
which meant the buffers were processed sub-optimally. For example, blocks
could go out in 54321 order rather than 12345, causing media heads
to bounce unnecessarily.

This makes no difference for small IO operations, but large writes
benefit greatly when they add lots of indirect blocks to a
transaction, and those blocks are allocated in ascending order,
as the block allocator tries to do.

This patch changes the list_add to list_add_tail so the buffers are
traversed (and therefore written back) in the same order in which
they were added to the transaction.
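
(For illustration only -- a self-contained userspace sketch of the list
semantics, not code from this patch -- showing why list_add yields 54321
traversal order while list_add_tail yields 12345:)

#include <stdio.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel's struct list_head helpers. */
struct list_head { struct list_head *prev, *next; };

static void __list_insert(struct list_head *n, struct list_head *prev,
			  struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Insert right after the head: traversal sees newest entries first. */
static void list_add(struct list_head *n, struct list_head *head)
{
	__list_insert(n, head, head->next);
}

/* Insert just before the head: traversal preserves insertion order. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
	__list_insert(n, head->prev, head);
}

struct buf { int block; struct list_head list; };

int main(void)
{
	struct list_head tr_buf = { &tr_buf, &tr_buf };
	struct buf bufs[5];
	struct list_head *pos;
	int i;

	for (i = 0; i < 5; i++) {
		bufs[i].block = i + 1;
		list_add(&bufs[i].list, &tr_buf); /* old behaviour */
	}
	for (pos = tr_buf.next; pos != &tr_buf; pos = pos->next) {
		struct buf *b = (struct buf *)
			((char *)pos - offsetof(struct buf, list));
		printf("%d ", b->block); /* prints: 5 4 3 2 1 */
	}
	printf("\n"); /* with list_add_tail above it prints: 1 2 3 4 5 */
	return 0;
}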

In one of the more extreme examples, I did a test where I had
10 simultaneous instances of iozone, each writing 25GB of data,
using one of our performance testing machines. The results are:

 Without the patch   With the patch
 -   --
Children see throughput  1395068.00 kB/sec   1527073.61 kB/sec
Parent sees throughput   1348544.00 kB/sec   1485594.66 kB/sec

These numbers are artificially inflated because I was also running
with Andreas Gruenbacher's iomap-write patch set plus my rgrp
sharing patch set in both these cases. Still, it shows a 9 percent
performance boost for both children and parent throughput.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/trans.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index c75cacaa349b..c4c00b90935c 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -247,7 +247,7 @@ void gfs2_trans_add_meta(struct gfs2_glock *gl, struct buffer_head *bh)
 	gfs2_pin(sdp, bd->bd_bh);
 	mh->__pad0 = cpu_to_be64(0);
 	mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
-	list_add(&bd->bd_list, &tr->tr_buf);
+	list_add_tail(&bd->bd_list, &tr->tr_buf);
 	tr->tr_num_buf_new++;
 out_unlock:
 	gfs2_log_unlock(sdp);



Re: [Cluster-devel] [PATCH 0/2] Improve throughput through rgrp sharing (v2)

2018-05-09 Thread Steven Whitehouse

Hi,


On 08/05/18 21:04, Bob Peterson wrote:

Hi,

On 18 April, I posted v1 of this patch set. The idea is to allow multiple
processes on a node to share a glock that's held exclusively in order to
improve performance. Sharing rgrps allows for better throughput by
reducing contention.

Version 1 implemented this by introducing a new glock mode for sharing
glocks. Steve Whitehouse suggested we didn't need a new mode: we can
accomplish the same thing just by having a new glock flag, which also
makes the patch more simple.

This version 2 patch set implements Steve's suggestion.

The first patch introduces the new glock flag. The second patch puts
it into use for rgrp sharing. Exclusive access to the rgrp is implemented
through an rwsem.
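
(A hedged sketch of that idea -- the structure and field names below are
invented for illustration, not taken from the patches: the glock stays
in EX cluster-wide while a node-local rwsem arbitrates exclusivity.)

#include <linux/rwsem.h>

/* Hypothetical; the real patches hang this off the rgrp structures. */
struct my_rgrpd {
	struct rw_semaphore rd_sem;
	/* ... */
};

/* Ordinary allocators share the EX glock, taking the rwsem for read. */
static void rgrp_alloc(struct my_rgrpd *rgd)
{
	down_read(&rgd->rd_sem);
	/* ... claim free blocks under finer-grained locking ... */
	up_read(&rgd->rd_sem);
}

/* Operations that need the rgrp truly exclusively take it for write. */
static void rgrp_exclusive(struct my_rgrpd *rgd)
{
	down_write(&rgd->rd_sem);
	/* ... */
	up_write(&rgd->rd_sem);
}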

Performance testing using iozone looks even better than version 1.

Sounds really good! We should make sure that we give this a really good 
round of testing and it would be nice to see some details of the 
performance improvements. Overall though, that's an excellent result :-)


Steve.


---
Bob Peterson (2):
   GFS2: Introduce GLF_EX_SHARING bit: local EX sharing
   GFS2: Take advantage of new EXSH glock mode for rgrps

  fs/gfs2/bmap.c   |   2 +-
  fs/gfs2/dir.c|   2 +-
  fs/gfs2/glock.c  |  23 ++---
  fs/gfs2/glock.h  |   4 +++
  fs/gfs2/incore.h |   2 ++
  fs/gfs2/inode.c  |   7 ++--
  fs/gfs2/rgrp.c   | 103 ++-
  fs/gfs2/rgrp.h   |   2 +-
  fs/gfs2/super.c  |   8 +++--
  fs/gfs2/xattr.c  |   8 +++--
  10 files changed, 129 insertions(+), 32 deletions(-)