Re: [Cluster-devel] [GFS2 PATCH] GFS2: Add to tail, not head, of transaction

2018-06-01 Thread Steven Whitehouse

Hi,


On 30/05/18 20:24, Bob Peterson wrote:

Hi,

Before this patch, frunction gfs2_trans_add_meta called list_add
to add a buffer to a transaction, tr_buf. Later, in the before_commit
functions, it traversed the list in sequential order, which meant
that they were processed in a sub-optimal order. For example, blocks
could go out in 54321 order rather than 12345, causing media heads
to bounce unnecessarily.

This makes no difference for small IO operations, but large writes
benefit greatly when they add lots of indirect blocks to a
transaction, and those blocks are allocated in ascending order,
as the block allocator tries to do.

This patch changes it to list_add_tail so they are traversed (and
therefore written back) in the same order as they are added to
the transaction.

In one of the more extreme examples, I did a test where I had
10 simultaneous instances of iozone, each writing 25GB of data,
using one of our performance testing machines. The results are:

  Without the patch   With the patch
  -   --
Children see throughput  1395068.00 kB/sec   1527073.61 kB/sec
Parent sees throughput   1348544.00 kB/sec   1485594.66 kB/sec

These numbers are artificially inflated because I was also running
with Andreas Gruenbacher's iomap-write patch set plus my rgrp
sharing patch set in both these cases. Still, it shows a 9 percent
performance boost for both children and parent throughput.

Signed-off-by: Bob Peterson 
That makes sense assuming that a transaction will always add the blocks 
in ascending order. Writing into the journal will not be affected by 
this, since that will always be sequential anyway, so I assume that you 
are referring to AIL writeback here?


It might make more sense just to sort the blocks into order at journal 
flush time, since then we'll always be in order, no matter what order 
the blocks are added during the transactions,


Steve.


---
  fs/gfs2/trans.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index c75cacaa349b..c4c00b90935c 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -247,7 +247,7 @@ void gfs2_trans_add_meta(struct gfs2_glock *gl, struct 
buffer_head *bh)
gfs2_pin(sdp, bd->bd_bh);
mh->__pad0 = cpu_to_be64(0);
mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
-   list_add(>bd_list, >tr_buf);
+   list_add_tail(>bd_list, >tr_buf);
tr->tr_num_buf_new++;
  out_unlock:
gfs2_log_unlock(sdp);





[Cluster-devel] [GFS2 PATCH] GFS2: Add to tail, not head, of transaction

2018-05-30 Thread Bob Peterson
Hi,

Before this patch, frunction gfs2_trans_add_meta called list_add
to add a buffer to a transaction, tr_buf. Later, in the before_commit
functions, it traversed the list in sequential order, which meant
that they were processed in a sub-optimal order. For example, blocks
could go out in 54321 order rather than 12345, causing media heads
to bounce unnecessarily.

This makes no difference for small IO operations, but large writes
benefit greatly when they add lots of indirect blocks to a
transaction, and those blocks are allocated in ascending order,
as the block allocator tries to do.

This patch changes it to list_add_tail so they are traversed (and
therefore written back) in the same order as they are added to
the transaction.

In one of the more extreme examples, I did a test where I had
10 simultaneous instances of iozone, each writing 25GB of data,
using one of our performance testing machines. The results are:

 Without the patch   With the patch
 -   --
Children see throughput  1395068.00 kB/sec   1527073.61 kB/sec
Parent sees throughput   1348544.00 kB/sec   1485594.66 kB/sec

These numbers are artificially inflated because I was also running
with Andreas Gruenbacher's iomap-write patch set plus my rgrp
sharing patch set in both these cases. Still, it shows a 9 percent
performance boost for both children and parent throughput.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/trans.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index c75cacaa349b..c4c00b90935c 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -247,7 +247,7 @@ void gfs2_trans_add_meta(struct gfs2_glock *gl, struct 
buffer_head *bh)
gfs2_pin(sdp, bd->bd_bh);
mh->__pad0 = cpu_to_be64(0);
mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
-   list_add(>bd_list, >tr_buf);
+   list_add_tail(>bd_list, >tr_buf);
tr->tr_num_buf_new++;
 out_unlock:
gfs2_log_unlock(sdp);



Re: [Cluster-devel] [GFS2 PATCH] GFS2: Add to tail, not head, of transaction

2018-05-15 Thread Andreas Gruenbacher
On 9 May 2018 at 15:21, Bob Peterson  wrote:
> Hi,
>
> Before this patch, frunction gfs2_trans_add_meta called list_add
> to add a buffer to a transaction, tr_buf. Later, in the before_commit
> functions, it traversed the list in sequential order, which meant
> that they were processed in a sub-optimal order. For example, blocks
> could go out in 54321 order rather than 12345, causing media heads
> to bounce unnecessarily.

Hmm, the patch itself looks reasonable, but given that
gfs2_before_commit sorts the list by block number before iterating it,
the above explanation seems to be at least partially wrong. Could you
please clarify what's really going on?

> This makes no difference for small IO operations, but large writes
> benefit greatly when they add lots of indirect blocks to a
> transaction, and those blocks are allocated in ascending order,
> as the block allocator tries to do.
>
> This patch changes it to list_add_tail so they are traversed (and
> therefore written back) in the same order as they are added to
> the transaction.
>
> In one of the more extreme examples, I did a test where I had
> 10 simultaneous instances of iozone, each writing 25GB of data,
> using one of our performance testing machines. The results are:
>
>  Without the patch   With the patch
>  -   --
> Children see throughput  1395068.00 kB/sec   1527073.61 kB/sec
> Parent sees throughput   1348544.00 kB/sec   1485594.66 kB/sec
>
> These numbers are artificially inflated because I was also running
> with Andreas Gruenbacher's iomap-write patch set plus my rgrp
> sharing patch set in both these cases. Still, it shows a 9 percent
> performance boost for both children and parent throughput.
>
> Signed-off-by: Bob Peterson 
> ---
>  fs/gfs2/trans.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
> index c75cacaa349b..c4c00b90935c 100644
> --- a/fs/gfs2/trans.c
> +++ b/fs/gfs2/trans.c
> @@ -247,7 +247,7 @@ void gfs2_trans_add_meta(struct gfs2_glock *gl, struct 
> buffer_head *bh)
> gfs2_pin(sdp, bd->bd_bh);
> mh->__pad0 = cpu_to_be64(0);
> mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
> -   list_add(>bd_list, >tr_buf);
> +   list_add_tail(>bd_list, >tr_buf);
> tr->tr_num_buf_new++;
>  out_unlock:
> gfs2_log_unlock(sdp);
>

Thanks,
Andreas



[Cluster-devel] [GFS2 PATCH] GFS2: Add to tail, not head, of transaction

2018-05-09 Thread Bob Peterson
Hi,

Before this patch, frunction gfs2_trans_add_meta called list_add
to add a buffer to a transaction, tr_buf. Later, in the before_commit
functions, it traversed the list in sequential order, which meant
that they were processed in a sub-optimal order. For example, blocks
could go out in 54321 order rather than 12345, causing media heads
to bounce unnecessarily.

This makes no difference for small IO operations, but large writes
benefit greatly when they add lots of indirect blocks to a
transaction, and those blocks are allocated in ascending order,
as the block allocator tries to do.

This patch changes it to list_add_tail so they are traversed (and
therefore written back) in the same order as they are added to
the transaction.

In one of the more extreme examples, I did a test where I had
10 simultaneous instances of iozone, each writing 25GB of data,
using one of our performance testing machines. The results are:

 Without the patch   With the patch
 -   --
Children see throughput  1395068.00 kB/sec   1527073.61 kB/sec
Parent sees throughput   1348544.00 kB/sec   1485594.66 kB/sec

These numbers are artificially inflated because I was also running
with Andreas Gruenbacher's iomap-write patch set plus my rgrp
sharing patch set in both these cases. Still, it shows a 9 percent
performance boost for both children and parent throughput.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/trans.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c
index c75cacaa349b..c4c00b90935c 100644
--- a/fs/gfs2/trans.c
+++ b/fs/gfs2/trans.c
@@ -247,7 +247,7 @@ void gfs2_trans_add_meta(struct gfs2_glock *gl, struct 
buffer_head *bh)
gfs2_pin(sdp, bd->bd_bh);
mh->__pad0 = cpu_to_be64(0);
mh->mh_jid = cpu_to_be32(sdp->sd_jdesc->jd_jid);
-   list_add(>bd_list, >tr_buf);
+   list_add_tail(>bd_list, >tr_buf);
tr->tr_num_buf_new++;
 out_unlock:
gfs2_log_unlock(sdp);