Hi David,

Thanks for posting those results.

From the fio runs, I see you are getting around 200 IOPS at a 128KB write IO
size. I would expect somewhere around 200-300 IOPS for the cluster you
described in your initial post, so it looks like it's performing about
right.

200 IOPS * 128KB is around 25MB/s, so that's about as good as you are going
to get, even in an ideal environment with a high queue depth.

The iostat info shows that each ZFS receive is more or less doing
single-threaded writes, which is why you are seeing such slow performance
for a single ZFS receive and why running 8 of them scales accordingly. It
looks like each ZFS operation has to wait for the previous one to finish
before the next is submitted.

There are a couple of options; I know some are not applicable, but I'm
listing them for completeness.

1. SSD journals will give you a significant boost, maybe 5-6x.
2. Doubling the number of disks will probably double performance, if you can
get the queue depth high enough (i.e. more concurrent ZFS receives); see the
fio sketch just below.
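
As a rough way to see what more concurrency would buy you, you could emulate
several single-threaded 128KB writers from the client with fio. This is just
a sketch; the file name and size are placeholders, so point it at the same
test file/device you used for your earlier runs:

  # 8 parallel single-threaded 128KB random writers
  fio --name=concurrent-writers --filename=testfile --size=8G \
      --ioengine=libaio --direct=1 --rw=randwrite --bs=128k \
      --iodepth=1 --numjobs=8 --group_reporting

With iodepth=1 per job, each process behaves roughly like one of your ZFS
receives, and numjobs controls how many run in parallel.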

These next two are specific to ZFS, and I'm not sure if they are possible as
I don't have much knowledge of ZFS:

1. Make ZFS receive do larger IOs. If your cluster can do ~40 IOPS per
thread, then RBD bandwidth will scale with increasing IO sizes.
2. Somehow get ZFS to coalesce the writes so that they are written to the
RBD at higher queue depths and larger block sizes. I'm not sure if there are
ZIL parameters that can be changed to achieve this; see the sketch below for
the sort of thing I mean.
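
Purely as an illustration of the sort of ZFS knobs involved (I haven't
tested these, the dataset name is a placeholder, and they may not apply to
your ZFS version/platform, so treat them as things to research rather than
recommendations):

  # Bias the intent log towards throughput rather than latency for this dataset
  zfs set logbias=throughput tank/backups

  # recordsize controls the dataset block size; larger records should mean
  # larger IOs hitting the RBD (128K is the default; values above that need
  # the large_blocks pool feature where available)
  zfs get recordsize tank/backups

  # On ZFS on Linux, how often transaction groups are flushed to disk (seconds)
  cat /sys/module/zfs/parameters/zfs_txg_timeout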

You can see the sort of effect you could achieve with these last two by
running a fio job with iodepth=8 and bs=1M.
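
Something along these lines should show it; again the file name and size are
placeholders, so reuse whatever you pointed fio at before:

  # Larger IOs at a moderate queue depth
  fio --name=large-io --filename=testfile --size=30G \
      --ioengine=libaio --direct=1 --rw=randwrite --bs=1M \
      --iodepth=8

If the bandwidth comes out well above the ~25MB/s you are seeing now, that
gives you an idea of how much headroom the ZFS-side changes could unlock.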

Nick

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> J David
> Sent: 24 April 2015 01:20
> To: Mark Nelson
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Having trouble getting good performance
> 
> On Thu, Apr 23, 2015 at 4:23 PM, Mark Nelson <mnel...@redhat.com>
> wrote:
> > If you want to adjust the iodepth, you'll need to use an asynchronous
> > ioengine like libaio (you also need to use direct=1)
> 
> Ah yes, libaio makes a big difference.  With 1 job:
> 
> testfile: (g=0): rw=randwrite, bs=128K-128K/128K-128K/128K-128K,
> ioengine=libaio, iodepth=64
> fio-2.1.3
> Starting 1 process
> 
> testfile: (groupid=0, jobs=1): err= 0: pid=6290: Thu Apr 23 20:43:27 2015
>   write: io=30720MB, bw=28503KB/s, iops=222, runt=1103633msec
>     slat (usec): min=12, max=1049.4K, avg=2427.89, stdev=13913.04
>     clat (msec): min=4, max=1975, avg=284.97, stdev=268.71
>      lat (msec): min=4, max=1975, avg=287.40, stdev=268.37
>     clat percentiles (msec):
>      |  1.00th=[    7],  5.00th=[   11], 10.00th=[   20], 20.00th=[   36],
>      | 30.00th=[   60], 40.00th=[  120], 50.00th=[  219], 60.00th=[  318],
>      | 70.00th=[  416], 80.00th=[  519], 90.00th=[  652], 95.00th=[  766],
>      | 99.00th=[ 1090], 99.50th=[ 1221], 99.90th=[ 1516], 99.95th=[ 1598],
>      | 99.99th=[ 1860]
>     bw (KB  /s): min=  236, max=170082, per=100.00%, avg=29037.74,
> stdev=15788.85
>     lat (msec) : 10=4.63%, 20=5.77%, 50=16.59%, 100=10.64%, 250=15.40%
>     lat (msec) : 500=25.38%, 750=15.89%, 1000=4.00%, 2000=1.70%
>   cpu          : usr=0.37%, sys=1.00%, ctx=99920, majf=0, minf=27
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
> >=64=100.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
> >=64=0.0%
>      issued    : total=r=0/w=245760/d=0, short=r=0/w=0/d=0
> 
> Run status group 0 (all jobs):
>   WRITE: io=30720MB, aggrb=28503KB/s, minb=28503KB/s, maxb=28503KB/s,
> mint=1103633msec, maxt=1103633msec
> 
> Disk stats (read/write):
>   vdb: ios=0/246189, merge=0/219, ticks=0/67559576, in_queue=67564864,
> util=100.00%
> 
> With 2 jobs:
> 
> testfile: (g=0): rw=randwrite, bs=128K-128K/128K-128K/128K-128K,
> ioengine=libaio, iodepth=64
> testfile: (g=0): rw=randwrite, bs=128K-128K/128K-128K/128K-128K,
> ioengine=libaio, iodepth=64
> fio-2.1.3
> Starting 2 processes
> 
> testfile: (groupid=0, jobs=2): err= 0: pid=6394: Thu Apr 23 21:24:09 2015
>   write: io=46406MB, bw=26384KB/s, iops=206, runt=1801073msec
>     slat (usec): min=11, max=3457.7K, avg=9589.56, stdev=44841.01
>     clat (msec): min=5, max=5256, avg=611.29, stdev=507.51
>      lat (msec): min=5, max=5256, avg=620.88, stdev=510.21
>     clat percentiles (msec):
>      |  1.00th=[   25],  5.00th=[   62], 10.00th=[  102], 20.00th=[  192],
>      | 30.00th=[  293], 40.00th=[  396], 50.00th=[  502], 60.00th=[  611],
>      | 70.00th=[  742], 80.00th=[  930], 90.00th=[ 1254], 95.00th=[ 1582],
>      | 99.00th=[ 2376], 99.50th=[ 2769], 99.90th=[ 3687], 99.95th=[ 4080],
>      | 99.99th=[ 4686]
>     bw (KB  /s): min=   98, max=108111, per=53.88%, avg=14214.41,
> stdev=10031.64
>     lat (msec) : 10=0.24%, 20=0.46%, 50=2.85%, 100=6.27%, 250=16.04%
>     lat (msec) : 500=24.00%, 750=20.47%, 1000=12.35%, 2000=15.14%,
> >=2000=2.17%
>   cpu          : usr=0.18%, sys=0.49%, ctx=291909, majf=0, minf=55
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
> >=64=100.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
> >=64=0.0%
>      issued    : total=r=0/w=371246/d=0, short=r=0/w=0/d=0
> 
> Run status group 0 (all jobs):
>   WRITE: io=46406MB, aggrb=26383KB/s, minb=26383KB/s, maxb=26383KB/s,
> mint=1801073msec, maxt=1801073msec
> 
> Disk stats (read/write):
>   vdb: ios=0/371958, merge=0/358, ticks=0/111668288, in_queue=111672480,
> util=100.00%
> 
> And here is some "iostat -xt 10" from the start of the ZFS machine doing a
> snapshot receive:  (vdb = the Ceph RBD)
> 
> 04/24/2015 12:12:50 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.10    0.00    0.30    0.00    0.00   99.60
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.00    0.10     0.00     0.40
> 8.00     0.00    0.00    0.00    0.00   0.00   0.00
> vdb               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:13:00 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.60    0.00    1.20    9.27    0.00   88.93
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.20    1.70     2.40     6.80
> 9.68     0.01    3.37   20.00    1.41   3.37   0.64
> vdb               0.00     0.00    0.20   13.50     0.50   187.10
> 27.39     0.26   18.86  112.00   17.48  13.55  18.56
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:13:10 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            6.97    0.00    4.46   70.78    0.05   17.74
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     1.10    0.00    0.50     0.00     6.40
> 25.60     0.00    8.80    0.00    8.80   8.80   0.44
> vdb               0.00     0.00   91.00   27.90   348.00   247.45
> 10.02     1.73   14.55   10.82   26.74   8.32  98.88
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:13:20 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            3.52    0.00    4.52   72.23    0.10   19.64
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.20    0.00    0.40     0.00     2.40
> 12.00     0.00    9.00    0.00    9.00   9.00   0.36
> vdb               0.00     0.00  107.30   42.00   299.75  3150.00
> 46.21     2.18   14.57   10.93   23.88   6.68  99.68
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:13:30 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            3.32    0.00    6.10   81.31    0.10    9.17
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.20    0.00    0.40     0.00     2.40
> 12.00     0.00    9.00    0.00    9.00   9.00   0.36
> vdb               0.00     0.00  111.50   40.30   342.05  2023.25
> 31.16     2.03   13.37    9.55   23.92   6.46  98.04
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:13:40 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            3.77    0.00    4.63   77.62    0.05   13.93
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.00    1.30     0.00     5.20
> 8.00     0.01    5.54    0.00    5.54   5.54   0.72
> vdb               0.00     0.00   99.20   42.70   362.30  1653.00
> 28.40     2.10   14.67   11.04   23.09   7.02  99.68
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:13:50 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.70    0.00    1.96   93.98    0.10    3.26
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.20    0.00    0.40     0.00     2.40
> 12.00     0.01   15.00    0.00   15.00  15.00   0.60
> vdb               0.00     0.00   62.20   41.20   128.75  1604.05
> 33.52     2.05   20.03   16.49   25.38   9.67  99.96
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> From the above, avgqu-sz seems to park at 2.  With 4 receives running
> simultaneously, it looks like this:
> 
> 04/24/2015 12:18:20 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           53.38    0.00   26.97   15.29    2.56    1.80
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.20    0.00    2.30     0.00     9.60
> 8.35     0.00    3.65    0.00    3.65   1.04   0.24
> vdb               0.00     0.00  244.90  117.20  1720.30  8975.55
> 59.08    10.80   29.71   12.49   65.71   2.71  98.28
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:18:30 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           16.48    0.00   11.05   44.73    0.21   27.53
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> vdb               0.00     0.00  119.10  155.40   609.55  1597.00
> 16.08    11.10   40.92    9.59   64.93   3.64 100.00
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:18:40 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            7.76    0.00   28.24   40.30    0.10   23.60
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> vdb               0.00     0.00  106.60  105.10   619.85  2152.20
> 26.19     8.51   40.04   11.31   69.17   4.15  87.96
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:18:50 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           12.34    0.00   24.95   52.41    0.05   10.25
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> vdb               0.00     0.00  171.90  144.20   834.75 14364.55
> 96.17    11.74   37.27   10.79   68.82   3.16  99.96
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:19:00 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           40.51    0.00   26.40   30.74    0.05    2.30
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> vdb               0.00     0.00  163.40  149.70  1415.25 15792.25
> 109.92    12.00   38.26   14.11   64.63   3.16  99.04
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:19:10 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           20.83    0.00    7.27   45.39    0.46   26.05
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> vdb               0.00     0.00   42.00  148.20   274.05 15151.65
> 162.21    10.39   54.74   10.75   67.21   5.25  99.88
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> 04/24/2015 12:19:20 AM
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           23.65    0.00    7.06   47.36    0.41   21.52
> 
> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> vda               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> vdb               0.00     0.00   18.00  160.40   141.85  7149.35
> 81.74    10.25   57.28   16.56   61.85   5.61 100.00
> vdc               0.00     0.00    0.00    0.00     0.00     0.00
> 0.00     0.00    0.00    0.00    0.00   0.00   0.00
> 
> Thanks!
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



