Re: [ceph-users] Fwd: [ceph bad performance], can't find a bottleneck

2018-03-13 Thread Sergey Kotov
Hi, Maged

There is not a big difference in either case.

Performance of a 4-node pool with 5x PM863a per node is:
4k block size - 33-37k IOPS with krbd at the default queue_depth=128, and
42-51k IOPS at queue_depth=1024 (fio numjobs 128/256/512).
The same thing happens when we try to increase the rbd workload: with 3 rbd
images we still get the same total IOPS.
A dead end and a hard limit, it seems. :)
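
For reference, a rough sketch of the kind of run behind those numbers (the
pool/image name and rbd device number are placeholders, not the exact
commands from this thread):

rbd map rbd/test -o queue_depth=1024
fio --name=iops --rw=randwrite --bs=4k --ioengine=libaio --numjobs=128 \
    --filename=/dev/rbd0 --group_reporting

with numjobs varied over 128/256/512, and the image first tested with the
default queue_depth=128 mapping.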

Thank you!

2018-03-12 21:49 GMT+03:00 Maged Mokhtar :

> Hi,
>
> Try increasing the queue depth from the default of 128 to 1024:
>
> rbd map image-XX  -o queue_depth=1024
>
>
> Also, if you run multiple rbd images/fio tests, do you get higher combined
> performance?
>
> Maged

Re: [ceph-users] Fwd: [ceph bad performance], can't find a bottleneck

2018-03-12 Thread Maged Mokhtar
Hi, 

Try increasing the queue depth from the default of 128 to 1024:

rbd map image-XX  -o queue_depth=1024 

Also, if you run multiple rbd images/fio tests, do you get higher
combined performance?
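
A rough sketch of what I mean (pool/image names, device numbers and job
counts are placeholders):

rbd map rbd/test1 -o queue_depth=1024
rbd map rbd/test2 -o queue_depth=1024

fio --name=img1 --filename=/dev/rbd0 --rw=randwrite --bs=4k \
    --ioengine=libaio --numjobs=64 --runtime=60 --time_based \
    --group_reporting &
fio --name=img2 --filename=/dev/rbd1 --rw=randwrite --bs=4k \
    --ioengine=libaio --numjobs=64 --runtime=60 --time_based \
    --group_reporting &
wait

If the two runs together give roughly twice the IOPS of a single run, the
limit is per rbd device rather than in the cluster itself.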

Maged 

[ceph-users] Fwd: [ceph bad performance], can't find a bottleneck

2018-03-12 Thread Sergey Kotov
Dear moderator, I subscribed to the ceph list today; could you please post
my message?

-- Forwarded message --
From: Sergey Kotov 
Date: 2018-03-06 10:52 GMT+03:00
Subject: [ceph bad performance], can't find a bottleneck
To: ceph-users@lists.ceph.com
Cc: Житенев Алексей , Anna Anikina 


Good day.

Could you please help us find the bottleneck in our Ceph installations?
We have 3 SSD-only clusters with different hardware, but the situation is
the same: overall I/O between the client and Ceph is lower than 1/6 of the
combined performance of all the SSDs.

For example:
One of our clusters has 4 nodes with Toshiba 2TB enterprise SSDs, installed
on Ubuntu Server 16.04.
The servers are connected to 10G switches. Latency between nodes is about
0.1 ms. Ethernet utilisation is low.

# uname -a
Linux storage01 4.4.0-101-generic #124-Ubuntu SMP Fri Nov 10 18:29:59 UTC
2017 x86_64 x86_64 x86_64 GNU/Linux

# ceph osd versions
{
"ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e)
luminous (stable)": 55
}


When we map an rbd image directly on the storage nodes via krbd, performance
is not good enough.
We use fio for testing. Even when we run a 4k randwrite test with many
threads, our drives never exceed 30% utilisation and their latency is fine.

At the same time, iostat shows 100% utilisation on /dev/rbdX.
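
For reference, the utilisation figures come from iostat in extended mode run
alongside fio, e.g.:

iostat -x 1

comparing %util of the rbd device against %util of the underlying OSD data
disks.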

Also, we can't enable rbd_cache, because we use SCST iSCSI on top of the
mapped rbd images.

How can we resolve the issue?

Ceph config:

[global]
fsid = beX482fX-6a91-46dX-ad22-21a8a2696abX
mon_initial_members = storage01, storage02, storage03
mon_host = X.Y.Z.1,X.Y.Z.2,X.Y.Z.3
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = X.Y.Z.0/24
filestore_xattr_use_omap = true
osd_pool_default_size = 2
osd_pool_default_min_size = 1
osd_pool_default_pg_num = 1024
osd_journal_size = 10240
osd_mkfs_type = xfs
filestore_op_threads = 16
filestore_wbthrottle_enable = False
throttler_perf_counter = False
osd crush update on start = false

[osd]
osd_scrub_begin_hour = 1
osd_scrub_end_hour = 6
osd_scrub_priority = 1

osd_enable_op_tracker = False
osd_max_backfills = 1
osd heartbeat grace = 20
osd heartbeat interval = 5
osd recovery max active = 1
osd recovery max single start = 1
osd recovery op priority = 1
osd recovery threads = 1
osd backfill scan max = 16
osd backfill scan min = 4
osd max scrubs = 1
osd scrub interval randomize ratio = 1.0
osd disk thread ioprio class = idle
osd disk thread ioprio priority = 0
osd scrub chunk max = 1
osd scrub chunk min = 1
osd deep scrub stride = 1048576
osd scrub load threshold = 5.0
osd scrub sleep = 0.1

[client]
rbd_cache = false


Sample fio tests:

root@storage04:~# fio --name iops --rw randread --bs 4k --filename
/dev/rbd2 --numjobs 12 --ioengine=libaio --group_reporting
iops: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
...
fio-2.2.10
Starting 12 processes
^Cbs: 12 (f=12): [r(12)] [1.2% done] [128.4MB/0KB/0KB /s] [32.9K/0/0 iops]
[eta 16m:49s]
fio: terminating on signal 2

iops: (groupid=0, jobs=12): err= 0: pid=29812: Sun Feb 11 23:59:19 2018
  read : io=1367.8MB, bw=126212KB/s, iops=31553, runt= 11097msec
slat (usec): min=1, max=59700, avg=375.92, stdev=495.19
clat (usec): min=0, max=377, avg= 1.12, stdev= 3.16
 lat (usec): min=1, max=59702, avg=377.61, stdev=495.32
clat percentiles (usec):
 |  1.00th=[0],  5.00th=[0], 10.00th=[1], 20.00th=[1],
 | 30.00th=[1], 40.00th=[1], 50.00th=[1], 60.00th=[1],
 | 70.00th=[1], 80.00th=[1], 90.00th=[1], 95.00th=[2],
 | 99.00th=[2], 99.50th=[2], 99.90th=[   73], 99.95th=[   78],
 | 99.99th=[  115]
bw (KB  /s): min= 8536, max=11944, per=8.33%, avg=10516.45, stdev=635.32
lat (usec) : 2=91.74%, 4=7.93%, 10=0.14%, 20=0.09%, 50=0.01%
lat (usec) : 100=0.07%, 250=0.03%, 500=0.01%
  cpu  : usr=1.32%, sys=3.69%, ctx=329556, majf=0, minf=134
  IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
 issued: total=r=350144/w=0/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
 latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: io=1367.8MB, aggrb=126212KB/s, minb=126212KB/s, maxb=126212KB/s,
mint=11097msec, maxt=11097msec

Disk stats (read/write):
  rbd2: ios=323072/0, merge=0/0, ticks=124268/0, in_queue=124680,
util=99.31%


root@storage04:~# fio --name iops --rw randwrite --bs 4k --filename
/dev/rbd2 --numjobs 12 --ioengine=libaio --group_reporting
iops: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=1
...
fio-2.2.10
Starting 12 processes
^Cbs: 12 (f=12): [w(12)] [25.0% done] [0KB/713.5MB/0KB /s] [0/183K/0 iops]
[eta 00m:45s]
fio: