Re: [ceph-users] ceph luminous performance - disks at 100% , low network utilization
Unfortunately, even after removing all my kernel configuration changes, the performance did not improve.

Currently:
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet net.ifnames=0 biosdevname=0 ipv6.disable=1"

Before:
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet net.ifnames=0 biosdevname=0 ipv6.disable=1 intel_pstate=disable intel_idle.max_cstate=0 processor.max_cstate=0 idle=poll numa=off"

This is extremely puzzling - any ideas or suggestions for troubleshooting will be GREATLY appreciated.

Steven
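For what it is worth, a GRUB_CMDLINE_LINUX change only takes effect after the grub config is regenerated and the node rebooted, so it is worth confirming the old flags are really gone from the running kernel. A rough check, assuming a RHEL/CentOS-style grub2 setup (which the crashkernel=auto / rhgb flags suggest):

grub2-mkconfig -o /boot/grub2/grub.cfg    # BIOS boot path; EFI systems use /boot/efi/EFI/<distro>/grub.cfg instead
reboot
cat /proc/cmdline                         # intel_pstate=disable, idle=poll, numa=off etc. should no longer appear

If /proc/cmdline still lists the old parameters, unchanged benchmark numbers would be expected.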
Re: [ceph-users] ceph luminous performance - disks at 100% , low network utilization
Hi Marc,

Thanks.
My pools are using replication = 2.

I'll re-enable numa and report back.

Steven
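Two quick sanity checks that may be worth running alongside that. The commands are standard ceph/numactl CLI, but the pool name "rbd" is an assumption here:

ceph osd pool get rbd size        # confirm the replication size actually set on the pool
ceph osd pool get rbd min_size    # min_size also matters when an OSD is out
numactl --hardware                # a single-socket R620 should report just one NUMA node

With only one CPU populated, numa=off versus numad should make little practical difference either way.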
Re: [ceph-users] ceph luminous performance - disks at 100% , low network utilization
Not sure if this info is of any help; please be aware I am also just in a
testing phase with ceph.

I don't know how numa=off is interpreted by the OS. If it just hides the
NUMA topology, you could still run into the 'known issues'. That is why I
have numad running.
Furthermore, I have put an OSD 'out' here, and that also gives a 0 in the
reweight column. So I guess your osd.1 is also not participating? If so,
would it not be a problem if you are testing 3x replication with only 2 disks?

I have got this on SATA 5400rpm disks, replicated pool size 3.

rados bench -p rbd 30 write --id rbd
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat(s)  avg lat(s)
   20      16       832       816   163.178       180     0.157838    0.387074
   21      16       867       851   162.073       140     0.157289     0.38817
   22      16       900       884   160.705       132     0.224024    0.393674
   23      16       953       937   162.934       212     0.530274    0.388189
   24      16       989       973   162.144       144     0.209806    0.389644
   25      16      1028      1012   161.898       156     0.118438    0.391057
   26      16      1067      1051    161.67       156     0.248463     0.38977
   27      16      1112      1096   162.348       180     0.754184    0.392159
   28      16      1143      1127   160.977       124     0.439342    0.393641
   29      16      1185      1169   161.219       168    0.0801006    0.393004
   30      16      1221      1205   160.644       144     0.224278     0.39363
Total time run:         30.339270
Total writes made:      1222
Write size:             4194304
Object size:            4194304
Bandwidth (MB/sec):     161.111
Stddev Bandwidth:       24.6819
Max bandwidth (MB/sec): 212
Min bandwidth (MB/sec): 120
Average IOPS:           40
Stddev IOPS:            6
Max IOPS:               53
Min IOPS:               30
Average Latency(s):     0.396239
Stddev Latency(s):      0.249998
Max latency(s):         1.29482
Min latency(s):         0.06875
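A side note on reading those rados bench numbers back: by default the write pass deletes its benchmark objects when it finishes, so a read pass needs the write to be run with --no-cleanup first. A minimal sketch, assuming the same pool and client id as above:

rados bench -p rbd 30 write --no-cleanup --id rbd   # keep the objects so a read test is possible
rados bench -p rbd 30 seq --id rbd                  # sequential read-back of the benchmark objects
rados -p rbd cleanup --id rbd                       # remove the leftover benchmark objects afterwards

Comparing write and seq bandwidth helps separate write-path cost from read performance.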
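Regarding osd.1 showing 0 in the reweight column: if it was simply marked out, bringing it back into data placement is normally just the following (OSD id taken from the osd tree above):

ceph osd in osd.1            # clears the 'out' state; reweight goes back to 1
ceph osd reweight osd.1 1    # only needed if the reweight was lowered explicitly
ceph osd tree                # confirm REWEIGHT shows 1 and PGs start rebalancing

With only two OSDs actually holding data, a size=2 pool still works, but every write has to be committed by both disks, so the slower disk caps the whole benchmark.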
[ceph-users] ceph luminous performance - disks at 100% , low network utilization
Hi,

I have been struggling to get my test cluster to behave (from a performance perspective).
Dell R620, 64 GB RAM, 1 CPU, numa=off, PERC H710, RAID0, Enterprise 10K disks.
No SSD - just plain HDD.

Local tests (dd, hdparm) confirm my disks are capable of delivering 200 MB/s
Fio with 15 jobs indicates 100 MB/s
ceph tell shows 400 MB/s

rados bench with 1 thread provides 3 MB/s
rados bench with 32 threads, 2 OSDs (one per server), barely touches 10 MB/s
Adding a third server / OSD improves performance slightly (11 MB/s)

atop shows disk usage at 100% for extended periods of time
Network usage is very low
Nothing else is "red"

I have removed all TCP settings and left ceph.conf mostly at defaults.

What am I missing?

Many thanks
Steven


ceph osd tree
ID  CLASS WEIGHT  TYPE NAME   STATUS REWEIGHT PRI-AFF
  0   hdd 0.54529     osd.0       up      1.0     1.0
 -5       0.54529 host osd02
  1   hdd 0.54529     osd.1       up        0     1.0
 -7             0 host osd04
-17       0.54529 host osd05
  2   hdd 0.54529     osd.2       up      1.0     1.0

[root@osd01 ~]# ceph tell osd.0 bench
{
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "bytes_per_sec": 452125657
}

[root@osd01 ~]# ceph tell osd.2 bench
{
    "bytes_written": 1073741824,
    "blocksize": 4194304,
    "bytes_per_sec": 340553488
}

hdparm -tT /dev/sdc

/dev/sdc:
 Timing cached reads:   5874 MB in  1.99 seconds = 2948.51 MB/sec
 Timing buffered disk reads: 596 MB in  3.01 seconds = 198.17 MB/sec

fio --filename=/dev/sdc --direct=1 --sync=1 --rw=write --bs=4k --numjobs=15 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.2.8
Starting 15 processes
Jobs: 15 (f=15): [W(15)] [100.0% done] [0KB/104.9MB/0KB /s] [0/26.9K/0 iops] [eta 00m:00s]

fio --filename=/dev/sdc --direct=1 --sync=1 --rw=write --bs=4k --numjobs=5 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
fio-2.2.8
Starting 5 processes
Jobs: 5 (f=5): [W(5)] [100.0% done] [0KB/83004KB/0KB /s] [0/20.8K/0 iops] [eta 00m:00s]
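One extra data point that may help interpret the gap between those numbers: ceph tell osd bench writes 1 GB in 4 MB blocks, and 452125657 bytes/sec comes out to roughly 430 MiB/s, more than double the ~200 MB/s the hdparm buffered-read test shows for the spindle, so the PERC write-back cache is probably absorbing much of it. rados bench with one thread, on the other hand, sends one 4 MB object at a time and waits for it to be committed before sending the next. A rough way to see what one synchronous 4 MB writer gets from the raw disk - this is only a sketch, and like the fio runs above it writes directly to /dev/sdc, so only use it on a disk whose contents can be destroyed:

fio --filename=/dev/sdc --direct=1 --sync=1 --rw=write --bs=4m \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting \
    --name=single-sync-writer

Whatever that reports is the best case for a single writer; the rados path adds journal/WAL writes on the same spindle (depending on filestore vs bluestore), replication to the second OSD, and commit round-trips on top, which is where the drop to 3 MB/s has to come from.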