Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Hi Vitaliy,

The drive is a Micron 5200 ECO 3.84TB. This is from the msecli utility:

Device Name : /dev/sde
Model No : Micron_5200_MTFDDAK3T8TDC
Serial No :
FW-Rev : D1MU404
Total Size : 3840.00GB
Drive Status : Drive is in good health
Sata Link Speed : Gen3 (6.0 Gbps)
Sata Link Max Speed : Gen3 (6.0 Gbps)
Temp(C) : 26

The server motherboard is a SuperMicro X10DRU-i+. Drives are connected to the SATA connectors on the motherboard. Processors are Xeon E5-2690v4.

Eric

From: Виталий Филиппов [mailto:vita...@yourcmc.ru]
Sent: Tuesday, January 21, 2020 3:43 PM
To: Eric K. Miller
Cc: ceph-users@lists.ceph.com
Subject: RE: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

Hi! Thanks. The parameter gets reset when you reconnect the SSD, so in fact it requires that you do NOT power cycle it after changing the parameter :-) Ok, this case seems lucky, a ~2x change isn't a lot. Can you tell the exact model and capacity of this Micron, and what controller was used in this test? I'll add it to the spreadsheet.

--
With best regards,
Vitaliy Filippov
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Hi! Thanks. The parameter gets reset when you reconnect the SSD, so in fact it requires that you do NOT power cycle it after changing the parameter :-) Ok, this case seems lucky, a ~2x change isn't a lot. Can you tell the exact model and capacity of this Micron, and what controller was used in this test? I'll add it to the spreadsheet.

--
With best regards,
Vitaliy Filippov
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
We were able to isolate an individual Micron 5200 and perform Vitaliy's tests from his spreadsheet. An interesting item: write cache changes do NOT require a power cycle to take effect, at least on a Micron 5200.

The complete fio results for the individual tests, with the write cache both enabled and disabled, are included at the end of this message. The shortened version of the results:

Journal IOPS (sync=1) -
Write cache ON:  write: IOPS=19.7k, BW=76.0MiB/s (80.7MB/s)(4618MiB/60001msec)
                 lat (usec): min=42, max=1273, avg=50.18, stdev= 6.40
Write cache OFF: write: IOPS=32.3k, BW=126MiB/s (132MB/s)(7560MiB/60001msec)
                 lat (usec): min=25, max=7079, avg=30.55, stdev= 7.94

Journal IOPS (fsync=1) -
Write cache ON:  write: IOPS=16.9k, BW=66.2MiB/s (69.4MB/s)(3971MiB/60001msec)
                 lat (usec): min=24, max=5068, avg=31.77, stdev= 7.82
Write cache OFF: write: IOPS=32.1k, BW=126MiB/s (132MB/s)(7533MiB/60001msec)
                 lat (usec): min=24, max=7076, avg=29.41, stdev= 7.52

Parallel random (sync) -
Write cache ON:  write: IOPS=43.9k, BW=172MiB/s (180MB/s)(10.1GiB/60001msec)
                 lat (usec): min=220, max=14767, avg=727.61, stdev=313.36
Write cache OFF: write: IOPS=44.3k, BW=173MiB/s (181MB/s)(10.1GiB/60001msec)
                 lat (usec): min=134, max=4941, avg=721.96, stdev=311.46

Parallel random (fsync) -
Write cache ON:  write: IOPS=44.4k, BW=173MiB/s (182MB/s)(10.2GiB/60001msec)
                 lat (usec): min=109, max=4349, avg=703.01, stdev=303.69
Write cache OFF: write: IOPS=44.6k, BW=174MiB/s (183MB/s)(10.2GiB/60001msec)
                 lat (usec): min=26, max=7288, avg=716.32, stdev=300.48

Non-txn random -
Write cache ON:  write: IOPS=43.1k, BW=168MiB/s (177MB/s)(9.87GiB/60004msec)
                 lat (usec): min=350, max=41703, avg=2967.89, stdev=1682.28
Write cache OFF: write: IOPS=43.4k, BW=170MiB/s (178MB/s)(9.93GiB/60004msec)
                 lat (usec): min=177, max=42795, avg=2947.52, stdev=1666.24

Linear write -
Write cache ON:  write: IOPS=126, BW=505MiB/s (530MB/s)(29.6GiB/60027msec)
                 lat (msec): min=226, max=281, avg=253.26, stdev= 3.51
Write cache OFF: write: IOPS=126, BW=507MiB/s (531MB/s)(29.8GiB/60254msec)
                 lat (msec): min=7, max=492, avg=252.52, stdev=13.16

So, while some improvement can be seen with the write cache disabled (specifically on a Micron 5200), it is probably not enough to change much in terms of Ceph's performance unless journal latency, IOPS, and bandwidth are a bottleneck. The "Journal IOPS (sync=1)" test shows the most dramatic difference: disabling the write cache reduces the average I/O latency by 39% (from 50.18us to 30.55us, a difference of about 0.02ms), which correspondingly raises the IOPS and throughput of synchronous I/O. The "Journal IOPS (fsync=1)" test also shows a dramatic difference, but in terms of IOPS and throughput (approximately +90%) rather than latency.

Hope this helps! I would love to hear feedback.
Eric

###
# Journal IOPS (sync=1)
###

# Write cache ENABLED
hdparm -W 1 /dev/sde
fio --ioengine=libaio -sync=1 --direct=1 --name=test --bs=4k --iodepth=1 --readwrite=write --runtime 60 --filename=/dev/sde

test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=75.6MiB/s][r=0,w=19.3k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=38269: Tue Jan 21 14:20:01 2020
  write: IOPS=19.7k, BW=76.0MiB/s (80.7MB/s)(4618MiB/60001msec)
    slat (usec): min=2, max=180, avg= 4.43, stdev= 1.86
    clat (nsec): min=1950, max=1262.3k, avg=45662.55, stdev=5778.88
     lat (usec): min=42, max=1273, avg=50.18, stdev= 6.40
    clat percentiles (usec):
     |  1.00th=[ 42],  5.00th=[ 42], 10.00th=[ 43], 20.00th=[ 43],
     | 30.00th=[ 43], 40.00th=[ 44], 50.00th=[ 44], 60.00th=[ 45],
     | 70.00th=[ 47], 80.00th=[ 48], 90.00th=[ 51], 95.00th=[ 55],
     | 99.00th=[ 66], 99.50th=[ 74], 99.90th=[ 91], 99.95th=[ 104],
     | 99.99th=[ 167]
   bw (  KiB/s): min=70152, max=81704, per=100.00%, avg=78835.97, stdev=2929.71, samples=119
   iops        : min=17538, max=20426, avg=19708.98, stdev=732.40, samples=119
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=88.53%
  lat (usec)   : 100=11.41%, 250=0.06%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%
  cpu          : usr=3.07%, sys=13.62%, ctx=1182324, majf=0, minf=27
  IO depths
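(For completeness: the full output for the write-cache-disabled runs and the fsync variant was truncated in the archive above, so here is only a sketch of what the corresponding commands would look like, assuming the same drive at /dev/sde -- this is a reconstruction for readers, not a copy of Eric's exact invocations.)

# Write cache DISABLED
hdparm -W 0 /dev/sde
fio --ioengine=libaio -sync=1 --direct=1 --name=test --bs=4k --iodepth=1 --readwrite=write --runtime 60 --filename=/dev/sde

# Journal IOPS (fsync=1) variant of the same test
fio --ioengine=libaio --fsync=1 --direct=1 --name=test --bs=4k --iodepth=1 --readwrite=write --runtime 60 --filename=/dev/sde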
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
OK, now I understand. Yes, the cache setting will take effect immediately. It's more a question of whether you trust the disk firmware to apply the change correctly in all situations while production IO is active at the same time (will the volatile cache be flushed correctly or not)? I would not, and would rather change the setting while the OSD is down. During benchmarks on raw disks I just switched the cache on and off as needed. There was nothing running on the disks and the fio benchmark is destructive anyway.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

From: Sasha Litvak
Sent: 21 January 2020 10:19
To: Frank Schilder
Cc: ceph-users
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

Frank,

Sorry for the confusion. I thought that turning off the cache using hdparm -W 0 /dev/sdx takes effect right away, and that in the case of non-RAID controllers and Seagate or Micron SSDs I would see a difference when starting an fio benchmark right after executing hdparm. So I wonder whether it makes a difference if the cache is turned off before the OSD is started or after.

On Tue, Jan 21, 2020, 2:07 AM Frank Schilder <fr...@dtu.dk> wrote:

> So hdparm -W 0 /dev/sdx doesn't work or it makes no difference?

I wrote "We found the raw throughput in fio benchmarks to be very different for write-cache enabled and disabled, exactly as explained in the performance article.", so yes, it makes a huge difference.

> Also I am not sure I understand why it should happen before OSD have been started.
> At least in my experience hdparm does it to hardware regardless.

I'm not sure I understand this question. Ideally it happens at boot time and if this doesn't work, at least some time before the OSD is started. Why and how else would one want this to happen?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
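(As a concrete illustration of the "change the setting while the OSD is down" approach Frank describes above -- a minimal sketch only, assuming a systemd-managed OSD and that /dev/sdx backs OSD <id>; the OSD id is a placeholder, not taken from anyone's actual procedure:)

systemctl stop ceph-osd@<id>    # take the OSD down first
hdparm -W /dev/sdx              # show the current write cache state
hdparm -W 0 /dev/sdx            # disable the volatile write cache
systemctl start ceph-osd@<id>   # bring the OSD back up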
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Frank,

Sorry for the confusion. I thought that turning off the cache using hdparm -W 0 /dev/sdx takes effect right away, and that in the case of non-RAID controllers and Seagate or Micron SSDs I would see a difference when starting an fio benchmark right after executing hdparm. So I wonder whether it makes a difference if the cache is turned off before the OSD is started or after.

On Tue, Jan 21, 2020, 2:07 AM Frank Schilder wrote:
>
> > So hdparm -W 0 /dev/sdx doesn't work or it makes no difference?
>
> I wrote "We found the raw throughput in fio benchmarks to be very
> different for write-cache enabled and disabled, exactly as explained in the
> performance article.", so yes, it makes a huge difference.
>
> > Also I am not sure I understand why it should happen before OSD have
> > been started.
> > At least in my experience hdparm does it to hardware regardless.
>
> I'm not sure I understand this question. Ideally it happens at boot time
> and if this doesn't work, at least some time before the OSD is started. Why
> and how else would one want this to happen?
>
> Best regards,
>
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
> So hdparm -W 0 /dev/sdx doesn't work or it makes no difference?

I wrote "We found the raw throughput in fio benchmarks to be very different for write-cache enabled and disabled, exactly as explained in the performance article.", so yes, it makes a huge difference.

> Also I am not sure I understand why it should happen before OSD have been started.
> At least in my experience hdparm does it to hardware regardless.

I'm not sure I understand this question. Ideally it happens at boot time and if this doesn't work, at least some time before the OSD is started. Why and how else would one want this to happen?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Hi Vitaliy,

> You say you don't have access to raw drives. What does it mean? Do you
> run Ceph OSDs inside VMs? In that case you should probably disable
> Micron caches on the hosts, not just in VMs.

Sorry, I should have been more clear. This cluster is in production, so I needed to schedule a maintenance window to do some tests, including "out"ing an OSD and removing the OSD from Ceph, so I can perform some tests, as well as maintenance to remove a host from the cluster, so power off/on tests can be performed. Right now, all I have access to is the VM level and the ability to enable/disable the write cache on the 5200's using hdparm (but no read/write tests directly on the 5200's, of course, which would require destructive operations).

> Yes, disabling the write cache only takes place upon a power cycle... or
> upon the next hotplug of the drive itself.

I have a suspicion this is the reason we didn't see any change! :) Definitely an important item. Once I have test results, I will report back. It may be something you want to add to your wiki article.

> If you get the same ~15k or more iops with -rw=randwrite -fsync=1
> -iodepth=1 with both hdparm -W 0 and -W 1 you're good :) if you have
> cache problems you'll get much less.

Once I have a 5200 available to play with, I will definitely let you know the results.

> About Micron 5300's, please benchmark them when you have them as
> described here
> https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit
> (instructions in the end of the sheet)

Most definitely. I suspect it will be another month before we get them unfortunately. :(

Eric
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Hi Eric,

You say you don't have access to raw drives. What does it mean? Do you run Ceph OSDs inside VMs? In that case you should probably disable Micron caches on the hosts, not just in VMs.

Yes, disabling the write cache only takes place upon a power cycle... or upon the next hotplug of the drive itself.

In some cases - I mean, with some HBAs/RAID controllers - disabling the write cache may not have any impact on performance. As I understand it, this is because some controllers disable the drive write cache themselves by default. Just benchmark your drives with fio and compare IOPS with https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit

If you get the same ~15k or more iops with -rw=randwrite -fsync=1 -iodepth=1 with both hdparm -W 0 and -W 1 you're good :) if you have cache problems you'll get much less.

About Micron 5300's, please benchmark them when you have them as described here: https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit (instructions are at the end of the sheet).

> Hi Vitaliy,
>
> Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA version) in a Ceph cluster (Nautilus) and performance seems less than optimal. I have followed all instructions on your site (thank you for your wonderful article btw!!), but I haven't seen much change.
>
> The only thing I could think of is that "maybe" disabling the write cache only takes place upon a reboot or power cycle? Is that necessary? Or is it a "live" change?
>
> I have tested with the cache disabled as well as enabled on all drives. We're using fio running in a QEMU/KVM VM in an OpenStack cluster, so not "raw" access to the Micron 5200's. OSD (Bluestore) nodes run CentOS 7 using a 4.18.x kernel. Testing doesn't show any, or much, difference, enough that the variations could be considered "noise" in the results. Certainly no change that anyone could tell.
>
> Thought I'd check to see if you, or anyone else, might have any suggestions specific to the Micron 5200.
>
> We have some Micron 5300's inbound, but probably won't have them here for another few weeks due to Micron's manufacturing delays, so will be able to test these raw drives soon. I will report back after, but if you know anything about these, I'm all ears. :)
>
> Thank you!
>
> Eric
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stefan Bauer
> Sent: Tuesday, January 14, 2020 10:28 AM
> To: undisclosed-recipients
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
>
> Thank you all,
>
> performance is indeed better now. Can now go back to sleep ;)
>
> KR
>
> Stefan
>
> -Ursprüngliche Nachricht-
> Von: Виталий Филиппов
> Gesendet: Dienstag 14 Januar 2020 10:28
> An: Wido den Hollander ; Stefan Bauer
> CC: ceph-users@lists.ceph.com
> Betreff: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
>
> ...disable signatures and rbd cache. I didn't mention it in the email to not repeat myself. But I have it in the article :-)
> --
> With best regards,
> Vitaliy Filippov
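(To make the -W 0 vs -W 1 comparison Vitaliy describes above concrete, a minimal sketch of the two runs -- assuming the drive is /dev/sdX and is NOT currently backing an OSD, since writing to the raw device is destructive:)

# run 1: volatile write cache disabled
hdparm -W 0 /dev/sdX
fio --ioengine=libaio --direct=1 --fsync=1 --iodepth=1 --rw=randwrite --bs=4k \
    --runtime=60 --name=cache-off --filename=/dev/sdX

# run 2: volatile write cache enabled
hdparm -W 1 /dev/sdX
fio --ioengine=libaio --direct=1 --fsync=1 --iodepth=1 --rw=randwrite --bs=4k \
    --runtime=60 --name=cache-on --filename=/dev/sdX

# If both runs report roughly the same (and ~15k or more) IOPS, the drive
# handles flushes well; a much lower result with -W 1 indicates a cache problem.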
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
So hdparm -W 0 /dev/sdx doesn't work or it makes no difference?

Also I am not sure I understand why it should happen before the OSD has been started. At least in my experience hdparm does it to the hardware regardless.

On Mon, Jan 20, 2020, 2:25 AM Frank Schilder wrote:

> We are using Micron 5200 PRO, 1.92TB for RBD images on KVM and are very
> happy with the performance. We are using EC 6+2 pools, which really eat up
> IOPs. Still, we get enough performance out to run 20-50 VMs per disk, which
> results in good space utilisation as well since our default image size is
> 50GB and we take rolling snapshots. I was thinking about 4TB disks also,
> but am concerned that their IOPs/TB performance is too low for images on EC
> pools.
>
> We found the raw throughput in fio benchmarks to be very different for
> write-cache enabled and disabled, exactly as explained in the performance
> article. Changing write cache settings is a boot-time operation.
> Unfortunately, I couldn't find a reliable way to disable write cache at
> boot time (I was looking for tuned configs) and ended up adding this to a
> container startup script:
>
> if [[ "$1" == "osd_ceph_disk_activate" && -n "${OSD_DEVICE}" ]] ; then
>   echo "Disabling write cache on ${OSD_DEVICE}"
>   /usr/sbin/smartctl -s wcache=off "${OSD_DEVICE}"
> fi
>
> This works for both SAS and SATA drives and ensures that write cache is
> disabled before an OSD daemon starts.
>
> Best regards,
>
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________
> From: ceph-users on behalf of Eric K. Miller
> Sent: 19 January 2020 04:24:33
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can
> we expect more? [klartext]
>
> Hi Vitaliy,
>
> Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA
> version) in a Ceph cluster (Nautilus) and performance seems less than
> optimal. I have followed all instructions on your site (thank you for your
> wonderful article btw!!), but I haven't seen much change.
>
> The only thing I could think of is that "maybe" disabling the write cache
> only takes place upon a reboot or power cycle? Is that necessary? Or is
> it a "live" change?
>
> I have tested with the cache disabled as well as enabled on all drives.
> We're using fio running in a QEMU/KVM VM in an OpenStack cluster, so not
> "raw" access to the Micron 5200's. OSD (Bluestore) nodes run CentOS 7
> using a 4.18.x kernel. Testing doesn't show any, or much, difference,
> enough that the variations could be considered "noise" in the results.
> Certainly no change that anyone could tell.
>
> Thought I'd check to see if you, or anyone else, might have any
> suggestions specific to the Micron 5200.
>
> We have some Micron 5300's inbound, but probably won't have them here for
> another few weeks due to Micron's manufacturing delays, so will be able to
> test these raw drives soon. I will report back after, but if you know
> anything about these, I'm all ears. :)
>
> Thank you!
>
> Eric
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Stefan Bauer
> Sent: Tuesday, January 14, 2020 10:28 AM
> To: undisclosed-recipients
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can
> we expect more? [klartext]
>
> Thank you all,
>
> performance is indeed better now. Can now go back to sleep ;)
>
> KR
>
> Stefan
>
> -Ursprüngliche Nachricht-
> Von: Виталий Филиппов
> Gesendet: Dienstag 14 Januar 2020 10:28
> An: Wido den Hollander ; Stefan Bauer <stefan.ba...@cubewerk.de>
> CC: ceph-users@lists.ceph.com
> Betreff: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can
> we expect more? [klartext]
>
> ...disable signatures and rbd cache. I didn't mention it in the email to
> not repeat myself. But I have it in the article :-)
> --
> With best regards,
> Vitaliy Filippov
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
We are using Micron 5200 PRO, 1.92TB for RBD images on KVM and are very happy with the performance. We are using EC 6+2 pools, which really eat up IOPs. Still, we get enough performance out to run 20-50 VMs per disk, which results in good space utilisation as well since our default image size is 50GB and we take rolling snapshots. I was thinking about 4TB disks also, but am concerned that their IOPs/TB performance is too low for images on EC pools.

We found the raw throughput in fio benchmarks to be very different for write-cache enabled and disabled, exactly as explained in the performance article. Changing write cache settings is a boot-time operation. Unfortunately, I couldn't find a reliable way to disable write cache at boot time (I was looking for tuned configs) and ended up adding this to a container startup script:

if [[ "$1" == "osd_ceph_disk_activate" && -n "${OSD_DEVICE}" ]] ; then
  echo "Disabling write cache on ${OSD_DEVICE}"
  /usr/sbin/smartctl -s wcache=off "${OSD_DEVICE}"
fi

This works for both SAS and SATA drives and ensures that the write cache is disabled before an OSD daemon starts.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

From: ceph-users on behalf of Eric K. Miller
Sent: 19 January 2020 04:24:33
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

Hi Vitaliy,

Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA version) in a Ceph cluster (Nautilus) and performance seems less than optimal. I have followed all instructions on your site (thank you for your wonderful article btw!!), but I haven't seen much change.

The only thing I could think of is that "maybe" disabling the write cache only takes place upon a reboot or power cycle? Is that necessary? Or is it a "live" change?

I have tested with the cache disabled as well as enabled on all drives. We're using fio running in a QEMU/KVM VM in an OpenStack cluster, so not "raw" access to the Micron 5200's. OSD (Bluestore) nodes run CentOS 7 using a 4.18.x kernel. Testing doesn't show any, or much, difference, enough that the variations could be considered "noise" in the results. Certainly no change that anyone could tell.

Thought I'd check to see if you, or anyone else, might have any suggestions specific to the Micron 5200.

We have some Micron 5300's inbound, but probably won't have them here for another few weeks due to Micron's manufacturing delays, so will be able to test these raw drives soon. I will report back after, but if you know anything about these, I'm all ears. :)

Thank you!

Eric

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stefan Bauer
Sent: Tuesday, January 14, 2020 10:28 AM
To: undisclosed-recipients
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

Thank you all,

performance is indeed better now. Can now go back to sleep ;)

KR

Stefan

-Ursprüngliche Nachricht-
Von: Виталий Филиппов
Gesendet: Dienstag 14 Januar 2020 10:28
An: Wido den Hollander ; Stefan Bauer
CC: ceph-users@lists.ceph.com
Betreff: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email to not repeat myself. But I have it in the article :-)

--
With best regards,
Vitaliy Filippov
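(Regarding the "reliable way to disable write cache at boot time" that Frank says he could not find above: one possible, untested alternative is a udev rule that runs the same smartctl command whenever a non-rotational sd* disk appears. The rule below is purely a sketch under that assumption -- the file name and matching conditions are hypothetical, not something anyone in this thread reported using:)

# /etc/udev/rules.d/99-osd-wcache.rules  (hypothetical file name)
# Run Frank's smartctl command on every non-rotational sd* disk as it is added.
ACTION=="add|change", KERNEL=="sd*", ENV{DEVTYPE}=="disk", ATTR{queue/rotational}=="0", RUN+="/usr/sbin/smartctl -s wcache=off /dev/%k"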
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Hi Vitaliy, Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA version) in a Ceph cluster (Nautilus) and performance seems less than optimal. I have followed all instructions on your site (thank you for your wonderful article btw!!), but I haven't seen much change. The only thing I could think of is that "maybe" disabling the write cache only takes place upon a reboot or power cycle? Is that necessary? Or is it a "live" change? I have tested with the cache disabled as well as enabled on all drives. We're using fio running in a QEMU/KVM VM in an OpenStack cluster, so not "raw" access to the Micron 5200's. OSD (Bluestore) nodes run CentOS 7 using a 4.18.x kernel. Testing doesn't show any, or much, difference, enough that the variations could be considered "noise" in the results. Certainly no change that anyone could tell. Thought I'd check to see if you, or anyone else, might have any suggestions specific to the Micron 5200. We have some Micron 5300's inbound, but probably won't have them here for another few weeks due to Micron's manufacturing delays, so will be able to test these raw drives soon. I will report back after, but if you know anything about these, I'm all ears. :) Thank you! Eric From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stefan Bauer Sent: Tuesday, January 14, 2020 10:28 AM To: undisclosed-recipients Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext] Thank you all, performance is indeed better now. Can now go back to sleep ;) KR Stefan -Ursprüngliche Nachricht- Von: Виталий Филиппов Gesendet: Dienstag 14 Januar 2020 10:28 An: Wido den Hollander ; Stefan Bauer CC: ceph-users@lists.ceph.com Betreff: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext] ...disable signatures and rbd cache. I didn't mention it in the email to not repeat myself. But I have it in the article :-) -- With best regards, Vitaliy Filippov ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Thank you all,

performance is indeed better now. Can now go back to sleep ;)

KR

Stefan

-Ursprüngliche Nachricht-
Von: Виталий Филиппов
Gesendet: Dienstag 14 Januar 2020 10:28
An: Wido den Hollander ; Stefan Bauer
CC: ceph-users@lists.ceph.com
Betreff: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email to not repeat myself. But I have it in the article :-)

--
With best regards,
Vitaliy Filippov
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Yes, that's it, see the end of the article. You'll have to disable signature checks, too:

cephx_require_signatures = false
cephx_cluster_require_signatures = false
cephx_sign_messages = false

> Hi Vitaliy,
>
> thank you for your time. Do you mean cephx sign messages = false with "disable signatures"?
>
> KR
>
> Stefan
>
> -Ursprüngliche Nachricht-
> Von: Виталий Филиппов
> Gesendet: Dienstag 14 Januar 2020 10:28
> An: Wido den Hollander ; Stefan Bauer
> CC: ceph-users@lists.ceph.com
> Betreff: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
>
> ...disable signatures and rbd cache. I didn't mention it in the email to not repeat myself. But I have it in the article :-)
> --
> With best regards,
> Vitaliy Filippov
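(For readers wondering where these settings live: a sketch of a ceph.conf fragment combining the signature options above with the rbd cache setting mentioned earlier in the thread. The section placement is my assumption, not something stated in this thread.)

[global]
cephx_require_signatures = false
cephx_cluster_require_signatures = false
cephx_sign_messages = false

[client]
# rbd cache was the other item mentioned as something to disable
rbd_cache = false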
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Hi Vitaliy,

thank you for your time. Do you mean cephx sign messages = false with "disable signatures"?

KR

Stefan

-Ursprüngliche Nachricht-
Von: Виталий Филиппов
Gesendet: Dienstag 14 Januar 2020 10:28
An: Wido den Hollander ; Stefan Bauer
CC: ceph-users@lists.ceph.com
Betreff: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email to not repeat myself. But I have it in the article :-)

--
With best regards,
Vitaliy Filippov
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Hi Stefan,

thank you for your time. "temporary write through" does not seem to be a legit parameter. However, "write through" is already set:

root@proxmox61:~# echo "temporary write through" > /sys/block/sdb/device/scsi_disk/*/cache_type
root@proxmox61:~# cat /sys/block/sdb/device/scsi_disk/2\:0\:0\:0/cache_type
write through

Is that what you meant?

Thank you.

KR

Stefan

-Ursprüngliche Nachricht-
Von: Stefan Priebe - Profihost AG

this has something to do with the firmware and how the manufacturer handles syncs / flushes. Intel just ignores sync / flush commands for drives which have a capacitor. Samsung does not.

The problem is that Ceph sends a lot of flush commands which slows down drives without a capacitor. You can make Linux ignore those userspace requests with the following command:

echo "temporary write through" > /sys/block/sdX/device/scsi_disk/*/cache_type

Greets,
Stefan Priebe
Profihost AG

> Thank you.
>
> Stefan
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
...disable signatures and rbd cache. I didn't mention it in the email to not repeat myself. But I have it in the article :-)

--
With best regards,
Vitaliy Filippov
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
On 1/13/20 6:37 PM, vita...@yourcmc.ru wrote:
>> Hi,
>>
>> we're playing around with ceph but are not quite happy with the IOs.
>> on average 5000 iops / write
>> on average 13000 iops / read
>>
>> We're expecting more. :( any ideas or is that all we can expect?
>
> With server SSD you can expect up to ~1 write / ~25000 read iops per
> a single client.
>
> https://yourcmc.ru/wiki/Ceph_performance
>
>> money is NOT a problem for this test-bed, any ideas how to gain more
>> IOPS is greatly appreciated.
>
> Grab some server NVMes and best possible CPUs :)

And then:

- Disable all powersaving
- Pin the CPUs in C-State 1

That might increase performance even more. But due to the synchronous nature of Ceph, the performance and latency of a single thread will be limited.

Wido
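(As an illustration of what "disable all powersaving / pin the CPUs in C-State 1" can look like in practice -- a sketch only, one of several possible ways, and not something Wido specified:)

# Option 1: at runtime, via cpupower
cpupower frequency-set -g performance   # force the performance governor
cpupower idle-set -D 2                  # disable idle states with exit latency > 2 us (roughly: stay in C0/C1)

# Option 2: persistently, via kernel command line on Intel CPUs
#   intel_idle.max_cstate=1 processor.max_cstate=1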
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
> Hi,
>
> we're playing around with ceph but are not quite happy with the IOs.
> on average 5000 iops / write
> on average 13000 iops / read
>
> We're expecting more. :( any ideas or is that all we can expect?

With server SSD you can expect up to ~1 write / ~25000 read iops per single client.

https://yourcmc.ru/wiki/Ceph_performance

> money is NOT a problem for this test-bed, any ideas how to gain more
> IOPS is greatly appreciated.

Grab some server NVMes and best possible CPUs :)
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Hi Stefan,

Am 13.01.20 um 17:09 schrieb Stefan Bauer:
> Hi,
>
> we're playing around with ceph but are not quite happy with the IOs.
>
> 3 node ceph / proxmox cluster with each:
>
> LSI HBA 3008 controller
> 4 x MZILT960HAHQ/007 Samsung SSD
> Transport protocol: SAS (SPL-3)
> 40G fibre Intel 520 Network controller on Unifi Switch
> Ping roundtrip to partner node is 0.040ms average.
>
> fio reports on a virtual machine with
>
> --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
> --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw
> --rwmixread=75
>
> on average 5000 iops / write
> on average 13000 iops / read
>
> We're expecting more. :( any ideas or is that all we can expect?
>
> money is *not* a problem for this test-bed, any ideas how to gain more
> IOPS is greatly appreciated.

this has something to do with the firmware and how the manufacturer handles syncs / flushes. Intel just ignores sync / flush commands for drives which have a capacitor. Samsung does not.

The problem is that Ceph sends a lot of flush commands which slows down drives without a capacitor. You can make Linux ignore those userspace requests with the following command:

echo "temporary write through" > /sys/block/sdX/device/scsi_disk/*/cache_type

Greets,
Stefan Priebe
Profihost AG

> Thank you.
>
> Stefan
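(If you want to apply the cache_type command above to every data disk on a host, a rough sketch follows -- the device list sdb/sdc/sdd is only an example, and note that this sysfs setting does not survive a reboot, so it would have to be reapplied at boot or before the OSDs start:)

for dev in sdb sdc sdd; do
    echo "temporary write through" > /sys/block/$dev/device/scsi_disk/*/cache_type
done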
Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]
Do those SSD's have capacitors (aka power loss protection)? I took a look at the spec sheet on samsung's site and I don't see it mentioned. If that's the case it could certainly explain the performance you're seeing. Not all enterprise SSD's have it and it's a must have for Ceph since it syncs every write directly to disk. You may also want to look for something with a higher DWPD so you can get more life out of them. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com