Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-21 Thread Eric K. Miller
Hi Vitaliy,

The drive is a Micron 5200 ECO 3.84TB

This is from the msecli utility:

Device Name  : /dev/sde
Model No : Micron_5200_MTFDDAK3T8TDC
Serial No: 
FW-Rev   : D1MU404
Total Size   : 3840.00GB
Drive Status : Drive is in good health
Sata Link Speed  : Gen3 (6.0 Gbps)
Sata Link Max Speed  : Gen3 (6.0 Gbps)
Temp(C)  : 26

The server motherboard is: SuperMicro X10DRU-i+

Drives are connected to SATA connectors on the motherboard.

Processors are:  Xeon E5-2690v4

Eric


From: Виталий Филиппов [mailto:vita...@yourcmc.ru] 
Sent: Tuesday, January 21, 2020 3:43 PM
To: Eric K. Miller
Cc: ceph-users@lists.ceph.com
Subject: RE: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

Hi! Thanks.

The parameter gets reset when you reconnect the SSD, so in fact you must NOT 
power cycle it after changing the parameter :-)

OK, this case seems to be a lucky one; a ~2x change isn't a lot. Can you tell me the 
exact model and capacity of this Micron, and which controller was used in this 
test? I'll add it to the spreadsheet.
-- 
With best regards,
Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-21 Thread Виталий Филиппов
Hi! Thanks.

The parameter gets reset when you reconnect the SSD, so in fact you must NOT 
power cycle it after changing the parameter :-)

OK, this case seems to be a lucky one; a ~2x change isn't a lot. Can you tell me the 
exact model and capacity of this Micron, and which controller was used in this 
test? I'll add it to the spreadsheet.
-- 
With best regards,
  Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-21 Thread Eric K. Miller
We were able to isolate an individual Micron 5200 and run the tests from
Vitaliy's spreadsheet against it.

An interesting item - write cache changes do NOT require a power cycle
to take effect, at least on a Micron 5200.
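
For reference, a minimal way to toggle and verify this live (a sketch only, using 
the same device name as in the full results below; exact output wording varies by 
hdparm version):

hdparm -W 0 /dev/sde   # disable the volatile write cache
hdparm -W /dev/sde     # query; should report "write-caching = 0 (off)"
hdparm -W 1 /dev/sde   # re-enable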

The complete results from fio are included at the end of this message
for the individual tests, for both write cache enabled and disabled.

The shortened version of the results:

Journal IOPS (sync=1)

-
Write cache ON:  
  write: IOPS=19.7k, BW=76.0MiB/s (80.7MB/s)(4618MiB/60001msec)
 lat (usec): min=42, max=1273, avg=50.18, stdev= 6.40

Write cache OFF:
  write: IOPS=32.3k, BW=126MiB/s (132MB/s)(7560MiB/60001msec)
 lat (usec): min=25, max=7079, avg=30.55, stdev= 7.94


Journal IOPS (fsync=1)

-
Write cache ON:
  write: IOPS=16.9k, BW=66.2MiB/s (69.4MB/s)(3971MiB/60001msec)
 lat (usec): min=24, max=5068, avg=31.77, stdev= 7.82

Write cache OFF:
  write: IOPS=32.1k, BW=126MiB/s (132MB/s)(7533MiB/60001msec)
 lat (usec): min=24, max=7076, avg=29.41, stdev= 7.52


Parallel random (sync)

-
Write cache ON:
  write: IOPS=43.9k, BW=172MiB/s (180MB/s)(10.1GiB/60001msec)
 lat (usec): min=220, max=14767, avg=727.61, stdev=313.36

Write cache OFF:
  write: IOPS=44.3k, BW=173MiB/s (181MB/s)(10.1GiB/60001msec)
 lat (usec): min=134, max=4941, avg=721.96, stdev=311.46


Parallel random (fsync)

-
Write cache ON:
  write: IOPS=44.4k, BW=173MiB/s (182MB/s)(10.2GiB/60001msec)
 lat (usec): min=109, max=4349, avg=703.01, stdev=303.69

Write cache OFF:
  write: IOPS=44.6k, BW=174MiB/s (183MB/s)(10.2GiB/60001msec)
 lat (usec): min=26, max=7288, avg=716.32, stdev=300.48


Non-txn random

-
Write cache ON:
  write: IOPS=43.1k, BW=168MiB/s (177MB/s)(9.87GiB/60004msec)
 lat (usec): min=350, max=41703, avg=2967.89, stdev=1682.28

Write cache OFF:
  write: IOPS=43.4k, BW=170MiB/s (178MB/s)(9.93GiB/60004msec)
 lat (usec): min=177, max=42795, avg=2947.52, stdev=1666.24


Linear write

-
Write cache ON:
  write: IOPS=126, BW=505MiB/s (530MB/s)(29.6GiB/60027msec)
 lat (msec): min=226, max=281, avg=253.26, stdev= 3.51

Write cache OFF:
  write: IOPS=126, BW=507MiB/s (531MB/s)(29.8GiB/60254msec)
 lat (msec): min=7, max=492, avg=252.52, stdev=13.16


So, while some improvement can be seen with the write cache disabled
(specifically on a Micron 5200), it is likely not enough to change much in
terms of Ceph's performance unless journal latency, IOPS, and bandwidth are
a bottleneck.

The "Journal IOPS (sync=1)" test shows the most dramatic difference,
where disabling the write cache reduces the I/O latency by 39% (a
reduction from 50.18us to 30.55us with a difference of 0.02ms) which
respectively raises the IOPS and throughput of synchronous I/O.
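
For reference, the arithmetic behind those figures, using the fio averages quoted
above: (50.18 - 30.55) / 50.18 ≈ 0.39, i.e. roughly 39% lower average latency, and
1 / 30.55us ≈ 32.7k ops/s, which lines up with the measured 32.3k IOPS at
iodepth=1.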

The "Journal IOPS (fsync=1)" test also shows a dramatic difference, but
in terms of IOPS and throughput (approximately +90%), not latency.
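
For anyone reproducing this: the fsync variant presumably uses the same fio
invocation as the sync=1 test shown in the full results below, with -fsync=1 in
place of -sync=1 (a sketch, not copied from the original run):

fio --ioengine=libaio -fsync=1 --direct=1 --name=test --bs=4k --iodepth=1 \
    --readwrite=write --runtime 60 --filename=/dev/sde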

Hope this helps!  I would love to hear feedback.

Eric



###
# Journal IOPS (sync=1)
###

# Write cache ENABLED
hdparm -W 1 /dev/sde

fio --ioengine=libaio -sync=1 --direct=1 --name=test --bs=4k --iodepth=1 \
    --readwrite=write --runtime 60 --filename=/dev/sde

test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=75.6MiB/s][r=0,w=19.3k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=38269: Tue Jan 21 14:20:01 2020
  write: IOPS=19.7k, BW=76.0MiB/s (80.7MB/s)(4618MiB/60001msec)
slat (usec): min=2, max=180, avg= 4.43, stdev= 1.86
clat (nsec): min=1950, max=1262.3k, avg=45662.55, stdev=5778.88
 lat (usec): min=42, max=1273, avg=50.18, stdev= 6.40
clat percentiles (usec):
 |  1.00th=[   42],  5.00th=[   42], 10.00th=[   43], 20.00th=[   43],
 | 30.00th=[   43], 40.00th=[   44], 50.00th=[   44], 60.00th=[   45],
 | 70.00th=[   47], 80.00th=[   48], 90.00th=[   51], 95.00th=[   55],
 | 99.00th=[   66], 99.50th=[   74], 99.90th=[   91], 99.95th=[  104],
 | 99.99th=[  167]
   bw (  KiB/s): min=70152, max=81704, per=100.00%, avg=78835.97, stdev=2929.71, samples=119
   iops        : min=17538, max=20426, avg=19708.98, stdev=732.40, samples=119
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=88.53%
  lat (usec)   : 100=11.41%, 250=0.06%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%
  cpu  : usr=3.07%, sys=13.62%, ctx=1182324, majf=0, minf=27
  IO depths

Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-21 Thread Frank Schilder
OK, now I understand. Yes, the cache setting will take effect immediately. It's 
more about whether you trust the disk firmware to apply the change correctly in all 
situations while production IO is active at the same time (will the volatile cache 
be flushed correctly or not)? I would not, and would rather change the setting while 
the OSD is down.

During benchmarks on raw disks I just switched the cache on and off as needed. 
There was nothing running on the disks, and the fio benchmark is destructive 
anyway.

Best regards,

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Sasha Litvak 
Sent: 21 January 2020 10:19
To: Frank Schilder
Cc: ceph-users
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

Frank,

Sorry for the confusion.  I thought that turning off the cache using hdparm -W 0 
/dev/sdx takes effect right away, and that in the case of non-RAID controllers and 
Seagate or Micron SSDs I would see a difference when starting a fio benchmark right 
after executing hdparm.  So I wonder whether it makes a difference if the cache is 
turned off before the OSD is started or after.



On Tue, Jan 21, 2020, 2:07 AM Frank Schilder 
mailto:fr...@dtu.dk>> wrote:
> So hdparm -W 0 /dev/sdx doesn't work or it makes no difference?

I wrote "We found the raw throughput in fio benchmarks to be very different for 
write-cache enabled and disabled, exactly as explained in the performance 
article.", so yes, it makes a huge difference.

> Also I am not sure I understand why it should happen before the OSD has been 
> started.
> At least in my experience hdparm does it to the hardware regardless.

I'm not sure I understand this question. Ideally it happens at boot time and, if 
this doesn't work, at least at some point before the OSD is started. Why and how 
else would one want this to happen?

Best regards,

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-21 Thread Sasha Litvak
Frank,

Sorry for the confusion.  I thought that turning off the cache using hdparm -W
0 /dev/sdx takes effect right away, and that in the case of non-RAID controllers and
Seagate or Micron SSDs I would see a difference when starting a fio benchmark
right after executing hdparm.  So I wonder whether it makes a difference if the
cache is turned off before the OSD is started or after.



On Tue, Jan 21, 2020, 2:07 AM Frank Schilder  wrote:

> > So hdparm -W 0 /dev/sdx doesn't work or it makes no difference?
>
> I wrote "We found the raw throughput in fio benchmarks to be very
> different for write-cache enabled and disabled, exactly as explained in the
> performance article.", so yes, it makes a huge difference.
>
> > Also I am not sure I understand why it should happen before the OSD has
> been started.
> > At least in my experience hdparm does it to the hardware regardless.
>
> I'm not sure I understand this question. Ideally it happens at boot time
> and, if this doesn't work, at least at some point before the OSD is started. Why
> and how else would one want this to happen?
>
> Best regards,
>
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-21 Thread Frank Schilder
> So hdparm -W 0 /dev/sdx doesn't work or it makes no difference?

I wrote "We found the raw throughput in fio benchmarks to be very different for 
write-cache enabled and disabled, exactly as explained in the performance 
article.", so yes, it makes a huge difference.

> Also I am not sure I understand why it should happen before the OSD has been 
> started. 
> At least in my experience hdparm does it to the hardware regardless.

I'm not sure I understand this question. Ideally it happens at boot time and, if 
this doesn't work, at least at some point before the OSD is started. Why and how 
else would one want this to happen?

Best regards,

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-20 Thread Eric K. Miller
Hi Vitaliy,

> You say you don't have access to the raw drives. What does that mean? Do you
> run Ceph OSDs inside VMs? In that case you should probably disable the
> Micron caches on the hosts, not just in the VMs.

Sorry, I should have been more clear.  This cluster is in production, so I 
needed to schedule a maintenance window for some tests, including "out"ing an 
OSD and removing it from Ceph so I can test it directly, as well as 
maintenance to remove a host from the cluster so power off/on tests can be 
performed.  Right now, all I have access to is the VM level and the ability to 
enable/disable the write cache on the 5200's using hdparm (but no read/write 
tests directly on the 5200's, of course, which would require destructive 
operations).


> Yes, disabling the write cache only takes place upon a power cycle... or
> upon the next hotplug of the drive itself.

I have a suspicion this is the reason we didn't see any change! :)  Definitely 
an important item.  Once I have test results, I will report back.  This may be 
something you want to add to your wiki article.


> If you get the same ~15k or more iops with -rw=randwrite -fsync=1
> -iodepth=1 with both hdparm -W 0 and -W 1 you're good :) if you have
> cache problems you'll get much less.

Once I have a 5200 available to play with, I will definitely let you know the 
results.


> About Micron 5300's, please benchmark them when you have them as
> described here
> https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-
> 0u0r5fAjjufLKayaut_FOPxYZjc/edit
> (instructions in the end of the sheet)

Most definitely.  I suspect it will be another month before we get them 
unfortunately. :(

Eric


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-20 Thread vitalif

Hi Eric,

You say you don't have access to the raw drives. What does that mean? Do you 
run Ceph OSDs inside VMs? In that case you should probably disable the 
Micron caches on the hosts, not just in the VMs.


Yes, disabling the write cache only takes place upon a power cycle... or 
upon the next hotplug of the drive itself.


In some cases - I mean, with some HBAs/RAID controllers - disabling the 
write cache may not have any impact on performance. As I understand it, this 
is because some controllers disable the drive write cache themselves by 
default.


Just benchmark your drives with fio and compare IOPS with 
https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit


If you get the same ~15k or more iops with -rw=randwrite -fsync=1 
-iodepth=1 with both hdparm -W 0 and -W 1, you're good :) If you have 
cache problems you'll get much less.
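
A minimal sketch of that comparison, assuming direct access to the raw drive (it
is destructive, so only run it on an empty disk):

hdparm -W 0 /dev/sdX
fio --ioengine=libaio --direct=1 --name=test --bs=4k --iodepth=1 \
    --rw=randwrite --fsync=1 --runtime=60 --filename=/dev/sdX
hdparm -W 1 /dev/sdX   # then repeat the same fio run and compare the IOPS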


About Micron 5300's, please benchmark them when you have them as 
described here 
https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-0u0r5fAjjufLKayaut_FOPxYZjc/edit 
(instructions in the end of the sheet)



Hi Vitaliy,

Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA
version) in a Ceph cluster (Nautilus) and performance seems less than
optimal.  I have followed all instructions on your site (thank you for
your wonderful article btw!!), but I haven't seen much change.

The only thing I could think of is that "maybe" disabling the write
cache only takes place upon a reboot or power cycle?  Is that
necessary?  Or is it a "live" change?

I have tested with the cache disabled as well as enabled on all
drives.  We're using fio running in a QEMU/KVM VM in an OpenStack
cluster, so not "raw" access to the Micron 5200's.  OSD (Bluestore)
nodes run CentOS 7 using a 4.18.x kernel.  Testing doesn't show any,
or much, difference, enough that the variations could be considered
"noise" in the results.  Certainly no change that anyone could tell.

Thought I'd check to see if you, or anyone else, might have any
suggestions specific to the Micron 5200.

We have some Micron 5300's inbound, but probably won't have them here
for another few weeks due to Micron's manufacturing delays, so will be
able to test these raw drives soon.  I will report back after, but if
you know anything about these, I'm all ears. :)

Thank you!

Eric

FROM: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] ON BEHALF
OF Stefan Bauer
SENT: Tuesday, January 14, 2020 10:28 AM
TO: undisclosed-recipients
CC: ceph-users@lists.ceph.com
SUBJECT: Re: [ceph-users] low io with enterprise SSDs ceph luminous -
can we expect more? [klartext]

Thank you all,

performance is indeed better now. Can now go back to sleep ;)

KR

Stefan


-Original Message-
FROM: Виталий Филиппов 
SENT: Tuesday, 14 January 2020 10:28
TO: Wido den Hollander ; Stefan Bauer

CC: ceph-users@lists.ceph.com
SUBJECT: Re: [ceph-users] low io with enterprise SSDs ceph luminous
- can we expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the
email to not repeat myself. But I have it in the article :-)
--
With best regards,
Vitaliy Filippov

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-20 Thread Sasha Litvak
So hdparm -W 0 /dev/sdx doesn't work or it makes no difference?  Also I am
not sure I understand why it should happen before the OSD has been started.
At least in my experience hdparm does it to the hardware regardless.

On Mon, Jan 20, 2020, 2:25 AM Frank Schilder  wrote:

> We are using Micron 5200 PRO, 1.92TB for RBD images on KVM and are very
> happy with the performance. We are using EC 6+2 pools, which really eat up
> IOPs. Still, we get enough performance out to run 20-50 VMs per disk, which
> results in good space utilisation as well since our default image size is
> 50GB and we take rolling snapshots. I was thinking about 4TB disks also,
> but am concerned that their IOPs/TB performance is too low for images on EC
> pools.
>
> We found the raw throughput in fio benchmarks to be very different for
> write-cache enabled and disabled, exactly as explained in the performance
> article. Changing write cache settings is a boot-time operation.
> Unfortunately, I couldn't find a reliable way to disable write cache at
> boot time (I was looking for tuned configs) and ended up adding this to a
> container startup script:
>
>   if [[ "$1" == "osd_ceph_disk_activate" && -n "${OSD_DEVICE}" ]] ; then
> echo "Disabling write cache on ${OSD_DEVICE}"
> /usr/sbin/smartctl -s wcache=off "${OSD_DEVICE}"
>   fi
>
> This works for both SAS and SATA drives and ensures that the write cache is
> disabled before an OSD daemon starts.
>
> Best regards,
>
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________
> From: ceph-users  on behalf of Eric K.
> Miller 
> Sent: 19 January 2020 04:24:33
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can
> we expect more? [klartext]
>
> Hi Vitaliy,
>
> Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA
> version) in a Ceph cluster (Nautilus) and performance seems less than
> optimal.  I have followed all instructions on your site (thank you for your
> wonderful article btw!!), but I haven't seen much change.
>
> The only thing I could think of is that "maybe" disabling the write cache
> only takes place upon a reboot or power cycle?  Is that necessary?  Or is
> it a "live" change?
>
> I have tested with the cache disabled as well as enabled on all drives.
> We're using fio running in a QEMU/KVM VM in an OpenStack cluster, so not
> "raw" access to the Micron 5200's.  OSD (Bluestore) nodes run CentOS 7
> using a 4.18.x kernel.  Testing doesn't show any, or much, difference,
> enough that the variations could be considered "noise" in the results.
> Certainly no change that anyone could tell.
>
> Thought I'd check to see if you, or anyone else, might have any
> suggestions specific to the Micron 5200.
>
> We have some Micron 5300's inbound, but probably won't have them here for
> another few weeks due to Micron's manufacturing delays, so will be able to
> test these raw drives soon.  I will report back after, but if you know
> anything about these, I'm all ears. :)
>
> Thank you!
>
> Eric
>
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Stefan Bauer
> Sent: Tuesday, January 14, 2020 10:28 AM
> To: undisclosed-recipients
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can
> we expect more? [klartext]
>
>
> Thank you all,
>
>
>
> performance is indeed better now. Can now go back to sleep ;)
>
>
>
> KR
>
>
>
> Stefan
>
>
> -Original Message-
> From: Виталий Филиппов 
> Sent: Tuesday, 14 January 2020 10:28
> To: Wido den Hollander ; Stefan Bauer <
> stefan.ba...@cubewerk.de>
> CC: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can
> we expect more? [klartext]
>
> ...disable signatures and rbd cache. I didn't mention it in the email to
> not repeat myself. But I have it in the article :-)
> --
> With best regards,
> Vitaliy Filippov
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-20 Thread Frank Schilder
We are using Micron 5200 PRO, 1.92TB for RBD images on KVM and are very happy 
with the performance. We are using EC 6+2 pools, which really eat up IOPs. 
Still, we get enough performance out to run 20-50 VMs per disk, which results 
in good space utilisation as well since our default image size is 50GB and we 
take rolling snapshots. I was thinking about 4TB disks also, but am concerned 
that their IOPs/TB performance is too low for images on EC pools.

We found the raw throughput in fio benchmarks to be very different for 
write-cache enabled and disabled, exactly as explained in the performance 
article. Changing write cache settings is a boot-time operation. Unfortunately, 
I couldn't find a reliable way to disable write cache at boot time (I was 
looking for tuned configs) and ended up adding this to a container startup 
script:

  if [[ "$1" == "osd_ceph_disk_activate" && -n "${OSD_DEVICE}" ]] ; then
echo "Disabling write cache on ${OSD_DEVICE}"
/usr/sbin/smartctl -s wcache=off "${OSD_DEVICE}"
  fi

This works for both SAS and SATA drives and ensures that the write cache is 
disabled before an OSD daemon starts.
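
For what it's worth, a udev rule might also work at boot time; this is only a
sketch (not something I have tested), it covers SATA via hdparm only, and the
match would need narrowing to the intended OSD devices:

# /etc/udev/rules.d/99-osd-write-cache.rules  (file name is just an example)
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", RUN+="/usr/sbin/hdparm -W 0 /dev/%k"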

Best regards,

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: ceph-users  on behalf of Eric K. 
Miller 
Sent: 19 January 2020 04:24:33
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

Hi Vitaliy,

Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA version) 
in a Ceph cluster (Nautilus) and performance seems less than optimal.  I have 
followed all instructions on your site (thank you for your wonderful article 
btw!!), but I haven't seen much change.

The only thing I could think of is that "maybe" disabling the write cache only 
takes place upon a reboot or power cycle?  Is that necessary?  Or is it a 
"live" change?

I have tested with the cache disabled as well as enabled on all drives.  We're 
using fio running in a QEMU/KVM VM in an OpenStack cluster, so not "raw" access 
to the Micron 5200's.  OSD (Bluestore) nodes run CentOS 7 using a 4.18.x 
kernel.  Testing doesn't show any, or much, difference, enough that the 
variations could be considered "noise" in the results.  Certainly no change 
that anyone could tell.

Thought I'd check to see if you, or anyone else, might have any suggestions 
specific to the Micron 5200.

We have some Micron 5300's inbound, but probably won't have them here for 
another few weeks due to Micron's manufacturing delays, so will be able to test 
these raw drives soon.  I will report back after, but if you know anything 
about these, I'm all ears. :)

Thank you!

Eric


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stefan 
Bauer
Sent: Tuesday, January 14, 2020 10:28 AM
To: undisclosed-recipients
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]


Thank you all,



performance is indeed better now. Can now go back to sleep ;)



KR



Stefan


-Original Message-
From: Виталий Филиппов 
Sent: Tuesday, 14 January 2020 10:28
To: Wido den Hollander ; Stefan Bauer 
CC: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email to not 
repeat myself. But I have it in the article :-)
--
With best regards,
Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-18 Thread Eric K. Miller
Hi Vitaliy,

 

Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA version) 
in a Ceph cluster (Nautilus) and performance seems less than optimal.  I have 
followed all instructions on your site (thank you for your wonderful article 
btw!!), but I haven't seen much change.

 

The only thing I could think of is that "maybe" disabling the write cache only 
takes place upon a reboot or power cycle?  Is that necessary?  Or is it a 
"live" change?

 

I have tested with the cache disabled as well as enabled on all drives.  We're 
using fio running in a QEMU/KVM VM in an OpenStack cluster, so not "raw" access 
to the Micron 5200's.  OSD (Bluestore) nodes run CentOS 7 using a 4.18.x 
kernel.  Testing doesn't show any, or much, difference, enough that the 
variations could be considered "noise" in the results.  Certainly no change 
that anyone could tell.

 

Thought I'd check to see if you, or anyone else, might have any suggestions 
specific to the Micron 5200.

 

We have some Micron 5300's inbound, but probably won't have them here for 
another few weeks due to Micron's manufacturing delays, so will be able to test 
these raw drives soon.  I will report back after, but if you know anything 
about these, I'm all ears. :)

 

Thank you!

 

Eric

 

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stefan 
Bauer
Sent: Tuesday, January 14, 2020 10:28 AM
To: undisclosed-recipients
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

 

Thank you all,

 

performance is indeed better now. Can now go back to sleep ;)

 

KR

 

Stefan

 

-Original Message-
From: Виталий Филиппов 
Sent: Tuesday, 14 January 2020 10:28
To: Wido den Hollander ; Stefan Bauer 

CC: ceph-users@lists.ceph.com
    Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - 
can we expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email 
to not repeat myself. But I have it in the article :-)
-- 
With best regards,
Vitaliy Filippov 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-14 Thread Stefan Bauer
Thank you all,



performance is indeed better now. Can now go back to sleep ;)



KR



Stefan



-Original Message-
From: Виталий Филиппов 
Sent: Tuesday, 14 January 2020 10:28
To: Wido den Hollander ; Stefan Bauer 
CC: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email to not 
repeat myself. But I have it in the article :-)
--
With best regards,
Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-14 Thread vitalif
Yes, that's it, see the end of the article. You'll have to disable 
signature checks, too.


cephx_require_signatures = false
cephx_cluster_require_signatures = false
cephx_sign_messages = false
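
For context, these typically go in the [global] section of ceph.conf, together
with disabling the client-side rbd cache mentioned earlier. A sketch (check the
option names against your Ceph release):

[global]
cephx_require_signatures = false
cephx_cluster_require_signatures = false
cephx_sign_messages = false

[client]
rbd_cache = false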


Hi Vitaliy,

thank you for your time. Do you mean

cephx sign messages = false

with "diable signatures" ?

KR

Stefan


-Original Message-
FROM: Виталий Филиппов 
SENT: Tuesday, 14 January 2020 10:28
TO: Wido den Hollander ; Stefan Bauer

CC: ceph-users@lists.ceph.com
SUBJECT: Re: [ceph-users] low io with enterprise SSDs ceph luminous
- can we expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the
email to not repeat myself. But I have it in the article :-)
--
With best regards,
Vitaliy Filippov

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-14 Thread Stefan Bauer
Hi Vitaliy,



thank you for your time. Do you mean



cephx sign messages = false

with "diable signatures" ?



KR

Stefan





-Original Message-
From: Виталий Филиппов 
Sent: Tuesday, 14 January 2020 10:28
To: Wido den Hollander ; Stefan Bauer 
CC: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email to not 
repeat myself. But I have it in the article :-)
--
With best regards,
Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-14 Thread Stefan Bauer
Hi Stefan,



thank you for your time.



"temporary write through" does not seem to be a legit parameter.



However, "write through" is already set:



root@proxmox61:~# echo "temporary write through" > /sys/block/sdb/device/scsi_disk/*/cache_type
root@proxmox61:~# cat /sys/block/sdb/device/scsi_disk/2\:0\:0\:0/cache_type
write through



Is that what you meant?



Thank you.



KR



Stefan



-Original Message-
From: Stefan Priebe - Profihost AG 
 
this has something to do with the firmware and how the manufacturer
handles syncs / flushes.

Intel just ignores sync / flush commands for drives which have a
capacitor. Samsung does not.

The problem is that Ceph sends a lot of flush commands which slows down
drives without capacitor.

You can make Linux ignore those userspace requests with the following
command:
echo "temporary write through" >
/sys/block/sdX/device/scsi_disk/*/cache_type

Greets,
Stefan Priebe
Profihost AG


> Thank you.
>
>
> Stefan
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-14 Thread Виталий Филиппов
...disable signatures and rbd cache. I didn't mention it in the email to not 
repeat myself. But I have it in the article :-)
-- 
With best regards,
  Vitaliy Filippov
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-14 Thread Wido den Hollander



On 1/13/20 6:37 PM, vita...@yourcmc.ru wrote:
>> Hi,
>>
>> we're playing around with ceph but are not quite happy with the IOs.
>> on average 5000 iops / write
>> on average 13000 iops / read
>>
>> We're expecting more. :( any ideas or is that all we can expect?
> 
> With server SSD you can expect up to ~1 write / ~25000 read iops per
> a single client.
> 
> https://yourcmc.ru/wiki/Ceph_performance
> 
>> money is NOT a problem for this test-bed, any ideas howto gain more
>> IOS is greatly appreciated.
> 
> Grab some server NVMes and best possible CPUs :)

And then:

- Disable all powersaving
- Pin the CPUs in C-State 1

That might increase performance even more. But due to the
synchronous nature of Ceph, the performance and latency of a single
thread will be limited.
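
A sketch of what that usually means in practice; these are generic knobs (not
something specific to this thread), so adjust for your platform:

cpupower frequency-set -g performance    # pin the frequency governor
# keep cores out of deep C-states via the kernel command line, e.g.:
#   intel_idle.max_cstate=1 processor.max_cstate=1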

Wido

> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-13 Thread vitalif

Hi,

we're playing around with ceph but are not quite happy with the IOs.
on average 5000 iops / write
on average 13000 iops / read

We're expecting more. :( any ideas or is that all we can expect?


With server SSD you can expect up to ~1 write / ~25000 read iops per 
a single client.


https://yourcmc.ru/wiki/Ceph_performance


money is NOT a problem for this test-bed, any ideas howto gain more
IOS is greatly appreciated.


Grab some server NVMes and best possible CPUs :)
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-13 Thread Stefan Priebe - Profihost AG
Hi Stefan,

On 13.01.20 at 17:09, Stefan Bauer wrote:
> Hi,
> 
> 
> we're playing around with ceph but are not quite happy with the IOs.
> 
> 
> 3 node ceph / proxmox cluster with each:
> 
> 
> LSI HBA 3008 controller
> 
> 4 x MZILT960HAHQ/007 Samsung SSD
> 
> Transport protocol:   SAS (SPL-3)
> 
> 40G fibre Intel 520 Network controller on Unifi Switch
> 
> Ping roundtrip to partner node is 0.040ms average.
> 
> 
> Transport protocol:   SAS (SPL-3)
> 
> 
> fio reports on a virtual machine with
> 
> 
> --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test
> --filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw
> --rwmixread=75
> 
> 
> on average 5000 iops / write
> 
> on average 13000 iops / read
> 
> 
> 
> We're expecting more. :( any ideas or is that all we can expect?
> 
> 
> money is *not* a problem for this test-bed, any ideas howto gain more
> IOS is greatly appreciated.

this has something to do with the firmware and how the manufacturer
handles syncs / flushes.

Intel just ignores sync / flush commands for drives which have a
capacitor. Samsung does not.

The problem is that Ceph sends a lot of flush commands which slows down
drives without capacitor.

You can make Linux ignore those userspace requests with the following
command:
echo "temporary write through" >
/sys/block/sdX/device/scsi_disk/*/cache_type
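
If the glob matches more than one path the shell may refuse the redirect, so a
per-device loop is safer. A sketch; as I understand it the "temporary " prefix
needs a kernel whose sd driver supports it, only changes the kernel's cached
setting (no mode page is written to the drive), and does not survive a reboot:

for f in /sys/block/sdX/device/scsi_disk/*/cache_type; do
    echo "temporary write through" > "$f"
done
cat /sys/block/sdX/device/scsi_disk/*/cache_type   # verify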

Greets,
Stefan Priebe
Profihost AG


> Thank you.
> 
> 
> Stefan
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-13 Thread John Petrini
Do those SSDs have capacitors (aka power loss protection)? I took a
look at the spec sheet on Samsung's site and I don't see it mentioned.
If that's the case, it could certainly explain the performance you're
seeing. Not all enterprise SSDs have it, and it's a must-have for Ceph
since Ceph syncs every write directly to disk.

You may also want to look for something with a higher DWPD so you can
get more life out of them.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com