Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-21 Thread Eric K. Miller
Hi Vitaliy,

The drive is a Micron 5200 ECO 3.84TB

This is from the msecli utility:

Device Name  : /dev/sde
Model No : Micron_5200_MTFDDAK3T8TDC
Serial No: 
FW-Rev   : D1MU404
Total Size   : 3840.00GB
Drive Status : Drive is in good health
Sata Link Speed  : Gen3 (6.0 Gbps)
Sata Link Max Speed  : Gen3 (6.0 Gbps)
Temp(C)  : 26

The server motherboard is: SuperMicro X10DRU-i+

Drives are connected to SATA connectors on the motherboard.

Processors are:  Xeon E5-2690v4

Eric


From: Виталий Филиппов [mailto:vita...@yourcmc.ru] 
Sent: Tuesday, January 21, 2020 3:43 PM
To: Eric K. Miller
Cc: ceph-users@lists.ceph.com
Subject: RE: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

Hi! Thanks.

The parameter gets reset when you reconnect (hotplug) the SSD, so in fact you must 
NOT power cycle the drive after changing the parameter :-)
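
For example, something along these lines applies the setting live and re-applies it 
automatically if the drive is ever reconnected (a sketch only; the device name, the 
model match and the hdparm path are examples that may need adjusting):

  # Disable the volatile write cache "live" (no power cycle needed):
  hdparm -W 0 /dev/sde

  # Check the current state:
  hdparm -W /dev/sde

  # The setting is lost on reconnect or power cycle, so one way to re-apply it
  # automatically is a udev rule, e.g. /etc/udev/rules.d/99-micron-wcache.rules:
  ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd?", ATTRS{model}=="Micron_5200*", RUN+="/usr/sbin/hdparm -W 0 /dev/%k"

  # then reload the rules:
  udevadm control --reload-rules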

OK, this seems to be a lucky case; a ~2x change isn't a lot. Can you tell me the exact 
model and capacity of this Micron, and which controller was used in this test? I'll 
add it to the spreadsheet.
-- 
With best regards,
Vitaliy Filippov


Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-21 Thread Eric K. Miller
We were able to isolate an individual Micron 5200 and run the tests from Vitaliy's
spreadsheet against it.

An interesting item - write cache changes do NOT require a power cycle
to take effect, at least on a Micron 5200.

The complete results from fio for the individual tests are included at the end of
this message, for both write cache enabled and disabled.
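
The comparison itself was simply a matter of toggling the cache with hdparm between 
fio runs; roughly like this (a sketch rather than the exact script, shown here with 
the "Journal IOPS (sync=1)" job):

  for wc in 1 0; do
      # 1 = write cache enabled, 0 = disabled; takes effect immediately
      hdparm -W $wc /dev/sde
      fio --ioengine=libaio --sync=1 --direct=1 --name=journal_wc$wc --bs=4k \
          --iodepth=1 --readwrite=write --runtime=60 --filename=/dev/sde
  done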

The shortened version of the results:

Journal IOPS (sync=1)

-
Write cache ON:  
  write: IOPS=19.7k, BW=76.0MiB/s (80.7MB/s)(4618MiB/60001msec)
 lat (usec): min=42, max=1273, avg=50.18, stdev= 6.40

Write cache OFF:
  write: IOPS=32.3k, BW=126MiB/s (132MB/s)(7560MiB/60001msec)
 lat (usec): min=25, max=7079, avg=30.55, stdev= 7.94


Journal IOPS (fsync=1)

-
Write cache ON:
  write: IOPS=16.9k, BW=66.2MiB/s (69.4MB/s)(3971MiB/60001msec)
 lat (usec): min=24, max=5068, avg=31.77, stdev= 7.82

Write cache OFF:
  write: IOPS=32.1k, BW=126MiB/s (132MB/s)(7533MiB/60001msec)
 lat (usec): min=24, max=7076, avg=29.41, stdev= 7.52


Parallel random (sync)

-
Write cache ON:
  write: IOPS=43.9k, BW=172MiB/s (180MB/s)(10.1GiB/60001msec)
 lat (usec): min=220, max=14767, avg=727.61, stdev=313.36

Write cache OFF:
  write: IOPS=44.3k, BW=173MiB/s (181MB/s)(10.1GiB/60001msec)
 lat (usec): min=134, max=4941, avg=721.96, stdev=311.46


Parallel random (fsync)

-
Write cache ON:
  write: IOPS=44.4k, BW=173MiB/s (182MB/s)(10.2GiB/60001msec)
 lat (usec): min=109, max=4349, avg=703.01, stdev=303.69

Write cache OFF:
  write: IOPS=44.6k, BW=174MiB/s (183MB/s)(10.2GiB/60001msec)
 lat (usec): min=26, max=7288, avg=716.32, stdev=300.48


Non-txn random

-
Write cache ON:
  write: IOPS=43.1k, BW=168MiB/s (177MB/s)(9.87GiB/60004msec)
 lat (usec): min=350, max=41703, avg=2967.89, stdev=1682.28

Write cache OFF:
  write: IOPS=43.4k, BW=170MiB/s (178MB/s)(9.93GiB/60004msec)
 lat (usec): min=177, max=42795, avg=2947.52, stdev=1666.24


Linear write

-
Write cache ON:
  write: IOPS=126, BW=505MiB/s (530MB/s)(29.6GiB/60027msec)
 lat (msec): min=226, max=281, avg=253.26, stdev= 3.51

Write cache OFF:
  write: IOPS=126, BW=507MiB/s (531MB/s)(29.8GiB/60254msec)
 lat (msec): min=7, max=492, avg=252.52, stdev=13.16


So, while some improvement can be seen with the write cache disabled (specifically
on a Micron 5200), it is likely not enough to change Ceph's performance much unless
journal latency, IOPS, and bandwidth are a bottleneck.

The "Journal IOPS (sync=1)" test shows the most dramatic difference,
where disabling the write cache reduces the I/O latency by 39% (a
reduction from 50.18us to 30.55us with a difference of 0.02ms) which
respectively raises the IOPS and throughput of synchronous I/O.
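
As a sanity check (my own arithmetic, not fio output): at iodepth=1 the IOPS is
roughly the inverse of the per-operation latency, which lines up with the measured
numbers:

  1 s / 50.18 us ~= 19.9k IOPS  (cache on;  fio reported 19.7k)
  1 s / 30.55 us ~= 32.7k IOPS  (cache off; fio reported 32.3k)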

The "Journal IOPS (fsync=1)" test also shows a dramatic difference, but
in terms of IOPS and throughput (approximately +90%), not latency.
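
For reference, the fsync variant is essentially the same journal job with the sync
flag replaced by fsync, along these lines (a sketch, not a copy of the exact command):

  fio --ioengine=libaio --fsync=1 --direct=1 --name=test --bs=4k --iodepth=1 \
      --readwrite=write --runtime=60 --filename=/dev/sde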

Hope this helps!  I would love to hear feedback.

Eric



###
# Journal IOPS (sync=1)
###

# Write cache ENABLED
hdparm -W 1 /dev/sde

fio --ioengine=libaio -sync=1 --direct=1 --name=test --bs=4k --iodepth=1 \
    --readwrite=write --runtime 60 --filename=/dev/sde

test: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=75.6MiB/s][r=0,w=19.3k IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=38269: Tue Jan 21 14:20:01 2020
  write: IOPS=19.7k, BW=76.0MiB/s (80.7MB/s)(4618MiB/60001msec)
slat (usec): min=2, max=180, avg= 4.43, stdev= 1.86
clat (nsec): min=1950, max=1262.3k, avg=45662.55, stdev=5778.88
 lat (usec): min=42, max=1273, avg=50.18, stdev= 6.40
clat percentiles (usec):
 |  1.00th=[   42],  5.00th=[   42], 10.00th=[   43], 20.00th=[   43],
 | 30.00th=[   43], 40.00th=[   44], 50.00th=[   44], 60.00th=[   45],
 | 70.00th=[   47], 80.00th=[   48], 90.00th=[   51], 95.00th=[   55],
 | 99.00th=[   66], 99.50th=[   74], 99.90th=[   91], 99.95th=[  104],
 | 99.99th=[  167]
   bw (  KiB/s): min=70152, max=81704, per=100.00%, avg=78835.97, stdev=2929.71, samples=119
   iops: min=17538, max=20426, avg=19708.98, stdev=732.40, samples=119
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=88.53%
  lat (usec)   : 100=11.41%, 250=0.06%, 500=0.01%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%
  cpu  : usr=3.07%, sys=13.62%, ctx=1182324, majf=0, minf=27
  IO depths:

Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-20 Thread Eric K. Miller
Hi Vitaliy,

> You say you don't have access to raw drives. What does it mean? Do you
> run Ceph OSDs inside VMs? In that case you should probably disable
> Micron caches on the hosts, not just in VMs.

Sorry, I should have been more clear.  This cluster is in production, so I needed to 
schedule a maintenance window to do some tests, including "out"ing an OSD and removing 
it from Ceph so I can test the raw drive, as well as removing a host from the cluster 
so power off/on tests can be performed.  Right now, all I have access to is the VM 
level and the ability to enable/disable the write cache on the 5200's using hdparm 
(but no read/write tests directly on the 5200's, of course, since those would require 
destructive operations).


> Yes, disabling the write cache only takes place upon a power cycle... or
> upon the next hotplug of the drive itself.

I have a suspicion this is the reason we didn't see any change! :)  Definitely 
an important item.  Once I have test results, I will report back.  This may be 
something you want to add to your wiki article.


> If you get the same ~15k or more iops with -rw=randwrite -fsync=1
> -iodepth=1 with both hdparm -W 0 and -W 1 you're good :) if you have
> cache problems you'll get much less.

Once I have a 5200 available to play with, I will definitely let you know the 
results.
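
For my own notes, I'm assuming the test you describe looks something like this (a
sketch; the device name is a placeholder):

  fio --ioengine=libaio --rw=randwrite --fsync=1 --iodepth=1 --direct=1 --bs=4k \
      --runtime=60 --name=test --filename=/dev/sdX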


> About Micron 5300's, please benchmark them when you have them as
> described here
> https://docs.google.com/spreadsheets/d/1E9-eXjzsKboiCCX-
> 0u0r5fAjjufLKayaut_FOPxYZjc/edit
> (instructions in the end of the sheet)

Most definitely.  I suspect it will be another month before we get them, 
unfortunately. :(

Eric




Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-18 Thread Eric K. Miller
Hi Vitaliy,

 

Similar to Stefan, we have a bunch of Micron 5200's (3.84TB ECO SATA version) 
in a Ceph cluster (Nautilus) and performance seems less than optimal.  I have 
followed all instructions on your site (thank you for your wonderful article 
btw!!), but I haven't seen much change.

 

The only thing I could think of is that "maybe" disabling the write cache only takes 
effect after a reboot or power cycle?  Is that necessary, or is it a "live" change?

 

I have tested with the cache disabled as well as enabled on all drives.  We're using 
fio running in a QEMU/KVM VM in an OpenStack cluster, so we don't have "raw" access 
to the Micron 5200's.  OSD (BlueStore) nodes run CentOS 7 with a 4.18.x kernel.  
Testing doesn't show any, or much, difference; the variations are small enough that 
they could be considered "noise" in the results.  Certainly no change that anyone 
could tell.
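
(For context, the in-VM tests are the usual small-block sync-write fio runs; a sketch 
with a placeholder file name and size, not the exact command:

  fio --ioengine=libaio --direct=1 --bs=4k --iodepth=1 --fsync=1 --rw=randwrite \
      --runtime=60 --name=vmtest --size=1G --filename=/root/fio-testfile )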

 

Thought I'd check to see if you, or anyone else, might have any suggestions 
specific to the Micron 5200.

 

We have some Micron 5300's inbound, but probably won't have them here for another few 
weeks due to Micron's manufacturing delays, so we will be able to test them as raw 
drives then.  I will report back afterwards, but if you know anything about these, 
I'm all ears. :)

 

Thank you!

 

Eric

 

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stefan 
Bauer
Sent: Tuesday, January 14, 2020 10:28 AM
To: undisclosed-recipients
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we 
expect more? [klartext]

 

Thank you all,

 

performance is indeed better now. Can now go back to sleep ;)

 

KR

 

Stefan

 

-----Original Message-----
From: Виталий Филиппов 
Sent: Tuesday, 14 January 2020 10:28
To: Wido den Hollander ; Stefan Bauer 
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - 
can we expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email 
so as not to repeat myself, but I have it in the article :-)
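
Roughly, the settings in question look like this in ceph.conf (just a sketch using
the standard option names; the article has the details):

  [global]
  cephx_require_signatures = false
  cephx_cluster_require_signatures = false
  cephx_sign_messages = false

  [client]
  rbd_cache = false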
-- 
With best regards,
Vitaliy Filippov 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com