So, after testing the SSD (I wiped one SSD and used it for the tests):
root@ix-s2:~# sudo fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test
journal-test: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0KB/1152KB/0KB /s] [0/288/0 iops] [eta 00m:00s]
journal-test: (groupid=0, jobs=1): err= 0: pid=2849460: Thu Aug 13 10:46:42 2015
  write: io=68972KB, bw=1149.6KB/s, iops=287, runt= 60001msec
    clat (msec): min=2, max=15, avg= 3.48, stdev= 1.08
     lat (msec): min=2, max=15, avg= 3.48, stdev= 1.08
    clat percentiles (usec):
     |  1.00th=[ 2704],  5.00th=[ 2800], 10.00th=[ 2864], 20.00th=[ 2928],
     | 30.00th=[ 3024], 40.00th=[ 3088], 50.00th=[ 3280], 60.00th=[ 3408],
     | 70.00th=[ 3504], 80.00th=[ 3728], 90.00th=[ 3856], 95.00th=[ 4016],
     | 99.00th=[ 9024], 99.50th=[ 9280], 99.90th=[ 9792], 99.95th=[10048],
     | 99.99th=[14912]
    bw (KB /s): min= 1064, max= 1213, per=100.00%, avg=1150.07, stdev=34.31
    lat (msec) : 4=94.99%, 10=4.96%, 20=0.05%
  cpu          : usr=0.13%, sys=0.57%, ctx=17248, majf=0, minf=7
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=17243/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=68972KB, aggrb=1149KB/s, minb=1149KB/s, maxb=1149KB/s, mint=60001msec, maxt=60001msec

Disk stats (read/write):
  sda: ios=0/17224, merge=0/0, ticks=0/59584, in_queue=59576, util=99.30%
So, it's painful... The SSD does only 287 IOPS on 4K sync writes, about 1.1 MB/s.
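As a sanity check on these numbers, here is a minimal sketch in Python. The values are copied by hand from the fio output above; the function names are made up for illustration. It assumes that with one job and iodepth=1 every write waits for the previous one, so IOPS should be roughly 1 / average latency.

```python
# Values copied by hand from the fio output above.
AVG_LATENCY_MS = 3.48    # "clat (msec): ... avg= 3.48"
BLOCK_SIZE_KB = 4        # --bs=4k
RUNTIME_S = 60.001       # "runt= 60001msec"
TOTAL_WRITES = 17243     # "issued : total=r=0/w=17243/d=0"


def iops_from_latency(latency_ms):
    """At queue depth 1 the writes are serialized, so IOPS ~= 1 / latency."""
    return 1000.0 / latency_ms


def bandwidth_kb_per_s(iops, block_size_kb):
    """Throughput is simply the write rate times the block size."""
    return iops * block_size_kb


measured_iops = TOTAL_WRITES / RUNTIME_S
print(f"IOPS from issued writes : {measured_iops:.1f}")
print(f"IOPS from avg latency   : {iops_from_latency(AVG_LATENCY_MS):.1f}")
print(f"Bandwidth (KB/s)        : {bandwidth_kb_per_s(measured_iops, BLOCK_SIZE_KB):.1f}")
```

Both estimates land on about 287 IOPS, so the low bandwidth follows directly from the ~3.5 ms it takes the drive to complete each synchronous 4K write, not from some limit elsewhere in the stack.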
I tried to change the cache mode:
echo temporary write through > /sys/class/scsi_disk/2:0:0:0/cache_type
echo temporary write through > /sys/class/scsi_disk/3:0:0:0/cache_type
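To confirm that the kernel actually picked up the setting, the same sysfs attribute can be read back. Below is a small sketch in Python; the helper name `read_cache_types` is made up for illustration, and only the /sys/class/scsi_disk/<address>/cache_type path comes from the commands above. As far as I understand, the "temporary" prefix only changes what the kernel assumes about the drive's cache and does not reconfigure the drive itself, so the value read back reflects the kernel's view.

```python
from pathlib import Path


def read_cache_types(sysfs_root="/sys/class/scsi_disk"):
    """Return {scsi_address: cache_type} for every disk found under sysfs_root."""
    result = {}
    for entry in sorted(Path(sysfs_root).glob("*/cache_type")):
        try:
            result[entry.parent.name] = entry.read_text().strip()
        except OSError:
            # Attribute not readable (permissions, device removed); skip it.
            continue
    return result


if __name__ == "__main__":
    for address, mode in read_cache_types().items():
        print(f"{address}: {mode}")
```

On the test node this should list 2:0:0:0 and 3:0:0:0 as "write through" if the echo commands took effect.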
No luck, still the same bad results. I also found this article:
https://lkml.org/lkml/2013/11/20/264 which points to an old, very simple patch
that disables CMD_FLUSH:
https://gist.github.com/TheCodeArtist/93dddcd6a21dc81414ba
Does anybody have better ideas on how to improve this (or how to disable
CMD_FLUSH without recompiling the kernel)? I use Ubuntu with kernel 4.0.4 for
now (the 4.x branch because the SSD 850 Pro has an issue with NCQ TRIM, and
before 4.0.4 this exception was not included in libata).
2015-08-12 19:17 GMT+03:00 Pieter Koorts <[email protected]>:
> Hi Igor
>
> I suspect you have very much the same problem as me.
>
> https://www.mail-archive.com/[email protected]/msg22260.html
>
> Basically Samsung drives (like many SATA SSDs) are very much hit and miss,
> so you will need to test them as described here to see if they are any
> good.
> http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
>
> To give you an idea, my average write performance went from 11 MB/s (with
> the Samsung SSD) to 30 MB/s (without any SSD). This is a very small
> cluster.
>
> Pieter
>
> On Aug 12, 2015, at 04:33 PM, Voloshanenko Igor <
> [email protected]> wrote:
>
> Hi all, we have set up a Ceph cluster with 60 OSDs of two different types
> (5 nodes, 12 disks on each: 10 HDDs and 2 SSDs).
>
> We also use a custom CRUSH map with 2 roots:
>
> ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -100 5.00000 root ssd
> -102 1.00000 host ix-s2-ssd
> 2 1.00000 osd.2 up 1.00000 1.00000
> 9 1.00000 osd.9 up 1.00000 1.00000
> -103 1.00000 host ix-s3-ssd
> 3 1.00000 osd.3 up 1.00000 1.00000
> 7 1.00000 osd.7 up 1.00000 1.00000
> -104 1.00000 host ix-s5-ssd
> 1 1.00000 osd.1 up 1.00000 1.00000
> 6 1.00000 osd.6 up 1.00000 1.00000
> -105 1.00000 host ix-s6-ssd
> 4 1.00000 osd.4 up 1.00000 1.00000
> 8 1.00000 osd.8 up 1.00000 1.00000
> -106 1.00000 host ix-s7-ssd
> 0 1.00000 osd.0 up 1.00000 1.00000
> 5 1.00000 osd.5 up 1.00000 1.00000
> -1 5.00000 root platter
> -2 1.00000 host ix-s2-platter
> 13 1.00000 osd.13 up 1.00000 1.00000
> 17 1.00000 osd.17 up 1.00000 1.00000
> 21 1.00000 osd.21 up 1.00000 1.00000
> 27 1.00000 osd.27 up 1.00000 1.00000
> 32 1.00000 osd.32 up 1.00000 1.00000
> 37 1.00000 osd.37 up 1.00000 1.00000
> 44 1.00000 osd.44 up 1.00000 1.00000
> 48 1.00000 osd.48 up 1.00000 1.00000
> 55 1.00000 osd.55 up 1.00000 1.00000
> 59 1.00000 osd.59 up 1.00000 1.00000
> -3 1.00000 host ix-s3-platter
> 14 1.00000 osd.14 up 1.00000 1.00000
> 18 1.00000 osd.18 up 1.00000 1.00000
> 23 1.00000 osd.23 up 1.00000 1.00000
> 28 1.00000 osd.28 up 1.00000 1.00000
> 33 1.00000 osd.33 up 1.00000 1.00000
> 39 1.00000 osd.39 up 1.00000 1.00000
> 43 1.00000 osd.43 up 1.00000 1.00000
> 47 1.00000 osd.47 up 1.00000 1.00000
> 54 1.00000 osd.54 up 1.00000 1.00000
> 58 1.00000 osd.58 up 1.00000 1.00000
> -4 1.00000 host ix-s5-platter
> 11 1.00000 osd.11 up 1.00000 1.00000
> 16 1.00000 osd.16 up 1.00000 1.00000
> 22 1.00000 osd.22 up 1.00000 1.00000
> 26 1.00000 osd.26 up 1.00000 1.00000
> 31 1.00000 osd.31 up 1.00000 1.00000
> 36 1.00000 osd.36 up 1.00000 1.00000
> 41 1.00000 osd.41 up 1.00000 1.00000
> 46 1.00000 osd.46 up 1.00000 1.00000
> 51 1.00000 osd.51 up 1.00000 1.00000
> 56 1.00000 osd.56 up 1.00000 1.00000
> -5 1.00000 host ix-s6-platter
> 12 1.00000 osd.12 up 1.00000 1.00000
> 19 1.00000 osd.19 up 1.00000 1.00000
> 24 1.00000 osd.24 up 1.00000 1.00000
> 29 1.00000 osd.29 up 1.00000 1.00000
> 34 1.00000 osd.34 up 1.00000 1.00000
> 38 1.00000 osd.38 up 1.00000 1.00000
> 42 1.00000 osd.42 up 1.00000 1.00000
> 50 1.00000 osd.50 up 1.00000 1.00000
> 53 1.00000 osd.53 up 1.00000 1.00000
> 57 1.00000 osd.57 up 1.00000 1.00000
> -6 1.00000 host ix-s7-platter
> 10 1.00000 osd.10 up 1.00000 1.00000
> 15 1.00000 osd.15 up 1.00000 1.00000
> 20 1.00000 osd.20 up 1.00000 1.00000
> 25 1.00000 osd.25 up 1.00000 1.00000
> 30 1.00000 osd.30 up 1.00000 1.00000
> 35 1.00000 osd.35 up 1.00000 1.00000
> 40 1.00000 osd.40 up 1.00000 1.00000
> 45 1.00000 osd.45 up 1.00000 1.00000
> 49 1.00000 osd.49 up 1.00000 1.00000
> 52 1.00000 osd.52 up 1.00000 1.00000
>
>
> Then we created 2 pools, one on the HDDs (platters) and one on the SSDs,
> and put the SSD pool in front of the HDD pool (as a cache tier).
>
> Now we get very bad performance results from the cluster.
> Even with rados bench we see very unstable performance, sometimes dropping
> to zero. This creates very big issues for our clients.
>
> I tried to tune all possible values, including the OSD settings, but still
> no luck.
>
> Another unbelievable thing: when I run
> ceph tell... bench on an SSD OSD, I get about 20 MB/s,
> while for an HDD OSD I get 67 MB/s...
>
> I don't understand why a cache pool that consists of SSDs works so badly...
> We use Samsung 850 Pro 256 GB drives as the SSDs.
>
> Can you guys give me some advice, please...
>
> Another strange thing: when I set cache-mode to forward and tried to
> flush-evict all objects (not all objects were evicted; some were busy,
> locked on the KVM side), I started getting quite stable results from rados
> bench:
>
> Total time run: 30.275871
> Total writes made: 2076
> Write size: 4194304
> Bandwidth (MB/sec): 274.278
>
> Stddev Bandwidth: 75.1445
> Max bandwidth (MB/sec): 368
> Min bandwidth (MB/sec): 0
> Average Latency: 0.232892
> Stddev Latency: 0.240356
> Max latency: 2.01436
> Min latency: 0.0716344
>
> Without zeros, etc... So I don't understand how this is possible.
>
> Another interesting thing: when I disable the overlay for the pool, rados
> bench goes to around 70 MB/s, as for ordinary HDDs, but at the same time
> rados bench on the SSD pool, which is not used anymore, shows the same bad
> results...
>
> So please, give me some direction to dig in...
>
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>