Hi,

On one of our test clusters, I have a node with 4 OSDs on SAS (non-SSD)
drives (sdb, sdc, sdd, sde) and 2 SSD drives (sdf and sdg) acting as
journals for the 4 OSDs (2 journals per SSD):
Model: ATA ST100FM0012 (scsi)
Disk /dev/sdf: 100GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Number Start End Size File system Name Flags
1 1049kB 10.7GB 10.7GB ceph journal
2 10.7GB 21.5GB 10.7GB ceph journal
Model: ATA ST100FM0012 (scsi)
Disk /dev/sdg: 100GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Number Start End Size File system Name Flags
1 1049kB 10.7GB 10.7GB ceph journal
2 10.7GB 21.5GB 10.7GB ceph journal
When I ran a rados bench test, I noticed that the two SSD journal drives
were constantly saturated with I/O requests, so performance is very poor.
A "ceph tell osd.X bench" test gives only around 25 MB/s of throughput,
and iostat shows the two SSD journal drives pegged at (or near) 100%
utilization:
====
avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           2.41   0.00     1.65     3.00    0.00  92.93

Device: rrqm/s wrqm/s  r/s    w/s  rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await svctm    %util
sda       0.00   0.00 0.00   0.00   0.00     0.00     0.00     0.00   0.00    0.00    0.00  0.00     0.00
sdc       0.00   5.67 3.33 166.00  22.67 14923.50   176.53    25.16 148.58    9.60  151.37  2.39    40.40
sdb       0.00   0.00 2.00   0.33  13.33     2.17    13.29     0.01   6.29    4.67   16.00  5.14     1.20
sdd       0.00   0.00 2.00  20.33  10.67  1526.33   137.64     0.12   5.49   10.00    5.05  4.72    10.53
sde       0.00   5.00 4.67  67.67  34.67  5837.00   162.35     8.92 124.06   14.00  131.65  2.49    18.00
sdg       0.00   0.00 0.00  54.67   0.00 25805.33   944.10    36.41 655.88    0.00  655.88 18.29 *100.00*
sdf       0.00   0.00 0.00  53.67   0.00 25252.00   941.07    35.61 636.07    0.00  636.07 18.63 *100.00*

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           2.01   0.00     1.28     2.15    0.00  94.56

Device: rrqm/s wrqm/s  r/s    w/s  rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await svctm   %util
sda       0.00   0.00 0.00   0.00   0.00     0.00     0.00     0.00   0.00    0.00    0.00  0.00    0.00
sdc       0.00   0.00 2.67  14.33  13.33     4.17     2.06     0.03   2.12    8.50    0.93  1.33    2.27
sdb       0.00   0.00 2.00  12.33  10.67  4130.00   577.77     0.09   6.33    4.00    6.70  4.09    5.87
sdd       0.00   0.00 2.33  36.67  18.67 12425.17   638.15     3.77  96.58    9.14  102.15  3.93   15.33
sde       0.00   0.33 1.67 104.33   9.33 11484.00   216.86    11.96 161.61   33.60  163.65  2.93   31.07
sdg       0.00   0.00 0.00  54.33   0.00 25278.67   930.50    33.55 644.54    0.00  644.54 18.38  *99.87*
sdf       0.00   0.00 0.00  58.33   0.00 25493.33   874.06    22.67 422.26    0.00  422.26 17.10  *99.73*

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           2.19   0.00     1.60     3.95    0.00  92.26

Device: rrqm/s wrqm/s  r/s    w/s  rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await svctm   %util
sda       0.00   0.00 0.00   0.00   0.00     0.00     0.00     0.00   0.00    0.00    0.00  0.00    0.00
sdc       0.00   0.00 2.67   9.00  17.33  1431.00   248.29     0.07   6.17    8.00    5.63  5.03    5.87
sdb       0.00   0.33 1.67  88.33   8.00 30435.33   676.52    17.16 139.64   85.60  140.66  3.30   29.73
sdd       0.00   0.00 2.33  17.33  13.33  3040.17   310.53     0.11   5.42    7.43    5.15  4.47    8.80
sde       0.00   0.00 2.67   7.67  14.67  2767.00   538.39     0.08   8.00    8.50    7.83  5.16    5.33
sdg       0.00   0.00 0.00  60.00   0.00 24841.33   828.04    21.26 332.27    0.00  332.27 16.51   99.07
sdf       0.00   0.00 0.00  56.33   0.00 24365.33   865.04    25.00 449.21    0.00  449.21 17.70   99.73
====
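For completeness, the benchmarks above were run roughly as follows (the pool name and duration here are illustrative, not necessarily the exact values I used):

```shell
# Cluster-wide write benchmark against a test pool
rados bench -p rbd 60 write

# Per-OSD backend benchmark (repeat for osd.0 through osd.3)
ceph tell osd.0 bench
```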
Can anyone advise what the problem could be? I am using 100 GB SSD drives
and the journal size is only 10 GB, so the two journal partitions occupy
only 20 GB of the space on each disk. The server uses SATA 3 connectors.
I have not run a dd test against the SSDs, since they hold raw journal
partitions with no mounted filesystem, but dd on the OSD (non-SSD) drives
gives normal results.
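That said, a read-only dd against the raw block device should still be possible even without a filesystem; something like the following is what I had in mind (device name from my setup; a write test would destroy the journals, so only reads are safe here):

```shell
# Sequential read test directly against the raw SSD, bypassing the page cache.
# Safe: reads only. Do NOT swap if= and of= -- that would overwrite the journals.
dd if=/dev/sdf of=/dev/null bs=4M count=256 iflag=direct
```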
I am using Ceph v0.67.7, the latest stable release of Dumpling.
Any advice is appreciated.
Looking forward to your reply, thank you.
Cheers.
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com