Hi Udo and Irek,

Good day to you, and thank you for your emails.

>perhaps due to I/Os from the journal?
>You can test with iostat (like "iostat -dm 5 sdg").

Yes, I have shared the iostat results earlier in this same thread. At times
the utilisation of the two journal drives hits 100%, especially when I
simulate writes using the rados bench command. Any suggestions as to what
could be causing the I/O issue?

====
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.85    0.00    1.65    3.14    0.00   93.36

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg               0.00     0.00    0.00   55.00     0.00 25365.33   922.38    34.22  568.90    0.00  568.90  17.82  98.00
sdf               0.00     0.00    0.00   55.67     0.00 25022.67   899.02    29.76  500.57    0.00  500.57  17.60  98.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.10    0.00    1.37    2.07    0.00   94.46

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg               0.00     0.00    0.00   56.67     0.00 25220.00   890.12    23.60  412.14    0.00  412.14  17.62  99.87
sdf               0.00     0.00    0.00   52.00     0.00 24637.33   947.59    33.65  587.41    0.00  587.41  19.23 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.21    0.00    1.77    6.75    0.00   89.27

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg               0.00     0.00    0.00   54.33     0.00 24802.67   912.98    25.75  486.36    0.00  486.36  18.40 100.00
sdf               0.00     0.00    0.00   53.00     0.00 24716.00   932.68    35.26  669.89    0.00  669.89  18.87 100.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.87    0.00    1.67    5.25    0.00   91.21

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg               0.00     0.00    0.00   94.33     0.00 26257.33   556.69    18.29  208.44    0.00  208.44  10.50  99.07
sdf               0.00     0.00    0.00   51.33     0.00 24470.67   953.40    32.75  684.62    0.00  684.62  19.51 100.13

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.51    0.00    1.34    7.25    0.00   89.89

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdg               0.00     0.00    0.00   52.00     0.00 22565.33   867.90    24.73  446.51    0.00  446.51  19.10  99.33
sdf               0.00     0.00    0.00   64.67     0.00 24892.00   769.86    19.50  330.02    0.00  330.02  15.32  99.07
====
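
For reference, this is roughly how the numbers above can be reproduced (a
sketch only -- the pool name "rbd" and the 60-second duration are example
values, not necessarily what I used):

====
# Generate write load on the cluster from a client node:
rados bench -p rbd 60 write

# Meanwhile, on the OSD node, watch the two journal SSDs
# (extended per-device stats in kB, 5-second intervals):
iostat -xk 5 sdf sdg
====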

>What model of SSD do you have?

For this one, I am using a Seagate 100 GB SSD, model HDS-2TM-ST100FM0012.

>Which version of the kernel?

Ubuntu 13.04, Linux kernel version: 3.8.0-19-generic #30-Ubuntu SMP Wed May
1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
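
Since the journal writes synchronously (O_DSYNC), one more thing I can try on
the spare partition is a small-block synchronous write test. A rough sketch
only -- it assumes /dev/sdg3 is still the unused third partition from the
earlier dd test, and it will overwrite data on that partition:

====
# Small synchronous direct writes, similar in spirit to the journal's
# write pattern. Only run this against an unused partition!
dd if=/dev/zero of=/dev/sdg3 bs=4k count=10000 oflag=direct,dsync
====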

Looking forward to your reply, thank you.

Cheers.



On Mon, Apr 28, 2014 at 4:45 PM, Irek Fasikhov <[email protected]> wrote:

> What model of SSD do you have?
> Which version of the kernel?
>
>
>
> 2014-04-28 12:35 GMT+04:00 Udo Lembke <[email protected]>:
>
>> Hi,
>> perhaps due to I/Os from the journal?
>> You can test with iostat (like "iostat -dm 5 sdg").
>>
>> on debian iostat is in the package sysstat.
>>
>> Udo
>>
>> On 28.04.2014 07:38, Indra Pramana wrote:
>> > Hi Craig,
>> >
>> > Good day to you, and thank you for your enquiry.
>> >
>> > As per your suggestion, I have created a 3rd partition on the SSDs and
>> > did the dd test directly on the device, and the result is very slow.
>> >
>> > ====
>> > root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdg3
>> > conv=fdatasync oflag=direct
>> > 128+0 records in
>> > 128+0 records out
>> > 134217728 bytes (134 MB) copied, 19.5223 s, 6.9 MB/s
>> >
>> > root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdf3
>> > conv=fdatasync oflag=direct
>> > 128+0 records in
>> > 128+0 records out
>> > 134217728 bytes (134 MB) copied, 5.34405 s, 25.1 MB/s
>> > ====
>> >
>> > I did a test on another server with exactly the same specification and
>> > the same SSD drive (Seagate SSD 100 GB) that has not been added to the
>> > cluster yet (thus no load), and the result is fast:
>> >
>> > ====
>> > root@ceph-osd-09:/home/indra# dd bs=1M count=128 if=/dev/zero
>> > of=/dev/sdf1 conv=fdatasync oflag=direct
>> > 128+0 records in
>> > 128+0 records out
>> > 134217728 bytes (134 MB) copied, 0.742077 s, 181 MB/s
>> > ====
>> >
>> > Does the Ceph journal load really take up that much of the SSD's
>> > resources? I don't understand how the performance can drop so
>> > significantly, especially since the two Ceph journals only occupy the
>> > first 20 GB of the SSD's 100 GB total capacity.
>> >
>> > Any advice is greatly appreciated.
>> >
>> > Looking forward to your reply, thank you.
>> >
>> > Cheers.
>> >
>> >
>> >
>>
>
>
>
> --
> Best regards, Фасихов Ирек Нургаязович (Irek Fasikhov)
> Mob.: +79229045757
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
