Hi Mariusz,

Thanks a lot for the ideas. I rebooted the client server, mapped the RBD again, and launched the fio test once more; this time it worked... very strange. While the test was running I also ran:

ceph@cephmon01:~$ ceph osd perf
osdid fs_commit_latency(ms) fs_apply_latency(ms)
   0                   506                   22
   1                   465                   26
   2                   490                    3
   3                   623                   13
   4                   548                   68
   5                   484                   16
   6                   448                    2
   7                   523                   27
   8                   489                   30
   9                   498                   52
  10                   472                   12
  11                   407                    7
  12                   315                    0
  13                   540                   17
  14                   599                   18
  15                   420                   14
  16                   515                    7
  17                   395                    3
  18                   565                   14
  19                   557                   59
  20                   515                    7
  21                   689                   56
  22                   474                   10
  23                   142                    1
  24                   364                    7
  25                   390                    6
  26                   507                  107
  27                   573                   20
  28                   158                    1
  29                   490                   25
  30                   301                    0
  31                   381                   15
  32                   440                   27
  33                   482                   16
  34                   323                    9
  35                   414                   21
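
For reference, one quick way to scan that output for outliers — just a sketch; the 600 ms cutoff below is an arbitrary illustration, not a Ceph default, and the sample rows are copied from the table above:

```shell
# Flag OSDs whose fs_commit_latency(ms) exceeds a cutoff.
# Real usage would be:  ceph osd perf | awk 'NR > 1 && $2 > 600 {...}'
ceph_osd_perf_sample() {
cat <<'EOF'
osdid fs_commit_latency(ms) fs_apply_latency(ms)
   3                   623                   13
  21                   689                   56
  23                   142                    1
EOF
}
# Skip the header row, then compare column 2 against the cutoff.
ceph_osd_perf_sample | awk 'NR > 1 && $2 > 600 { print "osd." $1 ": " $2 " ms commit" }'
```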

I don't see anything suspicious here. The fio command was:


$ sudo fio --filename=/dev/rbd0 --direct=1 --rw=write --bs=4m --size=10G --iodepth=16 --ioengine=libaio --runtime=60 --group_reporting --name=fileB
fileB: (g=0): rw=write, bs=4M-4M/4M-4M/4M-4M, ioengine=libaio, iodepth=16
fio-2.1.3
Starting 1 process
Jobs: 1 (f=1): [W] [100.0% done] [0KB/748.0MB/0KB /s] [0/187/0 iops] [eta 00m:00s]
fileB: (groupid=0, jobs=1): err= 0: pid=2172: Thu Aug 14 10:21:13 2014
 write: io=10240MB, bw=741672KB/s, iops=181, runt= 14138msec
   slat (usec): min=569, max=2747, avg=1741.44, stdev=507.08
   clat (msec): min=19, max=465, avg=86.55, stdev=35.16
    lat (msec): min=20, max=466, avg=88.30, stdev=34.92
   clat percentiles (msec):
     |  1.00th=[   39],  5.00th=[   54], 10.00th=[   60], 20.00th=[   64],
     | 30.00th=[   69], 40.00th=[   75], 50.00th=[   81], 60.00th=[   85],
     | 70.00th=[   92], 80.00th=[  102], 90.00th=[  124], 95.00th=[  147],
     | 99.00th=[  217], 99.50th=[  258], 99.90th=[  424], 99.95th=[  441],
     | 99.99th=[  465]
bw (KB /s): min=686754, max=783298, per=99.81%, avg=740262.96, stdev=19845.43
   lat (msec) : 20=0.04%, 50=3.36%, 100=75.51%, 250=20.51%, 500=0.59%
 cpu          : usr=6.18%, sys=12.97%, ctx=11554, majf=0, minf=2225
 IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.3%, 16=99.4%, 32=0.0%, >=64=0.0%
    submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
    issued    : total=r=0/w=2560/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
WRITE: io=10240MB, aggrb=741672KB/s, minb=741672KB/s, maxb=741672KB/s, mint=14138msec, maxt=14138msec

Disk stats (read/write):
rbd0: ios=182/20459, merge=0/0, ticks=92/1213748, in_queue=1214796, util=99.80%
ceph@mail02-old:~$





German Anders

--- Original message ---
Subject: Re: [ceph-users] Performance really drops from 700MB/s to 10MB/s
From: Mariusz Gronczewski <[email protected]>
To: German Anders <[email protected]>
Cc: <[email protected]>
Date: Thursday, 14/08/2014 10:56

Actual OSD (/var/log/ceph/ceph-osd.$id) logs would be more useful.

Few ideas:

* do 'ceph health detail' to get detail of which OSD is stalling
* 'ceph osd perf' to see latency of each osd
* 'ceph --admin-daemon /var/run/ceph/ceph-osd.$id.asok dump_historic_ops' shows "recent slow" ops
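
A small sketch to run that last check across every OSD on a host — this assumes the default admin-socket path /var/run/ceph/ceph-osd.$id.asok mentioned above; pass another directory as $1 to override:

```shell
# Dump recent slow ops from every OSD admin socket found on this host.
dump_all_historic_ops() {
    dir=${1:-/var/run/ceph}
    for sock in "$dir"/ceph-osd.*.asok; do
        [ -S "$sock" ] || continue           # skip when no sockets match
        echo "== $sock =="
        ceph --admin-daemon "$sock" dump_historic_ops
    done
}
dump_all_historic_ops
```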

I actually have a very similar problem: the cluster goes full speed (sometimes even for hours) and then suddenly everything stops for one minute or five; no disk IO, no IO wait (so the disks are fine), no IO errors in the kernel log, and the OSDs only complain that another OSD's subop is slow (but on that OSD everything looks fine too).

On Wed, 13 Aug 2014 16:04:30 -0400, German Anders
<[email protected]> wrote:


Also, even running an "ls -ltr" inside the RBD's mount point freezes
the prompt. Any ideas? I've attached some syslogs from one of the OSD
servers and also from the client. Both are running Ubuntu 14.04 LTS
with kernel 3.15.8.
The cluster is not usable at this point, since I can't even run an "ls"
on the RBD.

Thanks in advance,

Best regards,


German Anders


--- Original message ---
Subject: Re: [ceph-users] Performance really drops from 700MB/s to 10MB/s
From: German Anders <[email protected]>
To: Mark Nelson <[email protected]>
Cc: <[email protected]>
Date: Wednesday, 13/08/2014 11:09


Actually it's very strange: if I run the fio test on the client and in
parallel run iostat on all the OSD servers, I don't see any workload
going on over the disks, I mean... nothing! 0.00... and the fio job on
the client is behaving very oddly too:


$ sudo fio --filename=/dev/rbd1 --direct=1 --rw=write --bs=4m
--size=10G --iodepth=16 --ioengine=libaio --runtime=60
--group_reporting --name=file99
file99: (g=0): rw=write, bs=4M-4M/4M-4M/4M-4M, ioengine=libaio,
iodepth=16
fio-2.1.3
Starting 1 process
Jobs: 1 (f=1): [W] [2.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
01h:26m:43s]

It seems like it's doing nothing...



German Anders



--- Original message ---
Subject: Re: [ceph-users] Performance really drops from 700MB/s to 10MB/s
From: Mark Nelson <[email protected]>
To: <[email protected]>
Date: Wednesday, 13/08/2014 11:00

On 08/13/2014 08:19 AM, German Anders wrote:


Hi to all,

I'm having a particular behavior on a new Ceph cluster. I mapped
an RBD to a client and ran some performance tests with fio; at that
point everything went just fine (the results too :) ). But then I tried
to run another new test on a new RBD on the same client, and suddenly
the performance dropped below 10MB/s and it took almost 10 minutes to
complete a 10G file test. If I issue *ceph -w* I don't see anything
suspicious. Any idea what could be happening here?

When things are going fast, are your disks actually writing data out as
fast as your client IO would indicate? (Don't forget to count
replication!) It may be that the great speed is just writing data into
the tmpfs journals (if the test is only 10GB and spread across 36 OSDs,
it could finish pretty quickly writing to tmpfs!). FWIW, tmpfs journals
aren't very safe. It's not something you want to use outside of testing
except in unusual circumstances.
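
To put rough numbers on that (a sketch using the figures from this thread: a 10G test, osd_pool_default_size = 2, 36 OSDs, 4 GB tmpfs journals), the whole test can indeed land in the journals:

```shell
# Back-of-envelope check: how much of the test does each journal absorb?
test_gb=10; replicas=2; osds=36; journal_gb=4
total_gb=$((test_gb * replicas))         # 20 GB written cluster-wide
per_osd_mb=$((total_gb * 1024 / osds))   # average MB landing on each OSD
echo "~${per_osd_mb} MB per OSD vs a ${journal_gb} GB journal"
```

On average each OSD sees well under its journal size, so the burst never has to touch the disks while the benchmark is running.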

In your tests, when things are bad, it's generally worth checking
whether any one disk/OSD is backed up relative to the others. There are
a couple of ways to accomplish this. The Ceph admin socket can tell you
information about each OSD, i.e. how many outstanding IOs and a history
of slow ops. You can also look at per-disk statistics with something
like iostat or collectl.
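
As a sketch of the iostat route: filter the extended output on the avgqu-sz column. The field number (9) and the queue-depth cutoff (10) are assumptions — column positions differ between sysstat versions — and the sample rows below are made up for illustration:

```shell
# Spot a backed-up disk in `iostat -x` output by its request queue depth.
# Real usage would be something like:  iostat -x 1 2 | awk '...'
iostat_sample() {
cat <<'EOF'
Device: rrqm/s wrqm/s r/s  w/s  rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sdc     0.00   0.00   0.00 10.0 0.00  4096  819.20   0.40     12.0  1.00  5.00
sdg     0.00   0.00   0.00 9.00 0.00  3900  866.70   35.20    950.0 9.80  99.0
EOF
}
# Skip the header, then flag devices whose avgqu-sz ($9) exceeds the cutoff.
iostat_sample | awk 'NR > 1 && $9 > 10 { print $1 " looks backed up (avgqu-sz " $9 ")" }'
```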

Hope this helps!




The cluster is made of:

3 x MON Servers
4 x OSD Servers (3TB SAS 6G disks for OSD daemons & tmpfs for Journal
->
there's one tmpfs of 36GB that is share by 9 OSD daemons, on each
server)
2 x Network SW (Cluster and Public)
10GbE speed on both networks

The ceph.conf file is the following:

[global]
fsid = 56e56e4c-ea59-4157-8b98-acae109bebe1
mon_initial_members = cephmon01, cephmon02, cephmon03
mon_host = 10.97.10.1,10.97.10.2,10.97.10.3
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
filestore_xattr_use_omap = true
public_network = 10.97.0.0/16
cluster_network = 192.168.10.0/24
osd_pool_default_size = 2
glance_api_version = 2

[mon]
debug_optracker = 0

[mon.cephmon01]
host = cephmon01
mon_addr = 10.97.10.1:6789

[mon.cephmon02]
host = cephmon02
mon_addr = 10.97.10.2:6789

[mon.cephmon03]
host = cephmon03
mon_addr = 10.97.10.3:6789

[osd]
journal_dio = false
osd_journal_size = 4096
fstype = btrfs
debug_optracker = 0

[osd.0]
host = cephosd01
devs = /dev/sdc1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.1]
host = cephosd01
devs = /dev/sdd1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.2]
host = cephosd01
devs = /dev/sdf1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.3]
host = cephosd01
devs = /dev/sdg1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.4]
host = cephosd01
devs = /dev/sdi1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.5]
host = cephosd01
devs = /dev/sdj1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.6]
host = cephosd01
devs = /dev/sdl1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.7]
host = cephosd01
devs = /dev/sdm1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.8]
host = cephosd01
devs = /dev/sdn1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.9]
host = cephosd02
devs = /dev/sdc1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.10]
host = cephosd02
devs = /dev/sdd1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.11]
host = cephosd02
devs = /dev/sdf1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.12]
host = cephosd02
devs = /dev/sdg1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.13]
host = cephosd02
devs = /dev/sdi1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.14]
host = cephosd02
devs = /dev/sdj1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.15]
host = cephosd02
devs = /dev/sdl1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.16]
host = cephosd02
devs = /dev/sdm1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.17]
host = cephosd02
devs = /dev/sdn1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.18]
host = cephosd03
devs = /dev/sdc1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.19]
host = cephosd03
devs = /dev/sdd1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.20]
host = cephosd03
devs = /dev/sdf1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.21]
host = cephosd03
devs = /dev/sdg1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.22]
host = cephosd03
devs = /dev/sdi1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.23]
host = cephosd03
devs = /dev/sdj1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.24]
host = cephosd03
devs = /dev/sdl1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.25]
host = cephosd03
devs = /dev/sdm1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.26]
host = cephosd03
devs = /dev/sdn1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.27]
host = cephosd04
devs = /dev/sdc1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.28]
host = cephosd04
devs = /dev/sdd1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.29]
host = cephosd04
devs = /dev/sdf1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.30]
host = cephosd04
devs = /dev/sdg1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.31]
host = cephosd04
devs = /dev/sdi1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.32]
host = cephosd04
devs = /dev/sdj1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.33]
host = cephosd04
devs = /dev/sdl1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.34]
host = cephosd04
devs = /dev/sdm1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[osd.35]
host = cephosd04
devs = /dev/sdn1
osd_journal = /mnt/ramdisk/$cluster-$id-journal

[client.volumes]
keyring = /etc/ceph/ceph.client.volumes.keyring


Thanks in advance,

Best regards,

German Anders


_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







--
Mariusz Gronczewski, Administrator

Efigence S. A.
ul. WoĊ‚oska 9a, 02-583 Warszawa
T: [+48] 22 380 13 13
F: [+48] 22 380 13 14
E: [email protected]


