On 21 December 2015 at 03:23, Yan, Zheng <uker...@gmail.com> wrote:

> On Sat, Dec 19, 2015 at 4:34 AM, Don Waterloo <don.water...@gmail.com>
> wrote:
> > I have 3 systems w/ a cephfs mounted on them.
> > And i am seeing material 'lag'. By 'lag' i mean it hangs for little bits of
> > time (1s, sometimes 5s).
> > But very non repeatable.
> >
> > If i run
> > time find . -type f -print0 | xargs -0 stat > /dev/null
> > it might take ~130ms.
> > But, it might take 10s. Once i've done it, it tends to stay @ the ~130ms,
> > suggesting whatever data is now in cache. On the cases it hangs, if i remove
> > the stat, it's hanging on the find of one file. It might hiccup 1 or 2 times
> > in the find across 10k files.
> >
>
> When operation hangs, do you see any 'slow request ...' log message in
> the cluster log. Besides, do you have multiple clients accessing the
> filesystem? which version of ceph do you use?
>
> Regards
> Yan, Zheng
>

There are some 'slow...' log entries:
ceph.log.1.gz:2015-12-20 21:48:51.047945 osd.5 10.100.10.124:6801/46249 561 : cluster [WRN] slow request 30.492476 seconds old, received at 2015-12-20 21:48:20.555383: osd_op(client.1294098.1:315704 10000056ffe.00000000 [write 0~12475] 13.bf7fb0aa snapc 1=[] ondisk+write e2459) currently waiting for subops from 1

It's ceph 0.94.5-0ubuntu0.15.10.1 on Ubuntu 15.10 w/ kernel 4.3.0-040300-generic.

What does the 'slow request' mean?

The file system is mounted on 3 hosts. The others might be doing some minor access, I suppose, but nothing systemic.

I've had smokeping running between all the OSD machines and have 0 loss, ~0 latency at all times. E.g. it's 200us average, +- 75us.
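In case it helps anyone correlating these warnings with the hangs, here is a minimal Python sketch that pulls the fields out of a 'slow request' line like the one quoted above. The regular expression is inferred from that single sample from 0.94.5, so the layout is an assumption and may not match other releases or other op types. The names `parse_slow_request` and `SAMPLE_LINE` are made up for illustration and are not part of any ceph tool.

```python
import re
from datetime import datetime, timedelta

# Layout inferred from one sample line only; treat it as an assumption.
SLOW_REQUEST_RE = re.compile(
    r"(?P<osd>osd\.\d+) .*?slow request (?P<age>\d+(?:\.\d+)?) seconds old, "
    r"received at (?P<received>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"osd_op\((?P<client>client\.\d+)\.\S+ (?P<object>\S+) "
    r"\[(?P<op>[^\]]*)\].*\) currently (?P<state>.+)$"
)

# The warning quoted in this message, including the zgrep file prefix.
SAMPLE_LINE = (
    "ceph.log.1.gz:2015-12-20 21:48:51.047945 osd.5 10.100.10.124:6801/46249 "
    "561 : cluster [WRN] slow request 30.492476 seconds old, received at "
    "2015-12-20 21:48:20.555383: osd_op(client.1294098.1:315704 "
    "10000056ffe.00000000 [write 0~12475] 13.bf7fb0aa snapc 1=[] "
    "ondisk+write e2459) currently waiting for subops from 1"
)


def parse_slow_request(line):
    """Return a dict of fields from a slow request line, or None if no match."""
    match = SLOW_REQUEST_RE.search(line.strip())
    if match is None:
        return None
    received = datetime.strptime(match.group("received"), "%Y-%m-%d %H:%M:%S.%f")
    age = float(match.group("age"))
    return {
        "osd": match.group("osd"),            # OSD that logged the warning
        "age_seconds": age,                   # how long the op had been pending
        "received": received,                 # when the OSD received the op
        "reported": received + timedelta(seconds=age),  # roughly the log time
        "client": match.group("client"),      # client id that issued the op
        "object": match.group("object"),      # RADOS object name
        "op": match.group("op"),              # e.g. "write 0~12475"
        "state": match.group("state"),        # e.g. "waiting for subops from 1"
    }


if __name__ == "__main__":
    for key, value in parse_slow_request(SAMPLE_LINE).items():
        print(f"{key}: {value}")
```

Feeding every matching line from the cluster log through this and grouping by `osd` and `state` should show whether the delays keep pointing at the same OSD, which can then be lined up against the times the `find` stalls.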
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com