On Mon, Apr 1, 2019 at 1:46 PM Yan, Zheng <uker...@gmail.com> wrote:
>
> On Mon, Apr 1, 2019 at 6:45 PM Dan van der Ster <d...@vanderster.com> wrote:
> >
> > Hi all,
> >
> > We have been benchmarking a hyperconverged cephfs cluster (kernel
> > clients + osd on same machines) for a while. Over the weekend (for the
> > first time) we had one cephfs mount deadlock while some clients were
> > running ior.
> >
> > All the ior processes are stuck in D state with this stack:
> >
> > [<ffffffffafdb53a3>] wait_on_page_bit+0x83/0xa0
> > [<ffffffffafdb54d1>] __filemap_fdatawait_range+0x111/0x190
> > [<ffffffffafdb5564>] filemap_fdatawait_range+0x14/0x30
> > [<ffffffffafdb79e6>] filemap_write_and_wait_range+0x56/0x90
> > [<ffffffffc0f11575>] ceph_fsync+0x55/0x420 [ceph]
> > [<ffffffffafe76247>] do_fsync+0x67/0xb0
> > [<ffffffffafe76530>] SyS_fsync+0x10/0x20
> > [<ffffffffb0372d5b>] system_call_fastpath+0x22/0x27
> > [<ffffffffffffffff>] 0xffffffffffffffff
> >
>
> are there hung osd requests in /sys/kernel/debug/ceph/xxx/osdc?

We never managed to reproduce the problem on this cluster.

But on a separate (not co-located) cluster we had a similar issue. A
client was stuck like this for several hours:

HEALTH_WARN 1 clients failing to respond to capability release; 1 MDSs report slow requests
MDS_CLIENT_LATE_RELEASE 1 clients failing to respond to capability release
    mdscephflax-mds-2a4cfd0e2c(mds.1): Client hpc070.cern.ch:hpcscid02 failing to respond to capability release client_id: 69092525
MDS_SLOW_REQUEST 1 MDSs report slow requests
    mdscephflax-mds-2a4cfd0e2c(mds.1): 1 slow requests are blocked > 30 sec
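(Aside, for anyone hitting the same thing: the host behind a client_id can
be looked up by asking the active MDS for its sessions. A minimal sketch,
assuming an admin socket on the MDS host and using our daemon name
cephflax-mds-2a4cfd0e2c, so adjust for your own:

    # on the host running the active MDS: dump all client sessions and
    # look for the offending id; client_metadata in the output includes
    # the client hostname and mount point
    ceph daemon mds.cephflax-mds-2a4cfd0e2c session ls | grep -A 20 '"id": 69092525'

)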


Indeed there was a hung write on hpc070.cern.ch:

245540  osd100  1.9443e2a5      1.2a5   [100,1,75]/100  [100,1,75]/100  e74658  fsvolumens_393f2dcc-6b09-44d7-8d20-0e84b072ed26/2000b2f5905.00000001   0x400024        1       write
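
(That line is from the kernel client's debugfs, which is how we answered
the osdc question above. A quick sketch to dump in-flight OSD requests for
every cephfs mount on a node, assuming debugfs is mounted at
/sys/kernel/debug and run as root:

    # one directory per kernel mount, named <fsid>.<client-id>;
    # osdc lists OSD requests that have not completed yet
    for f in /sys/kernel/debug/ceph/*/osdc; do
        echo "== $f"; cat "$f"
    done

A request that sits there for minutes while the cluster is otherwise
healthy is a good deadlock suspect.)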

I restarted osd.100 and the deadlocked request went away.
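(Concretely, that was just a daemon restart; a sketch assuming a
systemd-managed OSD, where the unit name may differ per deployment:

    # bounce the OSD holding the stuck request
    systemctl restart ceph-osd@100

)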
Does this sound like a known issue?

Thanks, Dan