Hi all,

We had a faulty OSD that was going up and down for a few hours until Ceph marked it out. During this time CephFS remained accessible; however, for about 10 minutes all NFS processes (kernel NFSv3) on a server exporting CephFS were hung, locking up all the NFS clients. The cluster was healthy before the faulty OSD. I'm trying to understand whether this is expected behaviour, a bug, or something else. Any insights would be appreciated.
MDS: active/passive
Ceph: Jewel 10.2.2
Ceph client kernel: 3.10.0-514.6.1.el7.x86_64
CephFS mount options: (rw,relatime,name=admin,secret=<hidden>,acl)

I can see some slow requests in the MDS log during the time the NFS processes were hung, some for setattr calls:

2017-06-15 04:29:37.081175 7f889401f700 0 log_channel(cluster) log [WRN] : slow request 60.974528 seconds old, received at 2017-06-15 04:28:36.106598: client_request(client.2622511:116375892 setattr size=0 #100025b3554 2017-06-15 04:28:36.104928) currently acquired locks

and some for getattr:

2017-06-15 04:29:42.081224 7f889401f700 0 log_channel(cluster) log [WRN] : slow request 32.225883 seconds old, received at 2017-06-15 04:29:09.855302: client_request(client.2622511:116380541 getattr pAsLsXsFs #100025b4d37 2017-06-15 04:29:09.853772) currently failed to rdlock, waiting

And a "client isn't responding to mclientcaps(revoke)" warning:

2017-06-15 04:31:12.084561 7f889401f700 0 log_channel(cluster) log [WRN] : client.2344872 isn't responding to mclientcaps(revoke), ino 100025b4d37 pending pAsxLsXsxFcb issued pAsxLsXsxFsxcrwb, sent 122.229172 seconds ago

These issues seemed to clear once the faulty OSD was marked out.

In general, I have noticed that the NFS processes exporting CephFS spend a lot of time in 'D' state, with WCHAN showing 'lock_page', compared with an NFS server exporting a local file system. Also, NFS performance hasn't been great with small reads/writes, particularly writes with the default sync export option; I've had to export with async for the time being. I haven't had a chance to troubleshoot this in any depth yet, just mentioning it in case it's relevant.

Thanks,
David
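In case it's useful to anyone reproducing this: the D-state/WCHAN observation above can be checked with a one-liner (a minimal sketch, assuming a standard procps-style ps; the column width of 32 for WCHAN is just to avoid truncation):

```shell
# Print the header plus any task currently in uninterruptible sleep
# ('D' state), together with the kernel function it is blocked in (WCHAN).
# nfsd threads stuck in lock_page would show up in this list.
ps -eo pid,stat,wchan:32,comm | awk 'NR==1 || $2 ~ /^D/'
```

On a healthy server this usually prints just the header; a column of nfsd entries all waiting in lock_page would match what I'm seeing here.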
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com