On 16.04.2018 at 02:43, Oliver Freyermuth wrote:
> On 15.04.2018 at 23:04, John Spray wrote:
>> On Fri, Apr 13, 2018 at 5:16 PM, Oliver Freyermuth
>> <[email protected]> wrote:
>>> Dear Cephalopodians,
>>>
>>> in our cluster (CentOS 7.4, EC Pool, Snappy compression, Luminous 12.2.4),
>>> we often have all (~40) clients accessing one file in readonly mode, even
>>> with multiple processes per client doing that.
>>>
>>> Sometimes (I do not yet know when, nor why!) the MDS ends up in a situation
>>> like:
>>> -----------------------------------------------------------
>>> 2018-04-13 18:08:34.378888 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>>> 292 slow requests, 5 included below; oldest blocked for > 1745.864417 secs
>>> 2018-04-13 18:08:34.378900 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>>> slow request 960.563534 seconds old, received at 2018-04-13
>>> 17:52:33.815273: client_request(client.34720:16487379 getattr pAsLsXsFs
>>> #0x1000009ff6d 2018-04-13 17:52:33.814904 caller_uid=94894,
>>> caller_gid=513{513,}) currently failed to rdlock, waiting
>>> 2018-04-13 18:08:34.378904 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>>> slow request 30.636678 seconds old, received at 2018-04-13 18:08:03.742128:
>>> client_request(client.34302:16453640 getattr pAsLsXsFs #0x1000009ff6d
>>> 2018-04-13 18:08:03.741630 caller_uid=94894, caller_gid=513{513,})
>>> currently failed to rdlock, waiting
>>> 2018-04-13 18:08:34.378908 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>>> slow request 972.648926 seconds old, received at 2018-04-13
>>> 17:52:21.729881: client_request(client.34720:16487334 lookup
>>> #0x1000001fcab/sometarball.tar.gz 2018-04-13 17:52:21.729450
>>> caller_uid=94894, caller_gid=513{513,}) currently failed to rdlock, waiting
>>> 2018-04-13 18:08:34.378913 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>>> slow request 1685.953657 seconds old, received at 2018-04-13
>>> 17:40:28.425149: client_request(client.34810:16564864 lookup
>>> #0x1000001fcab/sometarball.tar.gz 2018-04-13 17:40:28.424961
>>> caller_uid=94894, caller_gid=513{513,}) currently failed to rdlock, waiting
>>> 2018-04-13 18:08:34.378918 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>>> slow request 1552.743795 seconds old, received at 2018-04-13
>>> 17:42:41.635012: client_request(client.34302:16453566 getattr pAsLsXsFs
>>> #0x1000009ff6d 2018-04-13 17:42:41.634726 caller_uid=94894,
>>> caller_gid=513{513,}) currently failed to rdlock, waiting
>>> -----------------------------------------------------------
>>> As you can see (oldest blocked for > 1745.864417 secs), it stays in that
>>> situation for quite a while.
>>> The number of blocked requests is also not decreasing, but instead slowly
>>> increasing whenever a new request is added to the queue.
>>>
>>> We have a setup of one active MDS, a standby-replay, and a standby.
>>> Triggering a failover does not help, it only resets the "oldest blocked"
>>> time.
>>
>> Sounds like a client issue (a client is holding a lock on a file but
>> failing to relinquish it for another client's request to be
>> processed).
>>
>> Are these kernel (and what version?) or fuse clients?
>
> The full cluster is running with Fuse clients, all on 12.2.4.
> Additionally, there are 6 NFS Ganesha servers using FSAL_CEPH, i.e.
> libcephfs.
>
> Cheers,
> Oliver
Related to that: in case this happens again, which it surely will sooner or
later, how can I diagnose which client is holding a lock and not
relinquishing it?
Is there a way to dump all currently held locks, ideally together with how
long they have been held?
ceph daemon mds.XXX help
does not yield anything obvious.
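So far, the closest I have come up with is to correlate the blocked ops with
the client sessions on the MDS side. A rough sketch of what I would try
(assuming "session ls", "dump_ops_in_flight" and "dump cache" are indeed the
right admin socket commands for this, and with /tmp/mds_cache.txt just as an
example dump path):
-----------------------------------------------------------
# ceph daemon mds.mon001 session ls
    (lists all client sessions, so the client.XXXX ids from the
     slow-request messages can be mapped to hostnames / mount points)
# ceph daemon mds.mon001 dump_ops_in_flight
    (essentially the same as "ops": the currently blocked requests,
     including the client that issued each of them)
# ceph daemon mds.mon001 dump cache /tmp/mds_cache.txt
    (dumps the MDS cache, including per-inode capabilities and lock
     states, to a file on the MDS host)
-----------------------------------------------------------
But as far as I can see, this only shows the waiters and a snapshot of the
caps, not "client X has been holding this lock for Y seconds", which is what
I am really after.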
Cheers and thanks,
Oliver
>
>>
>> John
>>
>>>
>>> I checked the following things on the active MDS:
>>> -----------------------------------------------------------
>>> # ceph daemon mds.mon001 objecter_requests
>>> {
>>>     "ops": [],
>>>     "linger_ops": [],
>>>     "pool_ops": [],
>>>     "pool_stat_ops": [],
>>>     "statfs_ops": [],
>>>     "command_ops": []
>>> }
>>> # ceph daemon mds.mon001 ops | grep event | grep -v "initiated" | grep -v
>>> "failed to rdlock" | grep -v events
>>> => no output, only "initiated" and "rdlock" are in the queue.
>>> -----------------------------------------------------------
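(Side note on this check: the corresponding view from the client side might
also be of interest. The ceph-fuse clients expose an admin socket as well; a
sketch, assuming the socket is enabled and that its path under /var/run/ceph/
is just an example:
-----------------------------------------------------------
# ceph daemon /var/run/ceph/ceph-client.admin.asok mds_sessions
    (the sessions this particular client has with the MDSs)
# ceph daemon /var/run/ceph/ceph-client.admin.asok mds_requests
    (the MDS requests this client currently has in flight)
-----------------------------------------------------------
I am not sure, though, whether any of these shows which capabilities the
client is currently holding, and for how long.)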
>>>
>>> There's also almost no CPU load, almost no other I/O, and ceph is
>>> deep-scrubbing ~1 pg (this also finishes and the next pg is scrubbed fine),
>>> and the scrubbing is not even happening in the metadata pool (easy to see
>>> in the Luminous dashboard):
>>> -----------------------------------------------------------
>>> # ceph -s
>>>   cluster:
>>>     id:     some_funny_hash
>>>     health: HEALTH_WARN
>>>             1 MDSs report slow requests
>>>
>>>   services:
>>>     mon: 3 daemons, quorum mon003,mon001,mon002
>>>     mgr: mon001(active), standbys: mon002, mon003
>>>     mds: cephfs_baf-1/1/1 up {0=mon001=up:active}, 1 up:standby-replay, 1
>>>          up:standby
>>>     osd: 196 osds: 196 up, 196 in
>>>
>>>   data:
>>>     pools:   2 pools, 4224 pgs
>>>     objects: 15649k objects, 61761 GB
>>>     usage:   114 TB used, 586 TB / 700 TB avail
>>>     pgs:     4223 active+clean
>>>              1    active+clean+scrubbing+deep
>>>
>>>   io:
>>>     client:   175 kB/s rd, 3 op/s rd, 0 op/s wr
>>> -----------------------------------------------------------
>>>
>>> Does anybody have any idea what's going on here?
>>>
>>> Yesterday, this also happened, but resolved itself after about 1 hour.
>>> Right now, it's going on for about half an hour...
>>>
>>> Cheers,
>>> Oliver
>>>
>>>
