Re: [ceph-users] CephFS MDS stuck (failed to rdlock when getattr / lookup)
On 15.04.2018 at 23:04, John Spray wrote:
> On Fri, Apr 13, 2018 at 5:16 PM, Oliver Freyermuth wrote:
>> Dear Cephalopodians,
>>
>> in our cluster (CentOS 7.4, EC Pool, Snappy compression, Luminous 12.2.4),
>> we often have all (~40) clients accessing one file in readonly mode, even
>> with multiple processes per client doing that.
>>
>> Sometimes (I do not yet know when, nor why!) the MDS ends up in a situation
>> like:
>> ---
>> 2018-04-13 18:08:34.37 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>> 292 slow requests, 5 included below; oldest blocked for > 1745.864417 secs
>> 2018-04-13 18:08:34.378900 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>> slow request 960.563534 seconds old, received at 2018-04-13 17:52:33.815273:
>> client_request(client.34720:16487379 getattr pAsLsXsFs #0x109ff6d
>> 2018-04-13 17:52:33.814904 caller_uid=94894, caller_gid=513{513,}) currently
>> failed to rdlock, waiting
>> 2018-04-13 18:08:34.378904 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>> slow request 30.636678 seconds old, received at 2018-04-13 18:08:03.742128:
>> client_request(client.34302:16453640 getattr pAsLsXsFs #0x109ff6d
>> 2018-04-13 18:08:03.741630 caller_uid=94894, caller_gid=513{513,}) currently
>> failed to rdlock, waiting
>> 2018-04-13 18:08:34.378908 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>> slow request 972.648926 seconds old, received at 2018-04-13 17:52:21.729881:
>> client_request(client.34720:16487334 lookup
>> #0x101fcab/sometarball.tar.gz 2018-04-13 17:52:21.729450
>> caller_uid=94894, caller_gid=513{513,}) currently failed to rdlock, waiting
>> 2018-04-13 18:08:34.378913 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>> slow request 1685.953657 seconds old, received at 2018-04-13 17:40:28.425149:
>> client_request(client.34810:16564864 lookup
>> #0x101fcab/sometarball.tar.gz 2018-04-13 17:40:28.424961
>> caller_uid=94894, caller_gid=513{513,}) currently failed to rdlock, waiting
>> 2018-04-13 18:08:34.378918 7f1ce4472700 0 log_channel(cluster) log [WRN] :
>> slow request 1552.743795 seconds old, received at 2018-04-13 17:42:41.635012:
>> client_request(client.34302:16453566 getattr pAsLsXsFs #0x109ff6d
>> 2018-04-13 17:42:41.634726 caller_uid=94894, caller_gid=513{513,}) currently
>> failed to rdlock, waiting
>> ---
>> As you can see ("oldest blocked for > 1745.864417 secs"), it stays in that
>> situation for quite a while.
>> The number of blocked requests is also not decreasing, but instead slowly
>> increasing whenever a new request is added to the queue.
>>
>> We have a setup of one active MDS, a standby-replay, and a standby.
>> Triggering a failover does not help, it only resets the "oldest blocked"
>> time.
>
> Sounds like a client issue (a client is holding a lock on a file but
> failing to relinquish it for another client's request to be
> processed).
>
> Are these kernel (and what version?) or fuse clients?

The full cluster is running with FUSE clients, all on 12.2.4. Additionally,
there are 6 NFS Ganesha servers using FSAL_CEPH, i.e. libcephfs.

Cheers,
Oliver

> John
>
>> I checked the following things on the active MDS:
>> ---
>> # ceph daemon mds.mon001 objecter_requests
>> {
>>     "ops": [],
>>     "linger_ops": [],
>>     "pool_ops": [],
>>     "pool_stat_ops": [],
>>     "statfs_ops": [],
>>     "command_ops": []
>> }
>> # ceph daemon mds.mon001 ops | grep event | grep -v "initiated" | grep -v
>> "failed to rdlock" | grep -v events
>> => no output, only "initiated" and "rdlock" are in the queue.
>> ---
>>
>> There's also almost no CPU load, almost no other I/O, and Ceph is
>> deep-scrubbing 1 pg (this also finishes and the next pg is scrubbed fine),
>> and the scrubbing is not even happening in the metadata pool (easy to see
>> in the Luminous dashboard):
>> ---
>> # ceph -s
>>   cluster:
>>     id:     some_funny_hash
>>     health: HEALTH_WARN
>>             1 MDSs report slow requests
>>
>>   services:
>>     mon: 3 daemons, quorum mon003,mon001,mon002
>>     mgr: mon001(active), standbys: mon002, mon003
>>     mds: cephfs_baf-1/1/1 up {0=mon001=up:active}, 1 up:standby-replay,
>>          1 up:standby
>>     osd: 196 osds: 196 up, 196 in
>>
>>   data:
>>     pools:   2 pools, 4224 pgs
>>     objects: 15649k objects, 61761 GB
>>     usage:   114 TB used, 586 TB / 700 TB avail
>>     pgs:     4223 active+clean
>>              1    active+clean+scrubbing+deep
>>
>>   io:
>>     client: 175 kB/s rd, 3 op/s rd, 0 op/s wr
>> ---
>>
>> Does anybody have any idea what's going on here?
>>
>> Yesterday, this also happened, but resolved itself after about 1 hour.
>> Right now, it's going on for about half an hour...
>>
>> Cheers,
>> Oliver
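Since every queued op is stuck at "failed to rdlock", a quick way to see which clients the blocked requests belong to is to parse the JSON that `ceph daemon mds.<name> ops` emits, instead of grepping it. A minimal sketch: the field names (`description`, `type_data`, `events`) match what Luminous prints but may differ across releases, and the `sample` dict below is a hypothetical stand-in for real admin-socket output, mirroring the clients in the log excerpt above.

```python
from collections import Counter

def blocked_rdlock_requests(ops_dump):
    """Count ops stuck on 'failed to rdlock', grouped by client id.

    ops_dump is the parsed JSON from `ceph daemon mds.<name> ops`
    (field names as seen on Luminous; they may vary by release).
    """
    counts = Counter()
    for op in ops_dump.get("ops", []):
        events = op.get("type_data", {}).get("events", [])
        if any("failed to rdlock" in e.get("event", "") for e in events):
            # description looks like "client_request(client.34720:16487379 ...)"
            desc = op.get("description", "")
            client = desc.split("(", 1)[-1].split(":", 1)[0]
            counts[client] += 1
    return counts

# Hypothetical sample mirroring the log excerpt:
sample = {"ops": [
    {"description": "client_request(client.34720:16487379 getattr ...)",
     "type_data": {"events": [{"event": "initiated"},
                              {"event": "failed to rdlock, waiting"}]}},
    {"description": "client_request(client.34302:16453640 getattr ...)",
     "type_data": {"events": [{"event": "initiated"},
                              {"event": "failed to rdlock, waiting"}]}},
    {"description": "client_request(client.34720:16487334 lookup ...)",
     "type_data": {"events": [{"event": "initiated"},
                              {"event": "failed to rdlock, waiting"}]}},
]}
print(blocked_rdlock_requests(sample))
```

If one client id dominates the counts, that is a candidate for the client holding the capability the others are waiting on.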
Re: [ceph-users] CephFS MDS stuck (failed to rdlock when getattr / lookup)
On Fri, Apr 13, 2018 at 5:16 PM, Oliver Freyermuth wrote:
> Dear Cephalopodians,
>
> in our cluster (CentOS 7.4, EC Pool, Snappy compression, Luminous 12.2.4),
> we often have all (~40) clients accessing one file in readonly mode, even
> with multiple processes per client doing that.
>
> Sometimes (I do not yet know when, nor why!) the MDS ends up in a situation
> like:
> ---
> 2018-04-13 18:08:34.37 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> 292 slow requests, 5 included below; oldest blocked for > 1745.864417 secs
> 2018-04-13 18:08:34.378900 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 960.563534 seconds old, received at 2018-04-13 17:52:33.815273:
> client_request(client.34720:16487379 getattr pAsLsXsFs #0x109ff6d
> 2018-04-13 17:52:33.814904 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> 2018-04-13 18:08:34.378904 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 30.636678 seconds old, received at 2018-04-13 18:08:03.742128:
> client_request(client.34302:16453640 getattr pAsLsXsFs #0x109ff6d
> 2018-04-13 18:08:03.741630 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> 2018-04-13 18:08:34.378908 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 972.648926 seconds old, received at 2018-04-13 17:52:21.729881:
> client_request(client.34720:16487334 lookup #0x101fcab/sometarball.tar.gz
> 2018-04-13 17:52:21.729450 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> 2018-04-13 18:08:34.378913 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 1685.953657 seconds old, received at 2018-04-13 17:40:28.425149:
> client_request(client.34810:16564864 lookup #0x101fcab/sometarball.tar.gz
> 2018-04-13 17:40:28.424961 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> 2018-04-13 18:08:34.378918 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 1552.743795 seconds old, received at 2018-04-13 17:42:41.635012:
> client_request(client.34302:16453566 getattr pAsLsXsFs #0x109ff6d
> 2018-04-13 17:42:41.634726 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> ---
> As you can see ("oldest blocked for > 1745.864417 secs"), it stays in that
> situation for quite a while.
> The number of blocked requests is also not decreasing, but instead slowly
> increasing whenever a new request is added to the queue.
>
> We have a setup of one active MDS, a standby-replay, and a standby.
> Triggering a failover does not help, it only resets the "oldest blocked" time.

Sounds like a client issue (a client is holding a lock on a file but
failing to relinquish it for another client's request to be
processed).

Are these kernel (and what version?) or fuse clients?

John

> I checked the following things on the active MDS:
> ---
> # ceph daemon mds.mon001 objecter_requests
> {
>     "ops": [],
>     "linger_ops": [],
>     "pool_ops": [],
>     "pool_stat_ops": [],
>     "statfs_ops": [],
>     "command_ops": []
> }
> # ceph daemon mds.mon001 ops | grep event | grep -v "initiated" | grep -v
> "failed to rdlock" | grep -v events
> => no output, only "initiated" and "rdlock" are in the queue.
> ---
>
> There's also almost no CPU load, almost no other I/O, and Ceph is
> deep-scrubbing 1 pg (this also finishes and the next pg is scrubbed fine),
> and the scrubbing is not even happening in the metadata pool (easy to see
> in the Luminous dashboard):
> ---
> # ceph -s
>   cluster:
>     id:     some_funny_hash
>     health: HEALTH_WARN
>             1 MDSs report slow requests
>
>   services:
>     mon: 3 daemons, quorum mon003,mon001,mon002
>     mgr: mon001(active), standbys: mon002, mon003
>     mds: cephfs_baf-1/1/1 up {0=mon001=up:active}, 1 up:standby-replay,
>          1 up:standby
>     osd: 196 osds: 196 up, 196 in
>
>   data:
>     pools:   2 pools, 4224 pgs
>     objects: 15649k objects, 61761 GB
>     usage:   114 TB used, 586 TB / 700 TB avail
>     pgs:     4223 active+clean
>              1    active+clean+scrubbing+deep
>
>   io:
>     client: 175 kB/s rd, 3 op/s rd, 0 op/s wr
> ---
>
> Does anybody have any idea what's going on here?
>
> Yesterday, this also happened, but resolved itself after about 1 hour.
> Right now, it's going on for about half an hour...
>
> Cheers,
> Oliver
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
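When the admin socket is not at hand, the contended inode can also be read straight off the slow-request log lines quoted above: every stuck request names an op type, a client, and an inode (`#0x...`). A small sketch, keyed on the exact line format shown in the log excerpt (the two `lines` entries are taken from it):

```python
import re

# Pattern for the MDS slow-request lines quoted above, e.g.
# "client_request(client.34720:16487379 getattr pAsLsXsFs #0x109ff6d ..."
SLOW_RE = re.compile(
    r"client_request\((client\.\d+):\d+ (\w+) [^#]*#(0x[0-9a-f]+)")

def group_slow_requests(log_lines):
    """Group stuck requests by inode so the contended file stands out."""
    by_inode = {}
    for line in log_lines:
        m = SLOW_RE.search(line)
        if m:
            client, op, inode = m.groups()
            by_inode.setdefault(inode, []).append((op, client))
    return by_inode

# Two of the lines from the log excerpt above:
lines = [
    'client_request(client.34720:16487379 getattr pAsLsXsFs #0x109ff6d '
    '2018-04-13 17:52:33.814904 caller_uid=94894, caller_gid=513{513,}) '
    'currently failed to rdlock, waiting',
    'client_request(client.34720:16487334 lookup #0x101fcab/sometarball.tar.gz '
    '2018-04-13 17:52:21.729450 caller_uid=94894, caller_gid=513{513,}) '
    'currently failed to rdlock, waiting',
]
print(group_slow_requests(lines))
```

In the excerpt the getattrs all target `0x109ff6d` and the lookups its parent directory `0x101fcab`, which fits the "many readers of one file" workload described in the report.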
Re: [ceph-users] High TCP retransmission rates, only with Ceph
I should have been more clear. The TCP retransmissions are on the OSD host.

On Sun, Apr 15, 2018 at 1:48 PM, Paweł Sadowski wrote:
> On 04/15/2018 08:18 PM, Robert Stanford wrote:
>>
>> Iperf gives about 7 Gb/s between a radosgw host and one of my OSD hosts
>> (8 disks, 8 OSD daemons, one of 3 OSD hosts). When I benchmark radosgw
>> with cosbench I see high TCP retransmission rates (from sar -n ETCP 1). I
>> don't see this with iperf. Why would Ceph, but not iperf, cause high TCP
>> retransmission rates?
>
> Most probably your application (radosgw in this case) is not able to
> process requests fast enough and some packets are dropped.
>
> --
> PS
Re: [ceph-users] High TCP retransmission rates, only with Ceph
On 04/15/2018 08:18 PM, Robert Stanford wrote:
> Iperf gives about 7 Gb/s between a radosgw host and one of my OSD hosts
> (8 disks, 8 OSD daemons, one of 3 OSD hosts). When I benchmark radosgw
> with cosbench I see high TCP retransmission rates (from sar -n ETCP 1). I
> don't see this with iperf. Why would Ceph, but not iperf, cause high TCP
> retransmission rates?

Most probably your application (radosgw in this case) is not able to
process requests fast enough and some packets are dropped.

--
PS
[ceph-users] High TCP retransmission rates, only with Ceph
Iperf gives about 7 Gb/s between a radosgw host and one of my OSD hosts
(8 disks, 8 OSD daemons, one of 3 OSD hosts). When I benchmark radosgw with
cosbench I see high TCP retransmission rates (from sar -n ETCP 1). I don't
see this with iperf. Why would Ceph, but not iperf, cause high TCP
retransmission rates?

Thank you
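The retransmission rate `sar -n ETCP` reports comes from the kernel's TCP counters in /proc/net/snmp (RetransSegs vs. OutSegs), so it is easy to cross-check on the OSD host without sysstat. A sketch under that assumption; the `before`/`after` dicts are synthetic stand-ins for two real snapshots taken a second apart:

```python
def read_tcp_counters(path="/proc/net/snmp"):
    """Read the kernel's cumulative TCP counters (sar -n ETCP's source)."""
    with open(path) as f:
        rows = [line.split() for line in f if line.startswith("Tcp:")]
    # The first "Tcp:" row holds the field names, the second the values.
    return dict(zip(rows[0][1:], map(int, rows[1][1:])))

def retrans_fraction(prev, curr):
    """Fraction of segments retransmitted between two counter snapshots."""
    sent = curr["OutSegs"] - prev["OutSegs"]
    retrans = curr["RetransSegs"] - prev["RetransSegs"]
    return retrans / sent if sent else 0.0

# Synthetic snapshots; in real use: read_tcp_counters(), sleep(1), read again.
before = {"OutSegs": 100000, "RetransSegs": 50}
after = {"OutSegs": 200000, "RetransSegs": 1050}
print("%.2f%% retransmitted" % (100.0 * retrans_fraction(before, after)))
# -> 1.00% retransmitted
```

Comparing this fraction during an iperf run versus a cosbench run, on both ends of the link, narrows down whether the loss happens on the radosgw side or the OSD side.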
[ceph-users] ZeroDivisionError: float division by zero in /usr/lib/ceph/mgr/dashboard/module.py (12.2.4)
Hi,

I'm not sure if this has been solved since 12.2.4. The same code occurs in a
different file on GitHub:
https://github.com/ceph/ceph/blob/50412f7e9c2691ec10132c8bf9310a05a40e9f9d/src/pybind/mgr/status/module.py

The ZeroDivisionError occurs when the dashboard is open and there is a
network outage (the link between 2 datacenters is broken). I'm not sure
about the behaviour of the actual UI in the dashboard at the same time.

Syslog trace:

ceph-mgr[1324]: [15/Apr/2018:09:47:12] HTTP Traceback (most recent call last):
ceph-mgr[1324]:   File "/usr/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 670, in respond
ceph-mgr[1324]:     response.body = self.handler()
ceph-mgr[1324]:   File "/usr/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 217, in __call__
ceph-mgr[1324]:     self.body = self.oldhandler(*args, **kwargs)
ceph-mgr[1324]:   File "/usr/lib/python2.7/dist-packages/cherrypy/lib/jsontools.py", line 63, in json_handler
ceph-mgr[1324]:     value = cherrypy.serving.request._json_inner_handler(*args, **kwargs)
ceph-mgr[1324]:   File "/usr/lib/python2.7/dist-packages/cherrypy/_cpdispatch.py", line 61, in __call__
ceph-mgr[1324]:     return self.callable(*self.args, **self.kwargs)
ceph-mgr[1324]:   File "/usr/lib/ceph/mgr/dashboard/module.py", line 991, in list_data
ceph-mgr[1324]:     return self._osds_by_server()
ceph-mgr[1324]:   File "/usr/lib/ceph/mgr/dashboard/module.py", line 1040, in _osds_by_server
ceph-mgr[1324]:     osd_map.osds_by_id[osd_id])
ceph-mgr[1324]:   File "/usr/lib/ceph/mgr/dashboard/module.py", line 1007, in _osd_summary
ceph-mgr[1324]:     result['stats'][s.split(".")[1]] = global_instance().get_rate('osd', osd_spec, s)
ceph-mgr[1324]:   File "/usr/lib/ceph/mgr/dashboard/module.py", line 268, in get_rate
ceph-mgr[1324]:     return (data[-1][1] - data[-2][1]) / float(data[-1][0] - data[-2][0])
ceph-mgr[1324]: ZeroDivisionError: float division by zero

HTH,

--
Nicolas Huillard
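The failing expression is the last frame: when the perf counters cannot be refreshed (as during the outage), the two newest samples can carry the same timestamp, so the denominator is zero. A defensive sketch of the computation; note this is a simplified stand-in, since the dashboard's real get_rate() takes (svc_type, svc_name, path) and looks the sample list up itself:

```python
def get_rate(data):
    """Rate of change between the two newest (timestamp, value) samples.

    Simplified, guarded variant of the dashboard's get_rate(): if fewer
    than two samples exist, or both share a timestamp (counters not
    refreshed during an outage), return 0.0 instead of raising
    ZeroDivisionError.
    """
    if len(data) < 2:
        return 0.0
    dt = float(data[-1][0] - data[-2][0])
    if dt == 0.0:
        return 0.0
    return (data[-1][1] - data[-2][1]) / dt
```

With this guard the dashboard would show a (misleading but harmless) zero rate during the outage rather than a 500 from the HTTP handler.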