Dear Cephalopodians,

a small addition.
As far as I know, the I/O the user is performing is based on the following
directory structure:
datafolder/some_older_tarball.tar.gz
datafolder/sometarball.tar.gz
datafolder/processing_number_2/
datafolder/processing_number_3/
datafolder/processing_number_4/
The problem appeared to start when:
- many clients were reading from datafolder/some_older_tarball.tar.gz, but
extracting somewhere else (to another filesystem);
- then, a single client started to create datafolder/sometarball.tar.gz,
packaging files from another filesystem.
Can this cause such a lockup? If so, can we prevent it somehow?
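
If the single writing client is indeed the culprit, I assume we could at least
identify its session on the MDS and, as a last resort, evict it. A rough sketch
(assuming mon001 is still the active MDS; the client ID below is just taken as
an example from the warnings quoted further down, the session actually holding
the conflicting caps would have to be identified first):
-----------------------------------------------------------
# list all client sessions with their IDs, hostnames and mount points
ceph daemon mds.mon001 session ls

# last resort: evict the session holding the conflicting caps
ceph tell mds.mon001 client evict id=34720
-----------------------------------------------------------
I am not sure this is the recommended way to handle such a situation, though.
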
Cheers,
Oliver
On 13.04.2018 at 18:16, Oliver Freyermuth wrote:
> Dear Cephalopodians,
>
> in our cluster (CentOS 7.4, EC pool, Snappy compression, Luminous 12.2.4),
> we often have all (~40) clients accessing one file in read-only mode,
> sometimes even with multiple processes per client doing so.
>
> Sometimes (I do not yet know when, nor why!) the MDS ends up in a situation
> like:
> -----------------------------------------------------------
> 2018-04-13 18:08:34.378888 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> 292 slow requests, 5 included below; oldest blocked for > 1745.864417 secs
> 2018-04-13 18:08:34.378900 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 960.563534 seconds old, received at 2018-04-13 17:52:33.815273:
> client_request(client.34720:16487379 getattr pAsLsXsFs #0x1000009ff6d
> 2018-04-13 17:52:33.814904 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> 2018-04-13 18:08:34.378904 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 30.636678 seconds old, received at 2018-04-13 18:08:03.742128:
> client_request(client.34302:16453640 getattr pAsLsXsFs #0x1000009ff6d
> 2018-04-13 18:08:03.741630 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> 2018-04-13 18:08:34.378908 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 972.648926 seconds old, received at 2018-04-13 17:52:21.729881:
> client_request(client.34720:16487334 lookup #0x1000001fcab/sometarball.tar.gz
> 2018-04-13 17:52:21.729450 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> 2018-04-13 18:08:34.378913 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 1685.953657 seconds old, received at 2018-04-13 17:40:28.425149:
> client_request(client.34810:16564864 lookup #0x1000001fcab/sometarball.tar.gz
> 2018-04-13 17:40:28.424961 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> 2018-04-13 18:08:34.378918 7f1ce4472700 0 log_channel(cluster) log [WRN] :
> slow request 1552.743795 seconds old, received at 2018-04-13 17:42:41.635012:
> client_request(client.34302:16453566 getattr pAsLsXsFs #0x1000009ff6d
> 2018-04-13 17:42:41.634726 caller_uid=94894, caller_gid=513{513,}) currently
> failed to rdlock, waiting
> -----------------------------------------------------------
> As you can see (oldest blocked for > 1745.864417 secs), it stays in that
> situation for quite a while.
> The number of blocked requests is also not decreasing; instead, it slowly
> grows whenever a new request is added to the queue.
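> 
> (I assume the full detail of these stuck requests, including the exact
> flag_point they are waiting at, could also be dumped via the admin socket,
> e.g. with:
> -----------------------------------------------------------
> # on the host running the active MDS
> ceph daemon mds.mon001 dump_blocked_ops
> -----------------------------------------------------------
> but please correct me if there is a better way.)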
>
> We have a setup of one active MDS, a standby-replay, and a standby.
> Triggering a failover does not help; it only resets the "oldest blocked"
> time.
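> 
> (For reference, I assume the canonical way to trigger such a failover is to
> simply fail the active MDS so that the standby-replay takes over, i.e.
> something like:
> -----------------------------------------------------------
> ceph mds fail mon001
> -----------------------------------------------------------
> please correct me if that is not the proper way.)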
>
> I checked the following things on the active MDS:
> -----------------------------------------------------------
> # ceph daemon mds.mon001 objecter_requests
> {
>     "ops": [],
>     "linger_ops": [],
>     "pool_ops": [],
>     "pool_stat_ops": [],
>     "statfs_ops": [],
>     "command_ops": []
> }
> # ceph daemon mds.mon001 ops | grep event | grep -v "initiated" | grep -v
> "failed to rdlock" | grep -v events
> => no output, i.e. only "initiated" and "failed to rdlock" events are in the
> queue.
> -----------------------------------------------------------
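> 
> What I did not try yet: I assume the inode from the warnings (0x1000009ff6d)
> and the client actually holding the conflicting caps on it could be tracked
> down via a cache dump, something like (the output path is arbitrary, and I am
> unsure whether dumping the cache is advisable on a busy MDS):
> -----------------------------------------------------------
> # dump the MDS cache to a file on the host running the active MDS
> ceph daemon mds.mon001 dump cache /tmp/mds-cache.txt
> # look for the inode from the slow requests and the caps attached to it
> grep 1000009ff6d /tmp/mds-cache.txt
> -----------------------------------------------------------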
>
> There's also almost no CPU load and almost no other I/O. Ceph is
> deep-scrubbing a single PG (this also finishes, and the next PG is scrubbed
> fine), and the scrubbing is not even happening in the metadata pool (easy to
> see in the Luminous dashboard):
> -----------------------------------------------------------
> # ceph -s
> cluster:
>     id:     some_funny_hash
>     health: HEALTH_WARN
>             1 MDSs report slow requests
> 
> services:
>     mon: 3 daemons, quorum mon003,mon001,mon002
>     mgr: mon001(active), standbys: mon002, mon003
>     mds: cephfs_baf-1/1/1 up {0=mon001=up:active}, 1 up:standby-replay, 1 up:standby
>     osd: 196 osds: 196 up, 196 in
> 
> data:
>     pools:   2 pools, 4224 pgs
>     objects: 15649k objects, 61761 GB
>     usage:   114 TB used, 586 TB / 700 TB avail
>     pgs:     4223 active+clean
>              1    active+clean+scrubbing+deep
> 
> io:
>     client: 175 kB/s rd, 3 op/s rd, 0 op/s wr
> -----------------------------------------------------------
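> 
> (For completeness, I assume the same could also be verified on the CLI by
> checking which pool the deep-scrubbing PG belongs to, since the pool ID is
> the part of the PG ID before the dot:
> -----------------------------------------------------------
> ceph pg dump pgs_brief | grep scrubbing+deep
> ceph osd lspools
> -----------------------------------------------------------
> Please correct me if there is a more direct way.)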
>
> Does anybody have any idea what's going on here?
>
> Yesterday, this also happened, but it resolved itself after about an hour.
> Right now, it has been going on for about half an hour...
>
> Cheers,
> Oliver
>
