On Thu, May 17, 2018 at 4:16 PM, Uwe Sauter <uwe.sauter...@gmail.com> wrote:
> Hi,
>>> I'm currently chewing on an issue regarding "slow requests are blocked".
>>> I'd like to identify the OSD that is causing those events
>>> once the cluster is back to HEALTH_OK (as I have no monitoring yet that
>>> would get this info in realtime).
>>> Collecting this information could help identify aging disks if you were
>>> able to accumulate and analyze which OSD had blocking
>>> requests in the past and how often those events occur.
>>> My research so far leads me to think that this information is only available
>>> as long as the requests are actually blocked. Is this
>>> correct?
>> You don't give any indication of what version you are running, but see
>> https://tracker.ceph.com/issues/23205
> the cluster is a Proxmox installation which is based on an Ubuntu kernel.
> # ceph -v
> ceph version 12.2.5 (dfcb7b53b2e4fcd2a5af0240d4975adc711ab96e) luminous
> (stable)
> The mystery is that these blocked requests occur in large numbers when at
> least one of the 6 servers is booted with kernel 4.15.17; if all are running
> 4.13.16, blocked requests are infrequent and few.

Sounds like you need to profile your two kernel versions and work out
why one is under-performing.
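
For the after-the-fact accounting asked about above, one option is to count
slow-request warnings per OSD from whatever log files were kept. The following
is only a rough sketch, not a tested tool: it assumes each relevant log line
contains the phrase "slow request" and names the reporting daemon as
"osd.<id>", which may not match the log layout of every release. The function
name and the file path in the usage comment are hypothetical.

```python
import re
import sys
from collections import Counter

# Assumption: a relevant line contains "slow request" and identifies the
# reporting daemon as "osd.<id>". Adjust the pattern to the real log layout.
OSD_RE = re.compile(r"\bosd\.(\d+)\b")


def count_slow_requests(lines):
    """Return a Counter mapping OSD id -> number of slow-request log lines."""
    counts = Counter()
    for line in lines:
        if "slow request" not in line:
            continue
        match = OSD_RE.search(line)  # first "osd.<id>" token on the line
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical path): python slow_osds.py /var/log/ceph/ceph.log
    with open(sys.argv[1]) as fh:
        for osd, n in count_slow_requests(fh).most_common():
            print(f"osd.{osd}: {n}")
```

Running this per host and per boot (4.13.16 vs. 4.15.17) would at least show
whether the warnings cluster on particular OSDs or follow the kernel.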

> Regards,
>         Uwe
