Re: [ceph-users] Troubleshooting hanging storage backend whenever there is any cluster change

Burkhard Linke Fri, 12 Oct 2018 05:03:32 -0700

Hi,


On 10/12/2018 01:55 PM, Nils Fahldieck - Profihost AG wrote:

I rebooted a Ceph host and logged `ceph status` & `ceph health detail`
every 5 seconds. During this I encountered 'PG_AVAILABILITY Reduced data
availability: pgs peering'. At the same time some VMs hung as described
before.

Just a wild guess... you have 71 OSDs and about 4500 PGs with size=3. That makes 13500 PG instances overall, resulting in ~190 PGs per OSD under normal circumstances.

If one host is down and the PGs have to re-peer, you might reach the limit of 200 PGs per OSD on some of the OSDs, resulting in stuck peering.
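
To make the arithmetic concrete, here is a rough sketch in Python. It only computes averages and assumes every PG instance is spread evenly over the OSDs that remain up; real placement is uneven, so individual OSDs can hit the limit earlier. The helper name pgs_per_osd and the range of failed OSDs are my own illustration; the thread does not say how many OSDs the rebooted host carries.

```python
def pgs_per_osd(pg_count, replica_size, osd_count):
    """Average number of PG instances per OSD, assuming an even spread."""
    return pg_count * replica_size / osd_count


# Figures from this thread: ~4500 PGs, size=3, 71 OSDs, limit of 200 PGs per OSD.
PG_COUNT = 4500
REPLICA_SIZE = 3
OSD_COUNT = 71
PG_LIMIT = 200

normal = pgs_per_osd(PG_COUNT, REPLICA_SIZE, OSD_COUNT)
print(f"all {OSD_COUNT} OSDs up: {normal:.1f} PGs per OSD on average")

# Hypothetical: the number of OSDs on the rebooted host is not stated,
# so try a few values and see where the average crosses the limit.
for osds_down in range(1, 7):
    avg = pgs_per_osd(PG_COUNT, REPLICA_SIZE, OSD_COUNT - osds_down)
    state = "over the limit" if avg > PG_LIMIT else "below the limit"
    print(f"{osds_down} OSD(s) down: {avg:.1f} PGs per OSD on average ({state})")
```

Under these assumptions the average already crosses 200 once four OSDs are gone, so a host with four or more OSDs would be enough.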

You can try to raise this limit. There are several threads on the mailing list about this.


Regards,
Burkhard

--
Dr. rer. nat. Burkhard Linke
Bioinformatics and Systems Biology
Justus-Liebig-University Giessen
35392 Giessen, Germany
Phone: (+49) (0)641 9935810

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
