"osd_peering_wq_threads": "2",
> "osd_recovery_thread_suicide_timeout": "300",
> "osd_recovery_thread_timeout": "30",
> "osd_remove_thread_suicide_timeout": "36000",
> "osd_remove_th
uot;,
"osd_recovery_thread_suicide_timeout": "300",
"osd_recovery_thread_timeout": "30",
"osd_remove_thread_suicide_timeout": "36000",
"osd_remove_thread_timeout": "3600",
-----Original Message-----
This message seems to be very concerning:
>mds0: Metadata damage detected
but for the rest, the cluster still seems to be recovering. You could try
to speed things up with ceph tell, like:
ceph tell osd.* injectargs --osd_max_backfills=10
ceph tell osd.* injectargs
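For example, a commonly paired knob (the value here is only illustrative) is
the recovery concurrency, set the same way:
ceph tell osd.* injectargs --osd_recovery_max_active=4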
Below is the information you were asking for. I think they are size=2,
min_size=1.
Dan
# ceph status
cluster 7bffce86-9d7b-4bdf-a9c9-67670e68ca77
health HEALTH_ERR
140 pgs are stuck inactive for more than 300 seconds
64 pgs backfill_wait
76 pgs
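The stuck and inactive PGs can also be listed individually, for example with:
# ceph health detail
# ceph pg dump_stuck inactive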
What are the outputs of some commands that show us the state of your
cluster? Most notable is `ceph status`, but `ceph osd tree` would also be
helpful. What are the sizes of the pools in your cluster? Are they all
size=3, min_size=2?
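For example, the per-pool replication settings can be checked with
`ceph osd pool ls detail`, or per pool (the pool name is a placeholder):
ceph osd pool get <pool> size
ceph osd pool get <pool> min_size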
On Fri, May 11, 2018 at 12:05 PM Daniel Davidson
Hello,
Today we had a node crash, and looking at it, it seems there is a
problem with the RAID controller, so it is not coming back up, maybe
ever. It corrupted the local filesystem for the ceph storage there.
The remainder of our storage (10.2.10) cluster is running, and it looks
to be