This looks fine and will recover on its own.
If you are not seeing enough client IO, that means your tuning of
recovery IO vs. client IO priority is off.
A simple and effective fix is to increase the osd_recovery_sleep_hdd
option (I think the default is 0.05 in Luminous and 0.1 since Mimic?)
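For example (a sketch; 0.1 is just a starting value to experiment with, not a recommendation for your cluster):

```shell
# Mimic and later: set it in the centralized config store (persists)
ceph config set osd osd_recovery_sleep_hdd 0.1

# Luminous: inject into the running OSDs (not persisted across restarts)
ceph tell osd.* injectargs '--osd_recovery_sleep_hdd 0.1'

# Verify the running value (run on the host of the given OSD)
ceph daemon osd.0 config get osd_recovery_sleep_hdd
```

Higher values make each OSD sleep longer between recovery ops, which slows recovery but leaves more IO headroom for clients.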
Hi Paul,
before the upgrade there were 17 OSD servers (8 OSDs per server), 3 mds/rgw, 2
active MDS. Then I added 5 OSD servers (16 OSDs per server), then one active
server crashed (and I rebooted it), and the MDS couldn't come back to health anymore.
So I added two new MDS servers, and deleted one of the original
Well, that's not a lot of information to troubleshoot such a problem.
Please post the output of the following commands:
* ceph -s
* ceph health detail
* ceph osd pool ls detail
* ceph osd tree
* ceph osd df tree
* ceph versions
And a description of what you did to upgrade it.
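One way to collect all of those outputs in one file for posting (a sketch; assumes the host has the client.admin keyring, and the filename ceph-report.txt is arbitrary):

```shell
# Run each diagnostic command and capture its output with a header.
for cmd in "ceph -s" \
           "ceph health detail" \
           "ceph osd pool ls detail" \
           "ceph osd tree" \
           "ceph osd df tree" \
           "ceph versions"; do
    echo "==== $cmd ===="
    $cmd
done > ceph-report.txt 2>&1
```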
Paul
2018-09-15
Hello there,
I have a Ceph cluster which recently grew from 400 TB to 900 TB. Now the
cluster is in an unhealthy state; there are about 1700+ PGs in an unclean state:
# ceph pg dump_stuck unclean|wc
ok
1696 10176 191648
CephFS can't work anymore; the read IO was no more than MB/s.
Is