Re: [ceph-users] [need your help] How to Fix unclean PG

2018-09-17 Thread Paul Emmerich
This looks fine and will recover on its own. If you are not seeing enough client IO, that means your tuning of recovery IO vs. client IO priority is off. A simple and effective fix is increasing the osd_recovery_sleep_hdd option (I think the default is 0.05 in Luminous and 0.1 since Mimic?).
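A minimal sketch of trying this at runtime, assuming root on a mon host (the value 0.2 is just an example to experiment with, and injectargs changes don't persist across OSD restarts):

# ceph tell osd.* injectargs '--osd_recovery_sleep_hdd 0.2'

and to verify the running value via the admin socket (run on the host carrying osd.0):

# ceph daemon osd.0 config get osd_recovery_sleep_hdd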

Re: [ceph-users] [need your help] How to Fix unclean PG

2018-09-15 Thread Frank Yu
Hi Paul, before the upgrade there were 17 OSD servers (8 OSDs per server), 3 MDS/RGW servers, and 2 active MDS. Then I added 5 OSD servers (16 OSDs per server). After that, one active MDS server crashed (and I rebooted it), but the MDS couldn't come back to healthy anymore. So I added two new MDS servers and deleted one of the original
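For anyone debugging a similar MDS failover situation, a quick sketch of the generic Luminous-era status commands (nothing cluster-specific assumed):

# ceph mds stat
# ceph fs status
# ceph health detail

ceph fs status shows which MDS ranks are active, which daemons are standby, and which filesystem each rank belongs to.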

Re: [ceph-users] [need your help] How to Fix unclean PG

2018-09-15 Thread Paul Emmerich
Well, that's not a lot of information to troubleshoot such a problem. Please post the output of the following commands:
* ceph -s
* ceph health detail
* ceph osd pool ls detail
* ceph osd tree
* ceph osd df tree
* ceph versions
And a description of what you did to upgrade it.
Paul
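If it helps, a small sketch to capture all of those outputs in one file for posting (the output path is arbitrary):

for c in 'ceph -s' 'ceph health detail' 'ceph osd pool ls detail' \
         'ceph osd tree' 'ceph osd df tree' 'ceph versions'; do
    echo "== $c =="; $c
done > /tmp/ceph-report.txt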

[ceph-users] [need your help] How to Fix unclean PG

2018-09-15 Thread Frank Yu
Hello there, I have a Ceph cluster which recently grew from 400 TB to 900 TB. Now the cluster is in an unhealthy state: there are about 1700+ PGs in unclean status.

# ceph pg dump_stuck unclean | wc
ok
1696 10176 191648

CephFS can't work anymore; the read IO is no more than MB/s. Is
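As a quick triage step, a sketch for breaking those stuck PGs down by state (column positions in the dump_stuck output can differ slightly between releases, so check the header line first):

# ceph pg dump_stuck unclean 2>/dev/null | awk 'NR>1 {print $2}' | sort | uniq -c | sort -rn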