Please try with: ceph pg repair <pgid>. Most of the time that will be enough!
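If it is many PGs, here is a minimal sketch to kick off repairs for all of them in one go (it parses the "ceph health detail" output in the format you pasted below; review it before running):

---cut here---
# issue a repair for every PG currently flagged inconsistent
for pg in $(ceph health detail | awk '$1 == "pg" && /inconsistent/ {print $2}'); do
    ceph pg repair "$pg"
done
---cut here---

One caveat: repair generally reconstructs the replicas from the primary copy, so if you suspect the primary itself holds the bad data, inspect the objects first as described in your link [1].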
Good luck!

> On 26 Sep 2016, at 22:44, Eugen Block <[email protected]> wrote:
>
> (Sorry, sometimes I use the wrong shortcuts too quickly)
>
> Hi experts,
>
> I need your help. I have a running cluster with 19 OSDs and 3 MONs. I created
> a separate LVM for /var/lib/ceph on one of the nodes. I stopped the mon
> service on that node, rsynced the content to the newly created LVM and
> restarted the monitor, but obviously I didn't do that correctly, as I'm stuck
> in ERROR state and can't repair the respective PGs.
> How would I do that correctly? I want to do the same on the remaining nodes,
> but without bringing the cluster to error state.
>
> One thing I already learned is to set the noout flag before stopping
> services, but what else is there to do to accomplish that?
>
> But now that it is in error state, how can I repair my cluster? The current
> status is:
>
> ---cut here---
> ceph@node01:~/ceph-deploy> ceph -s
>     cluster 655cb05a-435a-41ba-83d9-8549f7c36167
>      health HEALTH_ERR
>             16 pgs inconsistent
>             261 scrub errors
>      monmap e7: 3 mons at {mon1=192.168.160.15:6789/0,mon2=192.168.160.17:6789/0,mon3=192.168.160.16:6789/0}
>             election epoch 356, quorum 0,1,2 mon1,mon2,mon3
>      osdmap e3394: 19 osds: 19 up, 19 in
>       pgmap v7105355: 8432 pgs, 15 pools, 1003 GB data, 205 kobjects
>             2114 GB used, 6038 GB / 8153 GB avail
>                 8413 active+clean
>                   16 active+clean+inconsistent
>                    3 active+clean+scrubbing+deep
>   client io 0 B/s rd, 136 kB/s wr, 34 op/s
>
> ceph@ndesan01:~/ceph-deploy> ceph health detail
> HEALTH_ERR 16 pgs inconsistent; 261 scrub errors
> pg 1.ffa is active+clean+inconsistent, acting [16,5]
> pg 1.cc9 is active+clean+inconsistent, acting [5,18]
> pg 1.bb1 is active+clean+inconsistent, acting [15,5]
> pg 1.ac4 is active+clean+inconsistent, acting [0,5]
> pg 1.a46 is active+clean+inconsistent, acting [13,4]
> pg 1.a16 is active+clean+inconsistent, acting [5,18]
> pg 1.9e4 is active+clean+inconsistent, acting [13,9]
> pg 1.9b7 is active+clean+inconsistent, acting [5,6]
> pg 1.950 is active+clean+inconsistent, acting [0,9]
> pg 1.6db is active+clean+inconsistent, acting [15,5]
> pg 1.5f6 is active+clean+inconsistent, acting [17,5]
> pg 1.5c2 is active+clean+inconsistent, acting [8,4]
> pg 1.5bc is active+clean+inconsistent, acting [9,6]
> pg 1.505 is active+clean+inconsistent, acting [16,9]
> pg 1.3e6 is active+clean+inconsistent, acting [2,4]
> pg 1.32 is active+clean+inconsistent, acting [18,5]
> 261 scrub errors
> ---cut here---
>
> And the number of scrub errors is increasing, although I started with more
> than 400 scrub errors.
> What I have tried is to manually repair single PGs as described in [1]. But
> some of the broken PGs have no entries in the log file, so I don't have
> anything to look at.
> In case there is an object in one OSD but it is missing in the other, how do
> I get it copied back there? Everything I've tried so far didn't accomplish
> anything except temporarily decreasing the number of scrub errors, but they
> are increasing again, so no success at all.
>
> I'd be really grateful for your advice!
>
> Regards,
> Eugen
>
> [1] http://ceph.com/planet/ceph-manually-repair-object/
>
> --
> Eugen Block                             voice  : +49-40-559 51 75
> NDE Netzdesign und -entwicklung AG      fax    : +49-40-559 51 77
> Postfach 61 03 15
> D-22423 Hamburg                         e-mail : [email protected]
>
> Vorsitzende des Aufsichtsrates: Angelika Mozdzen
> Sitz und Registergericht: Hamburg, HRB 90934
> Vorstand: Jens-U. Mozdzen
> USt-IdNr. DE 814 013 983
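For the remaining nodes, a minimal sketch of the sequence I would expect to work (the mon id "mon1", the systemd unit names and the /mnt/newceph mount point are assumptions for illustration; adapt them to your setup):

---cut here---
ceph osd set noout                        # keep CRUSH from marking stopped OSDs out and rebalancing
systemctl stop ceph-mon@mon1              # plus any ceph-osd@<id> units running on that node
rsync -aHAX /var/lib/ceph/ /mnt/newceph/  # -a keeps ownership/permissions; -H/-A/-X hard links, ACLs, xattrs
# point /var/lib/ceph at the new LV (e.g. via /etc/fstab), remount, then restart:
umount /mnt/newceph
mount /var/lib/ceph
systemctl start ceph-mon@mon1
ceph osd unset noout                      # only once all daemons are back up and the cluster is healthy
---cut here---

The important parts are stopping all Ceph daemons on the node before copying, preserving ownership and xattrs during the rsync, and only unsetting noout after the cluster is healthy again.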
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
