Please try with: ceph pg repair <pgid>. Most of the time that will be enough!
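If it is many PGs, here is a minimal sketch to kick off repairs for all of them in one go (it parses the "ceph health detail" output in the format you pasted below; review it before running):

---cut here---
# issue a repair for every PG currently flagged inconsistent
for pg in $(ceph health detail | awk '$1 == "pg" && /inconsistent/ {print $2}'); do
    ceph pg repair "$pg"
done
---cut here---

One caveat: repair generally reconstructs the replicas from the primary copy, so if you suspect the primary itself holds the bad data, inspect the objects first as described in your link [1].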
Good luck!

> On 26 Sep 2016, at 22:44, Eugen Block <[email protected]> wrote:
>
> (Sorry, sometimes I use the wrong shortcuts too quickly)
>
> Hi experts,
>
> I need your help. I have a running cluster with 19 OSDs and 3 MONs. I created
> a separate LVM for /var/lib/ceph on one of the nodes. I stopped the mon
> service on that node, rsynced the content to the newly created LVM and
> restarted the monitor, but obviously I didn't do that correctly, as I'm stuck
> in ERROR state and can't repair the respective PGs.
> How would I do that correctly? I want to do the same on the remaining nodes,
> but without bringing the cluster to error state.
>
> One thing I already learned is to set the noout flag before stopping
> services, but what else is there to do to accomplish that?
>
> But now that it is in error state, how can I repair my cluster? The current
> status is:
>
> ---cut here---
> ceph@node01:~/ceph-deploy> ceph -s
>     cluster 655cb05a-435a-41ba-83d9-8549f7c36167
>      health HEALTH_ERR
>             16 pgs inconsistent
>             261 scrub errors
>      monmap e7: 3 mons at {mon1=192.168.160.15:6789/0,mon2=192.168.160.17:6789/0,mon3=192.168.160.16:6789/0}
>             election epoch 356, quorum 0,1,2 mon1,mon2,mon3
>      osdmap e3394: 19 osds: 19 up, 19 in
>       pgmap v7105355: 8432 pgs, 15 pools, 1003 GB data, 205 kobjects
>             2114 GB used, 6038 GB / 8153 GB avail
>                 8413 active+clean
>                   16 active+clean+inconsistent
>                    3 active+clean+scrubbing+deep
>   client io 0 B/s rd, 136 kB/s wr, 34 op/s
>
> ceph@ndesan01:~/ceph-deploy> ceph health detail
> HEALTH_ERR 16 pgs inconsistent; 261 scrub errors
> pg 1.ffa is active+clean+inconsistent, acting [16,5]
> pg 1.cc9 is active+clean+inconsistent, acting [5,18]
> pg 1.bb1 is active+clean+inconsistent, acting [15,5]
> pg 1.ac4 is active+clean+inconsistent, acting [0,5]
> pg 1.a46 is active+clean+inconsistent, acting [13,4]
> pg 1.a16 is active+clean+inconsistent, acting [5,18]
> pg 1.9e4 is active+clean+inconsistent, acting [13,9]
> pg 1.9b7 is active+clean+inconsistent, acting [5,6]
> pg 1.950 is active+clean+inconsistent, acting [0,9]
> pg 1.6db is active+clean+inconsistent, acting [15,5]
> pg 1.5f6 is active+clean+inconsistent, acting [17,5]
> pg 1.5c2 is active+clean+inconsistent, acting [8,4]
> pg 1.5bc is active+clean+inconsistent, acting [9,6]
> pg 1.505 is active+clean+inconsistent, acting [16,9]
> pg 1.3e6 is active+clean+inconsistent, acting [2,4]
> pg 1.32 is active+clean+inconsistent, acting [18,5]
> 261 scrub errors
> ---cut here---
>
> And the number of scrub errors is increasing, although I started with more
> than 400 scrub errors.
> What I have tried is to manually repair single PGs as described in [1]. But
> some of the broken PGs have no entries in the log file, so I don't have
> anything to look at.
> In case there is an object in one OSD but it is missing in the other, how do
> I get it copied back there? Everything I've tried so far didn't accomplish
> anything except temporarily decreasing the number of scrub errors, but they
> are increasing again, so no success at all.
>
> I'd be really grateful for your advice!
>
> Regards,
> Eugen
>
> [1] http://ceph.com/planet/ceph-manually-repair-object/
>
> --
> Eugen Block                             voice  : +49-40-559 51 75
> NDE Netzdesign und -entwicklung AG      fax    : +49-40-559 51 77
> Postfach 61 03 15
> D-22423 Hamburg                         e-mail : [email protected]
>
> Vorsitzende des Aufsichtsrates: Angelika Mozdzen
> Sitz und Registergericht: Hamburg, HRB 90934
> Vorstand: Jens-U. Mozdzen
> USt-IdNr. DE 814 013 983
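For the remaining nodes, a minimal sketch of the sequence I would expect to work (the mon id "mon1", the systemd unit names and the /mnt/newceph mount point are assumptions for illustration; adapt them to your setup):

---cut here---
ceph osd set noout                        # keep CRUSH from marking stopped OSDs out and rebalancing
systemctl stop ceph-mon@mon1              # plus any ceph-osd@<id> units running on that node
rsync -aHAX /var/lib/ceph/ /mnt/newceph/  # -a keeps ownership/permissions; -H/-A/-X hard links, ACLs, xattrs
# point /var/lib/ceph at the new LV (e.g. via /etc/fstab), remount, then restart:
umount /mnt/newceph
mount /var/lib/ceph
systemctl start ceph-mon@mon1
ceph osd unset noout                      # only once all daemons are back up and the cluster is healthy
---cut here---

The important parts are stopping all Ceph daemons on the node before copying, preserving ownership and xattrs during the rsync, and only unsetting noout after the cluster is healthy again.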
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
