Hi,

Can you share the output of `ceph osd tree`? Which CRUSH rules are in use in your cluster? I assume that the two failed OSDs prevent the remapping because the rules can't be applied.
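
Something like the following should be enough to see the failure domains and the pool settings (standard ceph CLI commands, nothing cluster-specific assumed):

  ceph osd tree
  ceph osd crush rule dump
  ceph osd pool ls detail

That should show how CRUSH places the PGs and what size/min_size the pool uses, which usually explains why PGs stay down after OSD failures.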


Regards,
Eugen


Quoting Philipp Schwaha <phil...@schwaha.net>:

hi,

I have a problem with a cluster being stuck in recovery after an OSD
failure. At first recovery was doing quite well, but now it just sits
there without any progress. It currently looks like this:

     health HEALTH_ERR
            36 pgs are stuck inactive for more than 300 seconds
            50 pgs backfill_wait
            52 pgs degraded
            36 pgs down
            36 pgs peering
            1 pgs recovering
            1 pgs recovery_wait
            36 pgs stuck inactive
            52 pgs stuck unclean
            52 pgs undersized
            recovery 261632/2235446 objects degraded (11.704%)
            recovery 259813/2235446 objects misplaced (11.622%)
            recovery 2/1117723 unfound (0.000%)
     monmap e3: 3 mons at
{0=192.168.19.13:6789/0,1=192.168.19.17:6789/0,2=192.168.19.23:6789/0}
            election epoch 78, quorum 0,1,2 0,1,2
     osdmap e7430: 6 osds: 4 up, 4 in; 88 remapped pgs
            flags sortbitwise
      pgmap v20023893: 256 pgs, 1 pools, 4366 GB data, 1091 kobjects
            8421 GB used, 10183 GB / 18629 GB avail
            261632/2235446 objects degraded (11.704%)
            259813/2235446 objects misplaced (11.622%)
            2/1117723 unfound (0.000%)
                 168 active+clean
                  50 active+undersized+degraded+remapped+wait_backfill
                  36 down+remapped+peering
                   1 active+recovering+undersized+degraded+remapped
                   1 active+recovery_wait+undersized+degraded+remapped

Is there any way to motivate it to resume recovery?

Thanks
        Philipp


