For some reason I didn't notice that number. But most likely you are hitting this or a similar bug: https://tracker.ceph.com/issues/21803
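For reference, the percentage is just the raw ratio ceph prints. Assuming your pools are 3x replicated, the 58572 denominator is 19524 * 3, i.e. your 19.52 k objects counted once per replica, so the 90094 degraded "objects" in the numerator exceed even the total number of copies in the cluster. Presumably that's the accounting bug rather than real data loss, which would fit it clearing up on its own. A quick sanity check of the arithmetic:

    $ python3 -c 'print(f"{90094/58572:.3%}")'
    153.818%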
On Wed, Mar 6, 2019, 17:30 Simon Ironside <sirons...@caffetine.org> wrote:
> That's the misplaced objects, no problem there. Degraded objects are at
> 153.818%.
>
> Simon
>
> On 06/03/2019 15:26, Darius Kasparavičius wrote:
> > Hi,
> >
> > there it's 1.2%, not 1200%.
> >
> > On Wed, Mar 6, 2019 at 4:36 PM Simon Ironside <sirons...@caffetine.org> wrote:
> >> Hi,
> >>
> >> I'm still seeing this issue during failure testing of a new Mimic 13.2.4
> >> cluster. To reproduce:
> >>
> >> - Working Mimic 13.2.4 cluster
> >> - Pull a disk
> >> - Wait for recovery to complete (i.e. back to HEALTH_OK)
> >> - Remove the OSD with `ceph osd crush remove`
> >> - See greater than 100% degraded objects while it recovers, as below
> >>
> >> It doesn't seem to do any harm; once recovery completes the cluster
> >> returns to HEALTH_OK. The only tracker entry I can find that seems to
> >> cover this behaviour is bug 21803, which is marked as resolved.
> >>
> >> Simon
> >>
> >>    cluster:
> >>      id:     MY ID
> >>      health: HEALTH_WARN
> >>              709/58572 objects misplaced (1.210%)
> >>              Degraded data redundancy: 90094/58572 objects degraded
> >>              (153.818%), 49 pgs degraded, 51 pgs undersized
> >>
> >>    services:
> >>      mon: 3 daemons, quorum san2-mon1,san2-mon2,san2-mon3
> >>      mgr: san2-mon1(active), standbys: san2-mon2, san2-mon3
> >>      osd: 52 osds: 52 up, 52 in; 84 remapped pgs
> >>
> >>    data:
> >>      pools:   16 pools, 2016 pgs
> >>      objects: 19.52 k objects, 72 GiB
> >>      usage:   7.8 TiB used, 473 TiB / 481 TiB avail
> >>      pgs:     90094/58572 objects degraded (153.818%)
> >>               709/58572 objects misplaced (1.210%)
> >>               1932 active+clean
> >>               47   active+recovery_wait+undersized+degraded+remapped
> >>               33   active+remapped+backfill_wait
> >>               2    active+recovering+undersized+remapped
> >>               1    active+recovery_wait+undersized+degraded
> >>               1    active+recovering+undersized+degraded+remapped
> >>
> >>    io:
> >>      client:   24 KiB/s wr, 0 op/s rd, 3 op/s wr
> >>      recovery: 0 B/s, 126 objects/s
> >>
> >> On 13/10/2017 18:53, David Zafman wrote:
> >>> I improved the code that computes degraded objects during
> >>> backfill/recovery. During my testing it wouldn't produce percentages
> >>> above 100%. I'll have to look at the code and verify that some
> >>> subsequent changes didn't break things.
> >>>
> >>> David
> >>>
> >>> On 10/13/17 9:55 AM, Florian Haas wrote:
> >>>>>>> Okay, in that case I've no idea. What was the timeline for the
> >>>>>>> recovery versus the rados bench and cleanup versus the degraded
> >>>>>>> object counts, then?
> >>>>>> 1. Jewel deployment with filestore.
> >>>>>> 2. Upgrade to Luminous (including mgr deployment and "ceph osd
> >>>>>>    require-osd-release luminous"), still on filestore.
> >>>>>> 3. rados bench with subsequent cleanup.
> >>>>>> 4. All OSDs up, all PGs active+clean.
> >>>>>> 5. Stop one OSD. Remove from CRUSH, auth list, OSD map.
> >>>>>> 6. Reinitialize OSD with bluestore.
> >>>>>> 7. Start OSD, commencing backfill.
> >>>>>> 8. Degraded objects above 100%.
> >>>>>>
> >>>>>> Please let me know if that information is useful. Thank you!
> >>>>> Hmm, that does leave me a little perplexed.
> >>>> Yeah exactly, me too. :)
> >>>>
> >>>>> David, do we maybe do something with degraded counts based on the
> >>>>> number of objects identified in pg logs? Or some other heuristic for
> >>>>> the number of objects that might be stale? That's the only way I can
> >>>>> think of to get these weird returning sets.
> >>>> One thing that just crossed my mind: would it make a difference
> >>>> whether or not the OSD is marked out during the window between it
> >>>> going down and being deleted from the crushmap/osdmap? I think it
> >>>> shouldn't (whether marked out or simply non-existent, it's not
> >>>> eligible to hold any data either way), but I'm not really sure about
> >>>> the mechanics of the internals here.
> >>>>
> >>>> Cheers,
> >>>> Florian
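For anyone wanting to replay Simon's reproduction steps from the shell, a minimal sketch (osd.12 is a placeholder id; substitute whichever OSD's disk you pulled):

    ceph status                    # confirm the cluster starts from HEALTH_OK
    # physically pull the disk, then wait until recovery completes
    ceph osd crush remove osd.12   # placeholder id: the OSD that just failed
    watch ceph -s                  # degraded objects can now read above 100%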
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com