Short reply from phone. Try the objectstore tool on the primary OSD and mark the PG as complete. ceph-objectstore-tool, that is: run it on the primary OSD for that PG.
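Something along these lines should do it (a rough, untested sketch; osd.0 and PG 1.28 are placeholders, and the exact flags can differ between Ceph releases):

# Stop the primary OSD for the stuck PG before touching its data store.
systemctl stop ceph-osd@0

# Mark the PG as complete on that OSD's object store.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
    --journal-path /var/lib/ceph/osd/ceph-0/journal \
    --pgid 1.28 --op mark-complete

# Bring the OSD back and let the PG peer again.
systemctl start ceph-osd@0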
Wido

> On 3 Sep 2016, at 19:47, Dan Jakubiec <[email protected]> wrote:
>
> I think we are zeroing in now on the root cause of the stuck incomplete PGs.
> Looks like the common factor for all our stuck PGs is that they are all showing
> the removed OSD 8 in their "down_osds_we_would_probe" list (from "ceph pg <id>
> query").
>
> For reference, I found a few archived threads of other people experiencing
> similar problems in the past:
>
> https://www.mail-archive.com/[email protected]/msg13985.html
> http://ceph-users.ceph.narkive.com/jJ2DyVw7/ceph-pgs-stuck-creating-after-running-force-create-pg
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2014-August/042338.html
>
> The general consensus from those threads is that as long as
> down_osds_we_would_probe is pointing to any OSD that can't be reached, those
> PGs will remain stuck incomplete and can't be cured by force_create_pg or
> even "ceph osd lost".
>
> Question: is there any command we can run to remove the old OSD from
> down_osds_we_would_probe?
>
> I did try to create a new "fake" OSD.8 today (just created the OSD, but
> didn't bring it all the way up), and I was able to finally run "ceph osd lost
> 8". It did not seem to have any impact.
>
> If there is no command to remove the old OSD, I think our next step will be
> to bring up a new/real/empty OSD.8 and see if that will clear the log jam.
> But it seems like there should be a tool to deal with this kind of thing?
>
> Thanks,
>
> -- Dan
>
>> On Sep 2, 2016, at 15:01, Dan Jakubiec <[email protected]> wrote:
>>
>> Re-packaging this question, which was buried in a larger, less specific
>> thread from a couple of days ago. Hoping this will be more useful here.
>>
>> We have been working on restoring our Ceph cluster after losing a large
>> number of OSDs. We have all PGs active now except for 80 PGs that are stuck
>> in the "incomplete" state. These PGs reference OSD.8, which we removed
>> two weeks ago due to corruption.
>>
>> We would like to abandon the "incomplete" PGs, as they are not restorable.
>> We have tried the following:
>>
>> Per the docs, we made sure min_size on the corresponding pools was set to 1.
>> This did not clear the condition.
>>
>> Ceph would not let us issue "ceph osd lost N", because OSD.8 had already
>> been removed from the cluster.
>>
>> We also tried "ceph pg force_create_pg X" on all the PGs. The 80 PGs moved
>> to "creating" for a few minutes but then all went back to "incomplete".
>>
>> How do we abandon these PGs to allow recovery to continue? Is there some
>> way to force individual PGs to be marked as "lost"?
>>
>> ====
>>
>> Some miscellaneous data below:
>>
>> djakubiec@dev:~$ ceph osd lost 8 --yes-i-really-mean-it
>> osd.8 is not down or doesn't exist
>>
>> djakubiec@dev:~$ ceph osd tree
>> ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
>> -1 58.19960 root default
>> -2  7.27489     host node24
>>  1  7.27489         osd.1        up  1.00000          1.00000
>> -3  7.27489     host node25
>>  2  7.27489         osd.2        up  1.00000          1.00000
>> -4  7.27489     host node26
>>  3  7.27489         osd.3        up  1.00000          1.00000
>> -5  7.27489     host node27
>>  4  7.27489         osd.4        up  1.00000          1.00000
>> -6  7.27489     host node28
>>  5  7.27489         osd.5        up  1.00000          1.00000
>> -7  7.27489     host node29
>>  6  7.27489         osd.6        up  1.00000          1.00000
>> -8  7.27539     host node30
>>  9  7.27539         osd.9        up  1.00000          1.00000
>> -9  7.27489     host node31
>>  7  7.27489         osd.7        up  1.00000          1.00000
>>
>> BUT, even though OSD 8 no longer exists, I still see lots of references to
>> OSD 8 in various ceph dumps and queries.
>>
>> Interestingly, we do still see weird entries in the CRUSH map (should I do
>> something about these?):
>>
>> # devices
>> device 0 device0
>> device 1 osd.1
>> device 2 osd.2
>> device 3 osd.3
>> device 4 osd.4
>> device 5 osd.5
>> device 6 osd.6
>> device 7 osd.7
>> device 8 device8
>> device 9 osd.9
>>
>> And for what it is worth, here is the ceph -s:
>>
>> cluster 10d47013-8c2a-40c1-9b4a-214770414234
>>  health HEALTH_ERR
>>         212 pgs are stuck inactive for more than 300 seconds
>>         93 pgs backfill_wait
>>         1 pgs backfilling
>>         101 pgs degraded
>>         63 pgs down
>>         80 pgs incomplete
>>         89 pgs inconsistent
>>         4 pgs recovery_wait
>>         1 pgs repair
>>         132 pgs stale
>>         80 pgs stuck inactive
>>         132 pgs stuck stale
>>         103 pgs stuck unclean
>>         97 pgs undersized
>>         2 requests are blocked > 32 sec
>>         recovery 4394354/46343776 objects degraded (9.482%)
>>         recovery 4025310/46343776 objects misplaced (8.686%)
>>         2157 scrub errors
>>         mds cluster is degraded
>>  monmap e1: 3 mons at {core=10.0.1.249:6789/0,db=10.0.1.251:6789/0,dev=10.0.1.250:6789/0}
>>         election epoch 266, quorum 0,1,2 core,dev,db
>>  fsmap e3627: 1/1/1 up {0=core=up:replay}
>>  osdmap e4293: 8 osds: 8 up, 8 in; 144 remapped pgs
>>         flags sortbitwise
>>  pgmap v1866639: 744 pgs, 10 pools, 7668 GB data, 20673 kobjects
>>        8339 GB used, 51257 GB / 59596 GB avail
>>        4394354/46343776 objects degraded (9.482%)
>>        4025310/46343776 objects misplaced (8.686%)
>>            362 active+clean
>>            112 stale+active+clean
>>             89 active+undersized+degraded+remapped+wait_backfill
>>             66 active+clean+inconsistent
>>             63 down+incomplete
>>             19 stale+active+clean+inconsistent
>>             17 incomplete
>>              5 active+undersized+degraded+remapped
>>              4 active+recovery_wait+degraded
>>              2 active+undersized+degraded+remapped+inconsistent+wait_backfill
>>              1 stale+active+clean+scrubbing+deep+inconsistent+repair
>>              1 active+remapped+inconsistent+wait_backfill
>>              1 active+clean+scrubbing+deep
>>              1 active+remapped+wait_backfill
>>              1 active+undersized+degraded+remapped+backfilling
>>
>> Thanks,
>>
>> -- Dan
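To double-check which vanished OSDs the incomplete PGs still want to probe, something like this should work (a rough sketch; it assumes jq is available, and the exact recovery_state layout can vary between Ceph releases):

# For every stuck-inactive PG in the "incomplete" state, print the OSDs it still wants to probe.
for pg in $(ceph pg dump_stuck inactive 2>/dev/null | awk '/incomplete/ {print $1}'); do
    echo "== $pg =="
    ceph pg "$pg" query | \
        jq '.recovery_state[] | select(.down_osds_we_would_probe != null) | .down_osds_we_would_probe'
done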
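Regarding the stray "device 8 device8" line in the CRUSH map: as far as I know that is just the placeholder Ceph leaves behind for a removed OSD ID and is harmless, but if you want to inspect or hand-edit the map, the usual round-trip is (sketch; the file names are placeholders):

ceph osd getcrushmap -o crush.bin      # grab the current CRUSH map
crushtool -d crush.bin -o crush.txt    # decompile it to plain text
# ... review/edit crush.txt ...
crushtool -c crush.txt -o crush.new    # recompile
ceph osd setcrushmap -i crush.new      # inject the edited map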
