I don't want to "rescue" any OSDs. I want to clean up the incomplete PGs so that Ceph proceeds with PG re-creation and makes those groups active again. In my case, which OSDs should I start with the "osd_find_best_info_ignore_history_les" option on? This is the relevant part of the query output from one of the groups to be cleared:

    "probing_osds": [
        "54(1)", "81(2)", "103(0)", "103(1)", "118(9)", "126(3)",
        "129(4)", "141(1)", "142(2)", "147(7)", "150(1)", "153(8)",
        "159(0)", "165(6)", "168(5)", "171(0)", "174(3)", "177(9)",
        "180(5)", "262(2)", "291(5)", "313(1)", "314(8)", "315(7)",
        "316(0)", "318(6)"
    ],
    "down_osds_we_would_probe": [4, 88, 91, 94, 112, 133]
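As an aside, each "osd(shard)" entry above combines an OSD id with an EC shard index, so the candidate OSD set can be extracted mechanically. A minimal sketch (the sample is abridged from the query output above, so the printed set covers only these entries):

```python
import json

# Abridged excerpt of "ceph pg <pgid> query" output from the message above.
query = json.loads("""
{
  "probing_osds": ["54(1)", "81(2)", "103(0)", "103(1)", "118(9)"],
  "down_osds_we_would_probe": [4, 88, 91, 94, 112, 133]
}
""")

# Split each "osd(shard)" string into an (osd_id, shard_index) pair,
# so it is visible which OSDs hold which EC shards.
pairs = [(int(e.split("(")[0]), int(e.strip(")").split("(")[1]))
         for e in query["probing_osds"]]

# The distinct OSD ids are the ones participating in probing.
osds = sorted({osd for osd, _ in pairs})
print(osds)                               # -> [54, 81, 103, 118]
print(query["down_osds_we_would_probe"])  # down OSDs blocking the probe
```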
Maks

On Tue, 28 Aug 2018 at 15:20, Paul Emmerich <[email protected]> wrote:
> I don't think it's documented.
>
> It won't affect PGs that are active+clean.
> It takes effect during peering; the easiest way is to set it in ceph.conf
> and restart the daemons on *all* OSDs that you want to rescue.
> It is important not to forget to unset it afterwards.
>
>
> Paul
>
> 2018-08-28 13:21 GMT+02:00 Maks Kowalik <[email protected]>:
> > Thank you for answering.
> > Where is this option documented?
> > Do I set it in the config file, using "tell osd.<number>", or via the
> > admin daemon?
> > Do I set it on the primary OSD of the up set, on all OSDs of the up set,
> > or maybe on all historical peers holding the shards of a particular
> > group?
> > Is this option dangerous to the other groups on those OSDs (currently an
> > OSD holds about 160 PGs)?
> >
> > Maks
> >
> > On Tue, 28 Aug 2018 at 12:12, Paul Emmerich <[email protected]>
> > wrote:
> >>
> >> No need to delete it; that situation should be mostly salvageable by
> >> setting osd_find_best_info_ignore_history_les temporarily on the
> >> affected OSDs.
> >> That should cause you to "just" lose some writes, resulting in
> >> inconsistent data.
> >>
> >>
> >> Paul
> >>
> >> 2018-08-28 11:08 GMT+02:00 Maks Kowalik <[email protected]>:
> >> > What is the correct procedure for re-creating an incomplete placement
> >> > group that belongs to an erasure-coded pool?
> >> > I'm facing a situation where too many shards of 3 PGs were lost
> >> > during OSD crashes. Taking the data loss was decided, but I can't
> >> > force Ceph to recreate those PGs. The query output shows:
> >> >     "peering_blocked_by_detail": [
> >> >         {"detail": "peering_blocked_by_history_les_bound"}
> >> > What was tried:
> >> > 1. manual deletion of all shards appearing in the "peers" section of
> >> >    the PG query output
> >> > 2. marking all shards as complete using ceph-objectstore-tool
> >> > 3. deleting peering history from the OSDs keeping the shards
> >>
> >> --
> >> Paul Emmerich
> >>
> >> Looking for help with your Ceph cluster? Contact us at https://croit.io
> >>
> >> croit GmbH
> >> Freseniusstr. 31h
> >> 81247 München
> >> www.croit.io
> >> Tel: +49 89 1896585 90
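For reference, the procedure Paul describes above can be sketched as a temporary ceph.conf fragment. This is an assumption-laden sketch, not something taken verbatim from this thread; which hosts and OSD ids it applies to depends on your cluster:

```ini
; /etc/ceph/ceph.conf on the hosts carrying the affected OSDs.
; TEMPORARY: remove this line and restart the OSDs again once the
; incomplete PGs have peered, as Paul warns above.
[osd]
osd_find_best_info_ignore_history_les = true
```

With the line in place, restart each affected OSD daemon (for example `systemctl restart ceph-osd@54` on a systemd-managed cluster), wait for the PGs to peer, then remove the line and restart once more so the flag is unset.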
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
