Thanks Sam for the quick response. Just want to make sure I understand it 
correctly:

If we have [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] and all of 1, 2, and 3 are down, 
the PG is still active since we are using 8 + 3. But once 4 also goes down, the 
PG cannot become active again, even if we bring 1, 2, and 3 back up, unless we 
also bring 4 back up. Is my understanding correct here?
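To capture the arithmetic behind this question, here is a minimal Python sketch. This is my own illustration, not Ceph code: `pg_active` is a hypothetical helper, and the assumption that `min_size` defaults to k (the number of data chunks) for EC pools follows from Sam's reply below.

```python
# Availability sketch for an EC PG with k = 8 data chunks and m = 3 coding
# chunks (11 shards total). The PG can serve I/O only while at least
# min_size shards of the acting set are up; with min_size == k, losing any
# m + 1 = 4 shards blocks the PG.

def pg_active(up_osds, acting_set, k=8, m=3, min_size=None):
    """Return True if enough shards of the acting set are available."""
    if min_size is None:
        min_size = k  # assumed default: min_size equals the data chunk count
    available = sum(1 for osd in acting_set if osd in up_osds)
    return available >= min_size

acting = list(range(1, 12))      # shards on osds [1..11], k + m = 11
up = set(acting) - {1, 2, 3}     # osds 1, 2, 3 down -> 8 shards remain
print(pg_active(up, acting))     # True: exactly k = 8 shards available

up -= {4}                        # a 4th osd down -> only 7 shards remain
print(pg_active(up, acting))     # False: below min_size, the PG blocks
```

Note this only models the shard count; the subtlety in the question is that writes accepted while running at exactly `min_size` shards make every member of that reduced acting set required for a later activation.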

Thanks,
Guang

----------------------------------------
> Date: Thu, 13 Nov 2014 09:06:27 -0800
> Subject: Re: PG down
> From: [email protected]
> To: [email protected]
> CC: [email protected]
>
> It looks like the acting set shrank to the minimum allowable size and
> went active with osd 8. From that point on, you needed every member of
> that acting set to come back up later in order to avoid losing writes.
> You can prevent this by setting min_size above the number of data chunks.
> -Sam
>
> On Thu, Nov 13, 2014 at 4:15 AM, GuangYang <[email protected]> wrote:
>> Hi Sam,
>> Yesterday one PG went down in our cluster and I am confused by the PG 
>> state. I am not sure whether it is a bug (or an issue that has already been 
>> fixed, as I see a couple of related fixes in giant); it would be nice if you 
>> could take a look.
>>
>> Here is what happened:
>>
>> We are using an EC pool with 8 data chunks and 3 coding chunks. Say the PG 
>> has up/acting set [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. One OSD in the set 
>> went down and came back up, which triggered PG recovery. However, during 
>> recovery the primary OSD crashed due to a corrupted file chunk, then 
>> another OSD became primary, started recovery, and crashed as well, and so 
>> on until 4 OSDs in the set were down and the PG was marked down.
>>
>> After that, we left the OSD with the corrupted data down and started all the 
>> other crashed OSDs. We expected the PG to become active, but it is still 
>> down, with the following query information:
>>
>> { "state": "down+remapped+inconsistent+peering",
>>   "epoch": 4469,
>>   "up": [
>>     377,
>>     107,
>>     328,
>>     263,
>>     395,
>>     467,
>>     352,
>>     475,
>>     333,
>>     37,
>>     380],
>>   "acting": [
>>     2147483647,
>>     107,
>>     328,
>>     263,
>>     395,
>>     2147483647,
>>     352,
>>     475,
>>     333,
>>     37,
>>     380],
>> ...
>> 377]}],
>>   "probing_osds": [
>>     "37(9)",
>>     "107(1)",
>>     "263(3)",
>>     "328(2)",
>>     "333(8)",
>>     "352(6)",
>>     "377(0)",
>>     "380(10)",
>>     "395(4)",
>>     "467(5)",
>>     "475(7)"],
>>   "blocked": "peering is blocked due to down osds",
>>   "down_osds_we_would_probe": [
>>     8],
>>   "peering_blocked_by": [
>>     { "osd": 8,
>>       "current_lost_at": 0,
>>       "comment": "starting or marking this osd lost may let us proceed"}]},
>>   { "name": "Started",
>>     "enter_time": "2014-11-12 10:12:23.067369"}],
>> }
>>
>> Here osd.8 is the one having corrupted data.
>>
>> The way we worked around the issue was to set norecover and start osd.8, 
>> get the PG active, remove the object (via rados), and then unset norecover, 
>> after which things became clean again. But the most confusing part is that 
>> even when only osd.8 was left down, the PG couldn't become active.
>>
>> We are using firefly v0.80.4.
>>
>> Thanks,
>> Guang