On 06 Mar 2014, at 01:36, Ivan Kelly <[email protected]> wrote:

> On Thu, Mar 06, 2014 at 08:44:18AM +0000, Rakesh R wrote:
>>>>> I already pointed this out. The admin should be aware of potential data loss, 
>>>>> so no confidence.
>> 
>> From the HDFS shared storage perspective, data loss is not acceptable.
> I agree. A manual tool doesn't really help (right now the admin just
> deletes the underreplicated node).
> 
> My thought on the case is that, even though there's nothing to
> recover after the first bookie goes down, we should replace the bookie
> in the ensemble, so that if another bookie in the ensemble changes, we
> don't lose quorum. Once quorum is lost, all bets are off.
> 

OK, this comment is not entirely clear to me. I thought in your example you had 
ensemble 3, quorum 2, and you had lost both B2 and B3. In that case, you 
already lost quorum. Not for L1, but at that point there are cases in which you 
don't know if you've lost a record. In the specific scenario you describe, we 
know there is no record 1 because there is no record 0, fine. But, if you had a 
record 0, then we wouldn't know if we lost a record and consequently the ledger 
is broken. We may be able to fix this particular case by simply (not) 
replicating what we have and declaring success, but it is not a general 
solution, I'm afraid.
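
To make the argument concrete, here is a minimal sketch. It assumes round-robin 
striping, where entry e is written to bookies (e + i) mod ensemble size for each 
i below the quorum size. The function names and the 0-based bookie indexes 
(B1 = 0, B2 = 1, B3 = 2) are hypothetical and are not BookKeeper API.

```python
def write_set(entry_id, ensemble_size, write_quorum):
    """Bookie indexes holding entry_id, assuming round-robin striping."""
    return [(entry_id + i) % ensemble_size for i in range(write_quorum)]


def undecidable_entries(max_entry_id, ensemble_size, write_quorum, lost):
    """Entry ids whose whole write set is on lost bookies.

    For these entries no surviving bookie can say whether the entry
    was ever written.
    """
    lost = set(lost)
    return [
        e
        for e in range(max_entry_id + 1)
        if set(write_set(e, ensemble_size, write_quorum)) <= lost
    ]


# Ensemble 3, quorum 2, B2 and B3 lost (indexes 1 and 2).
blind_spots = undecidable_entries(2, 3, 2, lost={1, 2})
```

Under these assumptions only entry 1 has its whole write set on lost bookies, 
while entry 0 still has a copy slot on B1. That is why the absence of record 0 
settles this particular case, and why the presence of record 0 would not.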

>> 
>> 
>>>> the postponing is already there, since the ledger couldn't be opened and 
>>>> fenced.
>> 
>> Yeah Sijie, you are right, it will be postponed to the next cycle. 
>> AFAIK the AutoRecovery feature will keep on trying to open it again and
>> again, and this cycle will never end. It is a kind of hang too.
> Actually, it's a little worse than that. The recovery worker will
> acquire the lock on the unreplicated node, try to open, release the
> lock, and repeat ad infinitum, without any pause between loops. This
> will create a lot of write traffic on ZooKeeper for the locks.


Ok, thanks for the clarification. Having an unbounded number of attempts is 
definitely not good. Independent of how we solve this problem, I was thinking 
about keeping track of the number of attempts.
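
A minimal sketch of what counting attempts could look like, with a pause between 
tries so the lock is not taken and released in a tight loop. Everything here is 
an assumption for illustration: `try_open` stands in for "acquire the lock, try 
to open the ledger, release the lock", and the names, the attempt limit, and the 
delays are hypothetical rather than existing AutoRecovery settings.

```python
import time


def backoff_delays(max_attempts, base_delay=1.0, max_delay=60.0):
    """Delay to wait after each failed attempt, doubling up to max_delay."""
    return [min(max_delay, base_delay * 2 ** n) for n in range(max_attempts)]


def recover_with_limit(try_open, max_attempts=5, sleep=time.sleep):
    """Call try_open() at most max_attempts times.

    Returns the number of attempts used on success, or None if every
    attempt failed and the ledger should be handed off (for example,
    flagged for an operator) instead of retried forever.
    """
    delays = backoff_delays(max_attempts)
    for attempt in range(1, max_attempts + 1):
        if try_open():
            return attempt
        if attempt < max_attempts:
            # Pause before retrying so locks are not churned back to back.
            sleep(delays[attempt - 1])
    return None
```

What to do once the limit is reached is left open here; the sketch only bounds 
the loop and spaces out the lock traffic.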

-Flavio


> 
> -Ivan
