Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "BookieRecoveryPage" page has been changed by FlavioJunqueira:
http://wiki.apache.org/hadoop/BookieRecoveryPage?action=diff&rev1=2&rev2=3

  = Bookie Recovery Design =
  
+ == Problem statement and trade-offs ==
  The essential idea of the bookie recovery feature is to enable an application 
to heal its bookie ensemble once a bookie has crashed. The bookie recovery 
task consists of reconstructing the ledger fragments that the crashed bookie 
stored, or should have stored had it not crashed.
  
- == Requirements ==
+ By design, a bookie can store fragments of multiple ledgers. To recover a 
bookie, we hence need to create new copies of each of the fragments that were 
present in the faulty bookie. There are two choices: we recover all the ledgers 
of the faulty bookie at once, or we recover one ledger at a time. To decide 
which one is more appropriate, we have to think about how such a recovery tool 
will be used. If applications are to run the tool themselves, then it is 
probably best to recover one ledger at a time or in small batches. If an 
operations team performs recovery on behalf of applications, then it will 
probably prefer to recover the whole set of ledgers of the faulty bookie.
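
The two granularities discussed above can be sketched with a small in-memory 
model. This is only an illustration, not BookKeeper code: the `placement` and 
`stores` dictionaries and the function names `recover_ledger` and 
`recover_bookie` are hypothetical, and the sketch assumes that every entry of 
the faulty bookie still has at least one surviving replica.

```python
# Hypothetical model (not the BookKeeper API):
#   placement: ledger id -> {entry id -> list of bookies holding that entry}
#   stores:    bookie name -> {(ledger id, entry id) -> payload}

def recover_ledger(placement, stores, faulty, target, ledger_id):
    """Re-create on `target` the entries of one ledger that `faulty` held.

    Returns the number of entries copied. Assumes each affected entry has
    at least one replica on a bookie other than `faulty`.
    """
    copied = 0
    for entry_id, holders in placement[ledger_id].items():
        if faulty not in holders:
            continue  # this entry was never stored on the faulty bookie
        # Read from any surviving replica; the faulty bookie is unreachable.
        source = next(b for b in holders if b != faulty)
        key = (ledger_id, entry_id)
        stores[target][key] = stores[source][key]
        # Record that `target` now takes the place of `faulty` for this entry.
        holders[holders.index(faulty)] = target
        copied += 1
    return copied


def recover_bookie(placement, stores, faulty, target):
    """Recover every ledger that had a fragment on `faulty`, one after another.

    Returns a dict mapping each affected ledger id to the entries copied.
    """
    affected = [
        ledger_id
        for ledger_id, entries in placement.items()
        if any(faulty in holders for holders in entries.values())
    ]
    return {
        ledger_id: recover_ledger(placement, stores, faulty, target, ledger_id)
        for ledger_id in affected
    }
```

In this sketch an application would call `recover_ledger` for the ledgers it 
owns, while an operations team would call `recover_bookie` once for the whole 
faulty bookie; the second is just a loop over the first.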
  
- == Design ==
+ Such a recovery tool can run either as a separate client or directly in a 
bookie. The advantage of implementing recovery on the client side is 
simplicity: we can just leverage the client implementation to read entries and 
write them to the new bookie. Performing such a task in a client, however, may 
lead to an inefficient utilization of network bandwidth, because every entry 
is first read by the client and then written again to the new bookie. For an 
efficient utilization of network bandwidth, it is best to copy entries directly 
from one bookie to another.
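
The bandwidth argument can be made concrete with a back-of-the-envelope 
calculation. The function below is a hypothetical helper, not part of any 
existing tool, and it assumes that the recovery client runs on a machine 
separate from both bookies, so that each read and each write crosses the 
network once.

```python
def bytes_on_network(entry_sizes, via_client):
    """Estimate the bytes transferred to re-replicate the given entries.

    via_client=True:  surviving bookie -> client -> new bookie (two transfers)
    via_client=False: surviving bookie -> new bookie (one transfer)
    Assumes the client is not co-located with either bookie.
    """
    transfers_per_entry = 2 if via_client else 1
    return transfers_per_entry * sum(entry_sizes)
```

Under these assumptions, recovering through a client moves twice as many 
bytes as a direct bookie-to-bookie copy of the same entries.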
  
+ 
+ == Design choices ==
+ 
+ TBD
+ 
