Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "BookieRecoveryPage" page has been changed by FlavioJunqueira:
http://wiki.apache.org/hadoop/BookieRecoveryPage?action=diff&rev1=2&rev2=3

  = Bookie Recovery Design =
+ == Problem statement and trade-offs ==
  The essential idea of the bookie recovery feature is to enable an application to heal its bookie ensemble once some bookie has crashed. The bookie recovery task is essentially to reconstruct the ledger fragments that the crashed bookie stored, or would have stored had it not crashed.

- == Requirements ==
+ By design, a bookie can store fragments of multiple ledgers. To recover a bookie, we hence need to create new copies of each of the fragments that were present in the faulty bookie. There are two choices: we recover all fragments of the faulty bookie at once, or we recover one ledger at a time. To decide which one is more appropriate, we have to think about how such a recovery tool will be used. If applications are to run the tool themselves, then it is probably best to recover one ledger at a time or in small batches. If an operations team performs recovery on behalf of applications, then it will probably prefer to recover all fragments of the faulty bookie in one pass.

- == Design ==
+ Such a recovery tool can run either as a separate client or directly in a bookie. The advantage of implementing recovery on the client side is simplicity: we can just leverage the client implementation to read entries and write them to the new bookie. Performing such a task in a client, however, may lead to inefficient utilization of network bandwidth, since every entry travels to the client and then on to the new bookie. For efficient utilization of network bandwidth, it is best to copy entries directly between bookies.
+
+ == Design choices ==
+
+ TBD
+
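
To make the two recovery granularities concrete, here is a minimal in-memory sketch of client-side recovery. It is only an illustration under stated assumptions, not the BookKeeper API: the `Ledger` class, the `store` dictionary standing in for bookie storage, and the functions `ledgers_on`, `recover_ledger`, and `recover_bookie` are all hypothetical names, and each ledger is simplified to a single fragment that is fully replicated on every bookie of its ensemble.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Ledger:
    """Toy ledger (assumption): every bookie in the ensemble holds all entries."""
    ledger_id: int
    ensemble: List[str]
    entries: List[bytes] = field(default_factory=list)


# Hypothetical stand-in for bookie storage: bookie name -> ledger id -> entries.
Store = Dict[str, Dict[int, List[bytes]]]


def ledgers_on(bookie: str, ledgers: List[Ledger]) -> List[Ledger]:
    """Return the ledgers that have a fragment on the given bookie."""
    return [ledger for ledger in ledgers if bookie in ledger.ensemble]


def recover_ledger(ledger: Ledger, failed: str, replacement: str, store: Store) -> int:
    """Recover one ledger: copy its entries to the replacement bookie.

    Returns the number of entries copied (0 if the ledger was not affected).
    """
    if failed not in ledger.ensemble:
        return 0
    # Client-side recovery: read the entries (here, from the toy ledger object)
    # and write them to the replacement bookie.
    store.setdefault(replacement, {})[ledger.ledger_id] = list(ledger.entries)
    # Swap the failed bookie for the replacement in the ensemble.
    ledger.ensemble = [replacement if b == failed else b for b in ledger.ensemble]
    return len(ledger.entries)


def recover_bookie(ledgers: List[Ledger], failed: str, replacement: str,
                   store: Store, batch_size: Optional[int] = None) -> int:
    """Recover every affected ledger, or only the first batch_size of them."""
    affected = ledgers_on(failed, ledgers)
    if batch_size is not None:
        affected = affected[:batch_size]
    return sum(recover_ledger(l, failed, replacement, store) for l in affected)
```

In this sketch an application would call `recover_ledger` for the ledgers it owns, or `recover_bookie` with a small `batch_size`, while an operations team would call `recover_bookie` without a batch limit. Every entry passes through the process running the tool, which is the bandwidth cost of the client-side option discussed above.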
