jvrao commented on a change in pull request #927: BP-24: BookieScanner: Enhance Data Integrity
URL: https://github.com/apache/bookkeeper/pull/927#discussion_r159816536
 
 

 ##########
 File path: site/bps/BP-24-BookieScanner.md
 ##########
 @@ -0,0 +1,92 @@
+---
+title: "BP-24: BookieScanner: Enhance Data Integrity"
+issue: https://github.com/apache/bookkeeper/<issue-number>
+state: "Under Discussion"
+release: "N/A"
+---
+
+
+### Motivation
+
+
+Currently a bookie cannot handle entry loss gracefully. AutoRecovery is 
restricted to the bookie level, which means it takes effect only after a 
bookie is down. However, when a disk fails, the ledger index files, the entry 
log files, or both could become corrupt. BookKeeper needs to provide 
mechanisms to identify and handle these problems.
+
+
+### Proposed Changes
+
+
+We introduce Bookie Scanner, which is a background task, to scan index files 
and entry log files to detect possible corruptions. Since data corruption may 
happen at any time on any block on any Bookie, it is important to identify 
these errors in a timely manner. This way, the bookie can remove/compact 
corrupted entries and re-replicate entries from other replicas, to maintain 
data integrity and reduce client errors. 
 
 Review comment:
   Please add/call out the performance scope of this work. This background 
scanner needs to yield to real customer workload. 
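
   To illustrate what "yield" could mean in practice, here is a minimal sketch of a scan throttle: it caps the scanner's read rate and backs off entirely while client requests are queued. `ScanThrottle`, its parameters, and the backoff constant are all hypothetical names chosen for illustration; none of them are BookKeeper APIs or part of the BP-24 proposal.

```java
/**
 * Hypothetical sketch only: not part of BookKeeper or of BP-24.
 * Decides how long a background scanner should pause so that it
 * stays under a read budget and gives way to client traffic.
 */
final class ScanThrottle {
    /** Assumed fixed pause (ms) while the bookie is busy with client requests. */
    static final long BUSY_BACKOFF_MILLIS = 1000L;

    private final long maxBytesPerSecond;  // assumed scan budget
    private final int busyThreshold;       // assumed pending-request limit

    ScanThrottle(long maxBytesPerSecond, int busyThreshold) {
        if (maxBytesPerSecond <= 0) {
            throw new IllegalArgumentException("maxBytesPerSecond must be positive");
        }
        if (busyThreshold < 0) {
            throw new IllegalArgumentException("busyThreshold must be non-negative");
        }
        this.maxBytesPerSecond = maxBytesPerSecond;
        this.busyThreshold = busyThreshold;
    }

    /**
     * Returns how many milliseconds the scanner should sleep after reading
     * bytesScanned bytes in elapsedMillis, given the current number of
     * pending client requests.
     */
    long pauseMillis(long bytesScanned, long elapsedMillis, int pendingClientRequests) {
        // Client workload takes priority: back off while requests are queued.
        if (pendingClientRequests > busyThreshold) {
            return BUSY_BACKOFF_MILLIS;
        }
        // Time the scanned bytes should have taken at the budgeted rate.
        long budgetedMillis = bytesScanned * 1000L / maxBytesPerSecond;
        // Sleep for the remainder, if the scan ran faster than the budget.
        return Math.max(0L, budgetedMillis - elapsedMillis);
    }
}
```

   A scanner loop would call `pauseMillis` after each batch and sleep for the returned duration before reading the next one. How pending client requests are measured is left open here; it would depend on which load signal the bookie can expose cheaply.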

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
