[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229993#comment-13229993
 ] 

Rakesh R commented on BOOKKEEPER-126:
-------------------------------------

Yes, I agree with you. It would be good if BookKeeper could handle
under-replication itself.

Here are a few thoughts that come to mind; please take a look.

*Proposal-1)* From what I have observed, apart from the 'bookie down' scenario
(which could be handled by automating the admin tool), a failure of either of
the following 'flush()' operations can lead to data loss. Since the flush is an
async operation, the client will be unaware of these failures, and further
entries will overwrite the data. So only these entries need to be considered
'under-replicated', and the under-replication action should be initiated for
them.
+Bookie.java+
{noformat}
try {
    ledgerCache.flushLedger(true);
} catch (IOException e) {
    LOG.error("Exception flushing Ledger", e);
    flushFailed = true;
}
try {
    entryLogger.flush();
} catch (IOException e) {
    LOG.error("Exception flushing entry logger", e);
    flushFailed = true;
}
{noformat}
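
To make Proposal-1 concrete, here is a minimal sketch of how the entries affected by a failed flush could be tracked. Everything in it (FlushFailureTracker, FlushStep, EntryKey, recordAdd, flushAll) is a hypothetical name invented for illustration and is not existing BookKeeper code; FlushStep merely stands in for calls such as ledgerCache.flushLedger(true) and entryLogger.flush().

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not BookKeeper code): remembers which entries were
// accepted since the last successful flush, so that a failed flush can mark
// exactly those entries as under-replicated.
class FlushFailureTracker {
    // Stand-in for ledgerCache.flushLedger(true) or entryLogger.flush().
    interface FlushStep {
        void flush() throws IOException;
    }

    // Identifies one entry by ledger id and entry id.
    static final class EntryKey {
        final long ledgerId;
        final long entryId;

        EntryKey(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        @Override
        public String toString() {
            return ledgerId + ":" + entryId;
        }
    }

    private final List<EntryKey> unflushed = new ArrayList<>();
    private final List<EntryKey> underReplicated = new ArrayList<>();

    // Called when an entry is accepted but not yet flushed to disk.
    synchronized void recordAdd(long ledgerId, long entryId) {
        unflushed.add(new EntryKey(ledgerId, entryId));
    }

    // Runs every flush step, mirroring the try/catch pattern in Bookie.java.
    // If any step fails, all entries pending at that point are marked
    // under-replicated. Returns true only if every step succeeded.
    synchronized boolean flushAll(FlushStep... steps) {
        boolean flushFailed = false;
        for (FlushStep step : steps) {
            try {
                step.flush();
            } catch (IOException e) {
                flushFailed = true;
            }
        }
        if (flushFailed) {
            underReplicated.addAll(unflushed);
        }
        // The pending set is reset either way; failed entries live on in
        // the under-replicated list.
        unflushed.clear();
        return !flushFailed;
    }

    // Snapshot of the entries that would need the under-replication action.
    synchronized List<EntryKey> underReplicated() {
        return new ArrayList<>(underReplicated);
    }
}
```

How the under-replicated list would then be published (for example through ZK) is left open here, since that part of the design is not settled.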

*Proposal-2)* Initiate recovery whenever the client finds a missing entry and
then successfully reads it from the next bookie.
This still leaves a window for data loss: some data may get lost or corrupted
and not be read again in the near future, so the problem goes unnoticed.
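
A rough sketch of the read path in Proposal-2 follows, assuming the client tries the bookies of an ensemble in order. FallbackReader, BookieStore and RecoveryListener are hypothetical names used only for illustration; they are not the BookKeeper client API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch (not the BookKeeper client API): read an entry from
// the first bookie that has it and report every bookie that was missing it.
class FallbackReader {
    // Stand-in for reading one entry from one bookie.
    interface BookieStore {
        Optional<byte[]> read(long ledgerId, long entryId);
    }

    // Stand-in for whatever would kick off recovery of the missing replica.
    interface RecoveryListener {
        void onMissing(int bookieIndex, long ledgerId, long entryId);
    }

    private final List<BookieStore> ensemble;
    private final RecoveryListener listener;

    FallbackReader(List<BookieStore> ensemble, RecoveryListener listener) {
        this.ensemble = ensemble;
        this.listener = listener;
    }

    Optional<byte[]> read(long ledgerId, long entryId) {
        List<Integer> missing = new ArrayList<>();
        for (int i = 0; i < ensemble.size(); i++) {
            Optional<byte[]> data = ensemble.get(i).read(ledgerId, entryId);
            if (data.isPresent()) {
                // A later bookie has the entry, so recovery of the earlier
                // replicas is possible: report each bookie that missed it.
                for (int bookieIndex : missing) {
                    listener.onMissing(bookieIndex, ledgerId, entryId);
                }
                return data;
            }
            missing.add(i);
        }
        // No replica returned the entry; there is nothing to recover from.
        return Optional.empty();
    }
}
```

Note that recovery is only triggered as a side effect of a read, which is exactly the gap described above: an entry that is never read is never checked.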

*Proposal-3)* A daemon thread can be associated with every bookie to
periodically scan its own ledgers and their entries. If it finds any errors, it
can contact ZK and try to initiate replication of those entries.
In this case, a mechanism for communication between bookies needs to be built;
as far as I understand, no inter-bookie protocol exists. Also, the cost of
scanning will be very high if there are many ledgers/entries :-)
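
The scanner in Proposal-3 could look roughly like the sketch below. It assumes, purely for illustration, that each stored entry carries a CRC32 checksum to verify against; LedgerScanner, StoredEntry and the reporter callback (which stands in for contacting ZK) are hypothetical names, not existing BookKeeper code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.zip.CRC32;

// Hypothetical sketch (not BookKeeper code): a per-bookie scanner that
// periodically verifies every locally stored entry against a checksum.
class LedgerScanner {
    // Assumption: each entry is stored together with its expected CRC32.
    static final class StoredEntry {
        final byte[] data;
        final long expectedCrc;

        StoredEntry(byte[] data, long expectedCrc) {
            this.data = data;
            this.expectedCrc = expectedCrc;
        }
    }

    static long crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data);
        return c.getValue();
    }

    // Key is "ledgerId:entryId"; stands in for the bookie's local storage.
    private final Map<String, StoredEntry> entries = new ConcurrentHashMap<>();
    // Stand-in for contacting ZK to request re-replication.
    private final Consumer<String> reporter;

    LedgerScanner(Consumer<String> reporter) {
        this.reporter = reporter;
    }

    void put(String key, byte[] data, long expectedCrc) {
        entries.put(key, new StoredEntry(data, expectedCrc));
    }

    // One full pass over all entries. The cost grows linearly with the
    // number of entries, which is the scanning cost concern.
    List<String> scanOnce() {
        List<String> corrupt = new ArrayList<>();
        for (Map.Entry<String, StoredEntry> e : entries.entrySet()) {
            StoredEntry stored = e.getValue();
            if (crc(stored.data) != stored.expectedCrc) {
                corrupt.add(e.getKey());
                reporter.accept(e.getKey());
            }
        }
        return corrupt;
    }

    // Schedules scanOnce() on a single daemon thread with a fixed delay.
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "ledger-scanner");
            t.setDaemon(true);
            return t;
        });
        ses.scheduleWithFixedDelay(this::scanOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return ses;
    }
}
```

The scan period is the main tuning knob in such a design: a longer period lowers the scanning cost but lengthens the time a corruption can go undetected.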

-Rakesh
                
> EntryLogger doesn't detect when one of it's logfiles is corrupt
> ---------------------------------------------------------------
>
>                 Key: BOOKKEEPER-126
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-126
>             Project: Bookkeeper
>          Issue Type: Bug
>            Reporter: Ivan Kelly
>            Priority: Blocker
>             Fix For: 4.1.0
>
>
> If an entry log is corrupt, the bookie will ignore any entries past the 
> corruption. Quorum writes stops this being a problem at the moment, but we 
> should detect corruptions like this and rereplicate if necessary.

