[ https://issues.apache.org/jira/browse/HBASE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450331#comment-13450331 ]
stack commented on HBASE-6719:
------------------------------
It's similar, I'd say, but we can't fix it the same way, right Terry?
Looking at your patch:
What happens when we return false? We'll retry? That seems right.
Please make your formatting suit the rest of the file (spaces after 'if' and
before opening bracket, etc.)
Should this be a LOG.fatal? Fatal implies shutdown. Should it be WARN if we
are going to go around again on this file?
You log the same message three times, though the circumstances are different
each time. Change each log message to suit its context?
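
To make the points above concrete, here is a minimal sketch of one way the
retry decision could look. It is not the attached patch and not the actual
ReplicationSource code. The class OpenRetrySketch, the method onOpenFailure,
the Action enum and the fileStillExists flag are all hypothetical names, and
an in-memory list stands in for the real logger. It assumes we have some way
to tell whether the file is really gone, which the existing TODO says is
still missing.

```java
import java.util.ArrayList;
import java.util.List;

public class OpenRetrySketch {
    /** What the caller should do after a failed open. Hypothetical, not an HBase type. */
    enum Action { RETRY, SKIP_FILE }

    /** Stand-in for a real logger so the sketch has no dependencies. */
    static final List<String> LOG = new ArrayList<>();

    /**
     * Decide what to do after opening a log file failed.
     *
     * @param sleepMultiplier      current retry multiplier
     * @param maxRetriesMultiplier retry limit (the issue mentions a default of 10)
     * @param fileStillExists      assumption: the caller can check whether the file is really gone
     */
    static Action onOpenFailure(int sleepMultiplier, int maxRetriesMultiplier,
                                boolean fileStillExists) {
        if (sleepMultiplier < maxRetriesMultiplier) {
            // Below the limit: a plain retry, so WARN rather than FATAL.
            LOG.add("WARN: open failed, will retry (multiplier " + sleepMultiplier
                + " of " + maxRetriesMultiplier + ")");
            return Action.RETRY;
        }
        if (fileStillExists) {
            // Limit reached but the file is still there: keep retrying instead of skipping it.
            LOG.add("WARN: retry limit reached but file still exists, retrying again");
            return Action.RETRY;
        }
        // Limit reached and the file is really gone: only now move on to the next log.
        LOG.add("WARN: retry limit reached and file is gone, skipping to next log");
        return Action.SKIP_FILE;
    }

    public static void main(String[] args) {
        System.out.println(onOpenFailure(3, 10, true));
        System.out.println(onOpenFailure(10, 10, true));
        System.out.println(onOpenFailure(10, 10, false));
        for (String line : LOG) {
            System.out.println(line);
        }
    }
}
```

Each branch logs its own message at WARN, and the file is skipped only when
it is known to be gone, so a temporary HDFS problem leads to another retry
rather than to lost edits.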
> [replication] Data will lose if open a Hlog failed more than
> maxRetriesMultiplier
> ---------------------------------------------------------------------------------
>
> Key: HBASE-6719
> URL: https://issues.apache.org/jira/browse/HBASE-6719
> Project: HBase
> Issue Type: Bug
> Components: replication
> Affects Versions: 0.94.1
> Reporter: terry zhang
> Assignee: terry zhang
> Priority: Critical
> Fix For: 0.94.2
>
> Attachments: hbase-6719.patch
>
>
> Please take a look at the code below:
> {code:title=ReplicationSource.java|borderStyle=solid}
> protected boolean openReader(int sleepMultiplier) {
>   ...
>   catch (IOException ioe) {
>     LOG.warn(peerClusterZnode + " Got: ", ioe);
>     // TODO Need a better way to determinate if a file is really gone but
>     // TODO without scanning all logs dir
>     if (sleepMultiplier == this.maxRetriesMultiplier) {
>       LOG.warn("Waited too long for this file, considering dumping");
>       // Opening the file failed more than maxRetriesMultiplier (default 10) times
>       return !processEndOfFile();
>     }
>   }
>   return true;
>   ...
> }
> protected boolean processEndOfFile() {
>   if (this.queue.size() != 0) {
>     // This HLog is skipped: data loss
>     this.currentPath = null;
>     this.position = 0;
>     return true;
>   } else if (this.queueRecovered) {
>     // The failover replication source thread terminates: data loss
>     this.manager.closeRecoveredQueue(this);
>     LOG.info("Finished recovering the queue");
>     this.running = false;
>     return true;
>   }
>   return false;
> }
> {code}
> Sometimes HDFS runs into a problem even though the HLog file itself is fine.
> After HDFS comes back, some data is lost and cannot be found in the slave cluster.