[
https://issues.apache.org/jira/browse/HBASE-6719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars Hofhansl updated HBASE-6719:
---------------------------------
Attachment: 6719.txt
Can we rewrite the patch this way?
One concern I have: What if the file is actually gone for some reason? In that
case it seems we'd never stop retrying.
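One way to bound that case, sketched under assumptions rather than taken from either attached patch: once the retries are exhausted, ask the file system whether the file still exists, and skip it only on a confirmed "no". `OpenRetryPolicy`, `Decision` and `FileProbe` are hypothetical names for illustration and are not part of HBase. In practice the probe could be backed by Hadoop's `FileSystem.exists(Path)`. How it should decide that a log is really gone without scanning all log directories is the open TODO in the code quoted below.

```java
import java.io.IOException;

// Hypothetical helper (not HBase code): decides what a replication source
// should do after a failed attempt to open a log file.
final class OpenRetryPolicy {

    enum Decision { RETRY, SKIP_FILE }

    // Hypothetical probe; an implementation might wrap FileSystem.exists(Path).
    interface FileProbe {
        boolean exists() throws IOException;
    }

    static Decision afterFailedOpen(int sleepMultiplier,
                                    int maxRetriesMultiplier,
                                    FileProbe probe) {
        if (sleepMultiplier < maxRetriesMultiplier) {
            // Still inside the normal back-off window.
            return Decision.RETRY;
        }
        try {
            // Retries exhausted: give up only if the file is confirmed missing.
            return probe.exists() ? Decision.RETRY : Decision.SKIP_FILE;
        } catch (IOException e) {
            // The file system itself is unreachable, so nothing can be
            // concluded about the file; keep it in the queue.
            return Decision.RETRY;
        }
    }

    public static void main(String[] args) {
        // File confirmed gone after the last retry.
        System.out.println(afterFailedOpen(10, 10, () -> false));
        // File still there, only the open failed.
        System.out.println(afterFailedOpen(10, 10, () -> true));
        // File system down: existence unknown.
        System.out.println(afterFailedOpen(10, 10, () -> {
            throw new IOException("HDFS unavailable");
        }));
    }
}
```

With this shape, a transient HDFS outage keeps the log queued, while a file that is verifiably missing still lets the source move on instead of retrying forever.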
@J-D: what do you think?
> [replication] Data will lose if open a Hlog failed more than
> maxRetriesMultiplier
> ---------------------------------------------------------------------------------
>
> Key: HBASE-6719
> URL: https://issues.apache.org/jira/browse/HBASE-6719
> Project: HBase
> Issue Type: Bug
> Components: replication
> Affects Versions: 0.94.1
> Reporter: terry zhang
> Assignee: terry zhang
> Priority: Critical
> Fix For: 0.94.3
>
> Attachments: 6719.txt, hbase-6719.patch
>
>
> Please take a look at the code below:
> {code:title=ReplicationSource.java|borderStyle=solid}
> protected boolean openReader(int sleepMultiplier) {
>   ...
>   catch (IOException ioe) {
>     LOG.warn(peerClusterZnode + " Got: ", ioe);
>     // TODO Need a better way to determinate if a file is really gone but
>     // TODO without scanning all logs dir
>     if (sleepMultiplier == this.maxRetriesMultiplier) {
>       LOG.warn("Waited too long for this file, considering dumping");
>       // Opening the file has failed past maxRetriesMultiplier (default 10)
>       return !processEndOfFile();
>     }
>   }
>   return true;
>   ...
> }
> protected boolean processEndOfFile() {
>   if (this.queue.size() != 0) { // This HLog is skipped: data loss
>     this.currentPath = null;
>     this.position = 0;
>     return true;
>   } else if (this.queueRecovered) {
>     // The failover replication source thread terminates: data loss
>     this.manager.closeRecoveredQueue(this);
>     LOG.info("Finished recovering the queue");
>     this.running = false;
>     return true;
>   }
>   return false;
> }
> {code}
> Sometimes HDFS runs into a problem while the HLog file itself is actually fine.
> Once HDFS is back, the skipped data is lost and cannot be recovered on the
> slave cluster.