[
https://issues.apache.org/jira/browse/HBASE-6401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416241#comment-13416241
]
nkeywal commented on HBASE-6401:
--------------------------------
Yeah, I wrote it for hadoop, but then saw it was fixed on their trunk, so I
didn't create it. But I haven't found a related jira. We may want to fix it
on 1.0.3 by adding the ordering I mentioned on the dev list (put the DN on
the same box as the RS last in the list of locations, to make sure we don't
use a dead DN).
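
A rough sketch of that ordering, for illustration only. It assumes block
locations can be treated as plain host names; ReplicaOrdering and moveHostLast
are hypothetical names, not existing HBase or HDFS APIs, and a real fix would
have to work on the client's actual location objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase or HDFS.
final class ReplicaOrdering {
    private ReplicaOrdering() {}

    /**
     * Returns a copy of replicaHosts in which every entry equal to
     * suspectHost (e.g. the box of the crashed RS) is moved to the end.
     * The relative order of the other hosts is preserved.
     */
    static List<String> moveHostLast(List<String> replicaHosts, String suspectHost) {
        List<String> preferred = new ArrayList<>();
        List<String> deferred = new ArrayList<>();
        for (String host : replicaHosts) {
            if (host.equals(suspectHost)) {
                deferred.add(host);   // likely dead: try it only as a last resort
            } else {
                preferred.add(host);  // keep the original order for the others
            }
        }
        preferred.addAll(deferred);
        return preferred;
    }
}
```

The idea is that a reader walking the list in order contacts the DN that
shared a box with the crashed RS only after every other replica.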
> HBase may lose edits after a crash if used with HDFS 1.0.3 or older
> -------------------------------------------------------------------
>
> Key: HBASE-6401
> URL: https://issues.apache.org/jira/browse/HBASE-6401
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.96.0
> Environment: all
> Reporter: nkeywal
> Priority: Critical
> Attachments: TestReadAppendWithDeadDN.java
>
>
> This comes from an hdfs bug, fixed in some hdfs versions. I haven't found the
> hdfs jira for this.
> Context: the HBase Write Ahead Log, which uses hdfs append. If the node
> crashes, the file it was writing is read by other processes to replay the
> edits.
> - So we have in hdfs one (dead) process writing with another process reading.
> - But, despite the call to syncFs, we don't always see the data when we have
> a dead node. It seems to be because the call in DFSClient#updateBlockInfo
> ignores the ipc errors and sets the length to 0.
> - So we may miss all the writes to the last block if we try to connect to the
> dead DN.
> hdfs 1.0.3, branch-1 or branch-1-win: we have the issue
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java?revision=1359853&view=markup
> hdfs branch-2 or trunk: we should not have the issue (but not tested)
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java?view=markup
> The attached test will fail ~50% of the time.
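
To make the failure mode in the description concrete, here is a simplified
model. It is not the actual DFSClient code: Replica, lengthIgnoringErrors and
lengthWithFallback are hypothetical names, and the fallback variant is only one
possible alternative, not necessarily what branch-2 does.

```java
import java.io.IOException;
import java.util.List;

// Simplified model of the behavior described above; not Hadoop code.
final class VisibleLengthSketch {
    private VisibleLengthSketch() {}

    /** Hypothetical stand-in for the RPC asking a datanode for a replica's length. */
    interface Replica {
        long reportLength() throws IOException;
    }

    /** Models the reported behavior: the error is swallowed and the length becomes 0. */
    static long lengthIgnoringErrors(Replica first) {
        try {
            return first.reportLength();
        } catch (IOException e) {
            return 0L; // the reader now believes the last block is empty
        }
    }

    /** One possible alternative: try each replica in turn, fail only if none answers. */
    static long lengthWithFallback(List<Replica> replicas) throws IOException {
        IOException last = null;
        for (Replica replica : replicas) {
            try {
                return replica.reportLength();
            } catch (IOException e) {
                last = e; // remember the failure and move on to the next replica
            }
        }
        throw new IOException("no replica reported a length", last);
    }
}
```

With a dead replica first in the list, the first method returns 0 and the
writes to the last block are silently skipped; the second either returns the
length from a live replica or surfaces the error to the caller.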