[ https://issues.apache.org/jira/browse/HDFS-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron T. Myers resolved HDFS-2738.
----------------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed

Thanks a lot for the review, Todd. I've just committed this.

> FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2738
>                 URL: https://issues.apache.org/jira/browse/HDFS-2738
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Aaron T. Myers
>            Priority: Blocker
>         Attachments: HDFS-2738-HDFS-1623.patch, HDFS-2738-HDFS-1623.patch, HDFS-2738-HDFS-1623.patch
>
>
> The new code in HDFS-1580 is causing an issue with selectInputStreams in the
> HA context. When the active is writing to the shared edits,
> selectInputStreams is called on the standby. This ends up calling
> {{journalSet.getInputStream}} but doesn't pass the {{inProgressOk=false}}
> flag. So, {{getInputStream}} ends up reading and validating the in-progress
> stream unnecessarily. Since the validation results are no longer properly
> cached, {{findMaxTransaction}} also re-validates the in-progress stream, and
> then breaks the corruption check in this code. The end result is a lot of
> errors like:
> 2011-12-30 16:45:02,521 ERROR namenode.FileJournalManager (FileJournalManager.java:getNumberOfTransactions(266)) - Gap in transactions, max txnid is 579, 0 txns from 578
> 2011-12-30 16:45:02,521 INFO ha.EditLogTailer (EditLogTailer.java:run(163)) - Got error, will try again.
> java.io.IOException: No non-corrupt logs for txid 578
>     at org.apache.hadoop.hdfs.server.namenode.JournalSet.getInputStream(JournalSet.java:229)
>     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1081)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:115)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$0(EditLogTailer.java:100)
>     at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:154)
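
For readers unfamiliar with the flag discussed in the description, the following is a minimal sketch of what honoring {{inProgressOk=false}} could look like. It is not the committed patch and does not use Hadoop classes: the Segment type, the selectSegments helper, and the transaction IDs are all hypothetical, chosen only to illustrate skipping in-progress segments before any reading or validation happens.

```java
import java.util.ArrayList;
import java.util.List;

public class InProgressFilterSketch {

    /** Hypothetical stand-in for an edit log segment; not a Hadoop class. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;       // only meaningful once the segment is finalized
        final boolean inProgress;  // true while the active node is still writing it

        Segment(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    /**
     * Hypothetical helper: returns the segments that may contain transactions
     * at or after fromTxId. When inProgressOk is false, in-progress segments
     * are skipped up front, so they are never opened or validated.
     */
    static List<Segment> selectSegments(List<Segment> all, long fromTxId,
                                        boolean inProgressOk) {
        List<Segment> selected = new ArrayList<>();
        for (Segment s : all) {
            if (s.inProgress && !inProgressOk) {
                continue; // caller asked for finalized segments only
            }
            if (!s.inProgress && s.lastTxId < fromTxId) {
                continue; // finalized segment ends before the requested txid
            }
            selected.add(s);
        }
        return selected;
    }

    public static void main(String[] args) {
        // Illustrative values only: one finalized segment, one in-progress segment.
        List<Segment> segments = new ArrayList<>();
        segments.add(new Segment(1, 577, false));
        segments.add(new Segment(578, -1, true));

        // Standby-style call: nothing finalized at txid 578, so nothing is selected.
        System.out.println(selectSegments(segments, 578, false).size()); // prints 0
        // With in-progress segments allowed, the open segment is selected.
        System.out.println(selectSegments(segments, 578, true).size());  // prints 1
    }
}
```

In this sketch an empty result simply means there is nothing finalized to read yet, which a tailing caller could treat as "try again later" rather than as a corrupt or missing log.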