Todd Lipcon updated HDFS-3605:
------------------------------

    Attachment: hdfs-3605.txt

Hey Uma. I took your unit test (thanks) and modified it to be minimal and to remove the sleeps. I then prepared a patch with a slightly different approach: I now use a boolean inside BlockManager to determine whether to do the block postponement. I think this is a bit simpler and it still fixes the issue. Am I missing another case with this fix?

The optimization you did might be useful, but per the above I think we can keep this change minimal and optimize separately; I don't think the optimization is required for the bugfix.

This patch isn't quite final - I still want to add a few javadocs, etc.

> Block mistakenly marked corrupt during edit log catchup phase of failover
> -------------------------------------------------------------------------
>
>                 Key: HDFS-3605
>                 URL: https://issues.apache.org/jira/browse/HDFS-3605
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, name-node
>    Affects Versions: 2.0.0-alpha, 2.0.1-alpha
>            Reporter: Brahma Reddy Battula
>            Assignee: Todd Lipcon
>         Attachments: HDFS-3605.patch, TestAppendBlockMiss.java, hdfs-3605.txt
>
> Open a file for append.
> Write data and sync.
> After the next log roll and edit log tailing on the standby NN, close the append stream.
> Call append on the same file multiple times before the next edit log roll.
> Now abruptly kill the current active namenode.
> At this point the block is missing. This appears to be because all of the latest block updates were queued on the standby namenode; during failover, processing the first OP_CLOSE drained the pending queue and marked the block as corrupt.
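For readers trying to reproduce the scenario, the following is a hedged sketch of the steps quoted above using the standard MiniDFSCluster HA test utilities. It is not the attached TestAppendBlockMiss.java; the class name, data sizes, and number of appends are illustrative assumptions.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.MiniDFSNNTopology;
import org.apache.hadoop.hdfs.server.namenode.ha.HATestUtil;

public class AppendDuringCatchupRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .nnTopology(MiniDFSNNTopology.simpleHATopology())
        .numDataNodes(3)
        .build();
    try {
      cluster.waitActive();
      cluster.transitionToActive(0);
      FileSystem fs = HATestUtil.configureFailoverFs(cluster, conf);
      Path file = new Path("/test-append");

      // Open the file, write some data and sync it.
      FSDataOutputStream out = fs.create(file);
      out.write(new byte[512]);
      out.hsync();

      // Roll the active NN's edits and let the standby tail them,
      // then close the original stream.
      HATestUtil.waitForStandbyToCatchUp(cluster.getNameNode(0),
          cluster.getNameNode(1));
      out.close();

      // Append several times before the next edit log roll.
      for (int i = 0; i < 3; i++) {
        FSDataOutputStream appendOut = fs.append(file);
        appendOut.write(new byte[512]);
        appendOut.close();
      }

      // Abruptly kill the active NN and fail over to the standby.
      cluster.shutdownNameNode(0);
      cluster.transitionToActive(1);

      // Without the fix, the last block can be marked corrupt during catchup
      // and the data goes missing; with the fix the length should be intact.
      long len = fs.getFileStatus(file).getLen();
      System.out.println("File length after failover: " + len);
    } finally {
      cluster.shutdown();
    }
  }
}
{code}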
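And here is a minimal, self-contained sketch of the boolean-flag idea described in the comment above: a flag inside BlockManager decides whether replicas reported with a newer generation stamp are queued for later re-processing instead of being marked corrupt immediately. All names here (shouldPostponeBlocksFromFuture, ReportedBlock, queuedReports) are illustrative assumptions, not necessarily the names used in the attached patch.

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

class BlockManagerSketch {

  static final class ReportedBlock {
    final long blockId;
    final long genStamp;
    ReportedBlock(long blockId, long genStamp) {
      this.blockId = blockId;
      this.genStamp = genStamp;
    }
  }

  // True while the standby NN is still catching up on edits; flipped off
  // once failover completes and the queued reports are re-processed.
  private volatile boolean shouldPostponeBlocksFromFuture = false;
  private final Queue<ReportedBlock> queuedReports = new ArrayDeque<ReportedBlock>();

  void setPostponeBlocksFromFuture(boolean postpone) {
    this.shouldPostponeBlocksFromFuture = postpone;
  }

  void processReportedBlock(ReportedBlock reported, long storedGenStamp) {
    if (shouldPostponeBlocksFromFuture && reported.genStamp > storedGenStamp) {
      // The report describes a state "from the future" relative to the edits
      // this NN has read so far; queue it instead of marking it corrupt.
      queuedReports.add(reported);
      return;
    }
    if (reported.genStamp != storedGenStamp) {
      markCorrupt(reported);
    }
  }

  private void markCorrupt(ReportedBlock reported) {
    System.out.println("Marking block " + reported.blockId + " as corrupt");
  }
}
{code}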