[ 
https://issues.apache.org/jira/browse/HDFS-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252758#comment-14252758
 ] 

Hadoop QA commented on HDFS-7443:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12688142/HDFS-7443.001.patch
  against trunk revision c4d9713.

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warning.

    {color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM
                  org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
                  org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration
                  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
                  org.apache.hadoop.fs.TestSymlinkHdfsFileContext

    The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9079//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9079//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9079//console

This message is automatically generated.

> Datanode upgrade to BLOCKID_BASED_LAYOUT fails if duplicate block files are 
> present in the same volume
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7443
>                 URL: https://issues.apache.org/jira/browse/HDFS-7443
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Kihwal Lee
>            Assignee: Colin Patrick McCabe
>            Priority: Blocker
>         Attachments: HDFS-7443.001.patch
>
>
> When we did an upgrade from 2.5 to 2.6 in a medium-size cluster, about 4% of 
> datanodes did not come up.  They tried the data file layout upgrade for 
> BLOCKID_BASED_LAYOUT introduced in HDFS-6482, but failed.
> All failures were caused by {{NativeIO.link()}} throwing IOException saying 
> {{EEXIST}}.  The datanodes didn't die right away; the upgrade was retried 
> each time the block pool initialization was retried, which happened whenever 
> {{BPServiceActor}} was registering with the namenode.  After many retries, 
> the datanodes terminated.  This would leave {{previous.tmp}} and {{current}} 
> with no {{VERSION}} file in the block pool slice storage directory.  
> Although {{previous.tmp}} contained the old {{VERSION}} file, the content was 
> in the new layout and the subdirs were all newly created ones.  This 
> shouldn't have happened because the upgrade-recovery logic in {{Storage}} 
> removes {{current}} and renames {{previous.tmp}} to {{current}} before 
> retrying.  All successfully upgraded volumes had old state preserved in their 
> {{previous}} directory.
> In summary there were two observed issues.
> - Upgrade failure with {{link()}} failing with {{EEXIST}}
> - {{previous.tmp}} contained not the content of the original {{current}}, 
> but a half-upgraded one.
> We did not see this in smaller scale test clusters.
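
The first observed issue can be reproduced outside HDFS. The following is a minimal sketch, not the code in HDFS-7443.001.patch: it assumes that two old-layout directories on one volume each hold a file for the same block, so both map to a single destination path in the new layout. The directory names, the block file name blk_1001, and the helper link_blocks are hypothetical. A naive loop of hard links fails with EEXIST on the second file, while a loop that tracks names it has already linked skips the duplicate.

```python
import errno
import os
import tempfile


def link_blocks(sources, dest_dir):
    """Hard-link each source into dest_dir under its own file name.

    Hypothetical helper: a source whose name was already linked is
    recorded as a duplicate and skipped instead of being linked again.
    Returns (linked destination paths, skipped source paths).
    """
    linked, duplicates = [], []
    seen = set()
    for src in sources:
        name = os.path.basename(src)
        if name in seen:
            duplicates.append(src)
            continue
        dst = os.path.join(dest_dir, name)
        os.link(src, dst)
        seen.add(name)
        linked.append(dst)
    return linked, duplicates


def demo():
    """Return (errno of the naive attempt, files linked, duplicates skipped)."""
    with tempfile.TemporaryDirectory() as root:
        # Two old-layout directories and one new-layout directory (assumed names).
        old_dirs = [os.path.join(root, "subdir0"), os.path.join(root, "subdir1")]
        new_dir = os.path.join(root, "new")
        for d in old_dirs + [new_dir]:
            os.mkdir(d)

        # The same block file name exists in both old directories.
        sources = []
        for d in old_dirs:
            path = os.path.join(d, "blk_1001")
            with open(path, "w") as f:
                f.write("data")
            sources.append(path)

        # Naive upgrade: link every file; the second link hits an existing path.
        naive_errno = None
        try:
            for src in sources:
                os.link(src, os.path.join(new_dir, "blk_1001"))
        except FileExistsError as e:
            naive_errno = e.errno

        # Reset the destination, then retry with duplicate tracking.
        os.remove(os.path.join(new_dir, "blk_1001"))
        linked, duplicates = link_blocks(sources, new_dir)
        return naive_errno, len(linked), len(duplicates)
```

Which of the duplicate files a real upgrade should keep is a separate decision that this sketch does not address; it simply keeps the first one encountered.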



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
