[
https://issues.apache.org/jira/browse/HDFS-8763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716339#comment-14716339
]
Hadoop QA commented on HDFS-8763:
---------------------------------
\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 15m 51s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. |
| {color:green}+1{color} | javac | 7m 51s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 9s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. |
| {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 2m 40s | The patch appears to introduce 4 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 67m 21s | Tests failed in hadoop-hdfs. |
| | | 110m 17s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.TestLeaseRecovery |
| Timed out tests | org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12752681/HDFS-8763.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4cbbfa2 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/12158/artifact/patchprocess/whitespace.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12158/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12158/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12158/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12158/console |
This message was automatically generated.
> After file closed, a race condition between IBR of 3rd replica of lastBlock
> and ReplicationMonitor
> --------------------------------------------------------------------------------------------------
>
> Key: HDFS-8763
> URL: https://issues.apache.org/jira/browse/HDFS-8763
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: HDFS
> Affects Versions: 2.4.0
> Reporter: jiangyu
> Assignee: Walter Su
> Priority: Minor
> Attachments: HDFS-8763.01.patch
>
>
> -For our cluster, the NameNode is always very busy, so for every incremental
> block report the lock contention is heavy.-
> -The logic of the incremental block report is as follows: the client sends the
> block to dn1, dn1 mirrors it to dn2, and dn2 mirrors it to dn3. After the
> block is finished, every datanode reports the newly received block to the
> NameNode. On the NameNode side, all reports go through the method
> processIncrementalBlockReport in the BlockManager class. But the status of the
> block reported from dn2 and dn3 is RECEIVING_BLOCK, while for dn1 it is
> RECEIVED_BLOCK. It is fine if dn2 and dn3 report before dn1 (that is common),
> but in a busy environment it is easy for dn1 to report before dn2 or dn3;
> let's assume dn2 reports first, dn1 second, and dn3 third.-
> -So dn1's addStoredBlock will find that the replica count of this block has
> not reached the original number (which is 3), the block will be added to the
> neededReplications structure, and soon some node in the pipeline (dn1 or dn2)
> will be asked to replicate it to dn4. After some time, dn4 and dn3 both report
> this block, and then one node is chosen to invalidate.-
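> To make that ordering concrete, the following is a minimal, hypothetical Java
> sketch (not the real BlockManager code; the class, method, and datanode names
> are simplified placeholders) of how an out-of-order IBR for the last replica
> can first queue the block for re-replication and later leave an excess replica
> to be invalidated:
> {code:java}
> import java.util.LinkedHashSet;
> import java.util.Set;
>
> // Illustrative toy model of replica accounting for one block; not HDFS code.
> public class IbrRaceSketch {
>   static final int REPLICATION = 3;                        // target replica count
>   static final Set<String> stored = new LinkedHashSet<>(); // replicas the NN knows about
>   static final Set<String> neededReplications = new LinkedHashSet<>();
>   static final Set<String> excess = new LinkedHashSet<>();
>   static boolean complete = false;
>
>   // File close: the block is completed with whatever IBRs have arrived so far.
>   static void completeBlock(String blk) {
>     complete = true;
>     if (stored.size() < REPLICATION) {
>       neededReplications.add(blk);  // ReplicationMonitor will schedule an extra copy
>     }
>   }
>
>   // Incremental block report of a received replica.
>   static void receivedIbr(String dn, String blk) {
>     stored.add(dn);
>     if (complete && stored.size() > REPLICATION) {
>       excess.add(dn);               // more replicas than needed; here we simply
>     }                               // mark the latest reporter as the excess one
>   }
>
>   public static void main(String[] args) {
>     String blk = "blk_3194502674_2121080184";
>     receivedIbr("dn2", blk);
>     receivedIbr("dn1", blk);
>     completeBlock(blk);             // only 2 of 3 replicas reported -> under-replicated
>     receivedIbr("dn4", blk);        // re-replication target reports
>     receivedIbr("dn3", blk);        // 3rd original replica finally reports -> excess
>     System.out.println("needed=" + neededReplications + " excess=" + excess);
>   }
> }
> {code}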
> Here is one log I found in our cluster:
> {noformat}
> 2015-07-08 01:05:34,675 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /logs/***_bigdata_spam/logs/application_1435099124107_470749/xx.xx.4.62_45454.tmp. BP-1386326728-xx.xx.2.131-1382089338395 blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-a7c0f8f6-2399-4980-9479-efa08487b7b3:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-c75145a0-ed63-4180-87ee-d48ccaa647c5:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW]]}
> 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: xx.xx.7.75:50010 is added to blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]} size 0
> 2015-07-08 01:05:34,689 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: xx.xx.4.62:50010 is added to blk_3194502674_2121080184{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-15a4dc8e-5b7d-449f-a941-6dced45e6f07:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-74ed264f-da43-4cc3-9fa9-164ba99f752a:NORMAL|RBW], ReplicaUnderConstruction[[DISK]DS-56121ce1-8991-45b3-95bc-2a5357991512:NORMAL|RBW]]} size 0
> 2015-07-08 01:05:35,003 INFO BlockStateChange: BLOCK* ask xx.xx.4.62:50010 to replicate blk_3194502674_2121080184 to datanode(s) xx.xx.4.65:50010
> 2015-07-08 01:05:35,403 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: xx.xx.7.73:50010 is added to blk_3194502674_2121080184 size 67750
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: xx.xx.4.65:50010 is added to blk_3194502674_2121080184 size 67750
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* InvalidateBlocks: add blk_3194502674_2121080184 to xx.xx.7.75:50010
> 2015-07-08 01:05:35,833 INFO BlockStateChange: BLOCK* chooseExcessReplicates: (xx.xx.7.75:50010, blk_3194502674_2121080184) is added to invalidated blocks set
> 2015-07-08 01:05:35,852 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* InvalidateBlocks: ask xx.xx.7.75:50010 to delete [blk_3194502674_2121080184, blk_3194497594_2121075104]
> {noformat}
> On some days the number of such occurrences can reach 400,000, which is bad
> for performance and wastes network bandwidth.
> Our base version is Hadoop 2.4, and I checked Hadoop 2.7.1 and did not find
> any difference.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)