[ https://issues.apache.org/jira/browse/HDFS-13758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16568063#comment-16568063 ]

genericqa commented on HDFS-13758:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m  6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 40s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 54s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 39s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | HDFS-13758 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12934235/HDFS-13758.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b5ad6ae19b5d 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 40ab8ee |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/24691/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/24691/testReport/ |
| Max. process+thread count | 3454 (vs. ulimit of 10000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24691/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DatanodeManager should throw exception if it has BlockRecoveryCommand but the 
> block is not under construction
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13758
>                 URL: https://issues.apache.org/jira/browse/HDFS-13758
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Wei-Chiu Chuang
>            Assignee: chencan
>            Priority: Major
>         Attachments: HDFS-10240 scenarios.jpg, HDFS-13758.001.patch
>
>
> In Hadoop 3, HDFS-8909 added an assertion that assumes that if a 
> BlockRecoveryCommand exists for a block, the block is under construction.
>  
> {code:title=DatanodeManager#getBlockRecoveryCommand()}
>   BlockRecoveryCommand brCommand = new BlockRecoveryCommand(blocks.length);
>   for (BlockInfo b : blocks) {
>     BlockUnderConstructionFeature uc = b.getUnderConstructionFeature();
>     assert uc != null;
> ...
> {code}
> This assertion accidentally fixed one of the possible data corruption 
> scenarios of HDFS-10240: a recoverLease() call that is immediately followed 
> by a close(), before the DataNodes have a chance to heartbeat.
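> For illustration only, here is a minimal repro sketch of that sequence. This 
> is hypothetical code, not the actual unit test from HDFS-10240 or this issue; 
> it assumes a MiniDFSCluster-based test with asserts enabled, and the class and 
> path names are made up:
> {code:title=Hypothetical repro sketch}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.hdfs.DistributedFileSystem;
> import org.apache.hadoop.hdfs.MiniDFSCluster;
>
> public class RecoverLeaseThenCloseRepro {
>   public static void main(String[] args) throws Exception {
>     MiniDFSCluster cluster = new MiniDFSCluster.Builder(new Configuration())
>         .numDataNodes(3).build();
>     try {
>       DistributedFileSystem fs = cluster.getFileSystem();
>       Path p = new Path("/recover-then-close");
>       FSDataOutputStream out = fs.create(p);
>       out.write(new byte[1024]);
>       out.hflush();        // the last block is now under construction on the DataNodes
>       fs.recoverLease(p);  // the NameNode schedules recovery for that block
>       out.close();         // close() completes the block while recovery is still pending
>       // On the next DataNode heartbeat, getBlockRecoveryCommand() can see a
>       // block that is no longer under construction and trips the assert
>       // (or an NPE when asserts are disabled).
>     } finally {
>       cluster.shutdown();
>     }
>   }
> }
> {code}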
> In a unit test you'll get:
> {noformat}
> 2018-07-19 09:43:41,331 [IPC Server handler 9 on 57890] WARN  ipc.Server 
> (Server.java:logException(2724)) - IPC Server handler 9 on 57890, call 
> Call#41 Retry#0 
> org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.sendHeartbeat from 
> 127.0.0.1:57903
> java.lang.AssertionError
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getBlockRecoveryCommand(DatanodeManager.java:1551)
>       at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.handleHeartbeat(DatanodeManager.java:1661)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.handleHeartbeat(FSNamesystem.java:3865)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.sendHeartbeat(NameNodeRpcServer.java:1504)
>       at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.sendHeartbeat(DatanodeProtocolServerSideTranslatorPB.java:119)
>       at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:31660)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
>       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
>       at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1689)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
> {noformat}
> I propose to change this assertion, even though it addresses the data 
> corruption, because:
> # We should throw a more meaningful exception than an NPE (see the sketch 
> below).
> # On a production cluster asserts are disabled, so the check is skipped and 
> you get an NPE instead. Future HDFS developers might "fix" that NPE, causing 
> a regression. An NPE is typically not caught and handled, so it can leave 
> internal state inconsistent.
> # It doesn't address all possible scenarios of HDFS-10240. A proper fix 
> should reject close() while the block is being recovered.
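>
> For point 1, a minimal sketch of what an explicit check could look like in 
> DatanodeManager#getBlockRecoveryCommand(). This is illustrative only and 
> assumes the method can propagate an IOException; the attached patch may take 
> a different approach (for example, logging and skipping the block instead of 
> throwing):
> {code:title=Hypothetical replacement for the assert}
>   BlockRecoveryCommand brCommand = new BlockRecoveryCommand(blocks.length);
>   for (BlockInfo b : blocks) {
>     BlockUnderConstructionFeature uc = b.getUnderConstructionFeature();
>     if (uc == null) {
>       // Surface a descriptive error instead of an AssertionError / NPE.
>       throw new IOException("Cannot issue a BlockRecoveryCommand for " + b
>           + " because the block is not under construction");
>     }
>     // ... rest of the recovery command construction unchanged ...
>   }
> {code}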



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
