[ 
https://issues.apache.org/jira/browse/HDFS-12749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372717#comment-16372717
 ] 

genericqa commented on HDFS-12749:
----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.7 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
 0s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
57s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} branch-2.7 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 60 line(s) that end in whitespace. Use 
git apply --whitespace=fix <<patch_file>>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 48s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}135m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:24 |
| Failed junit tests | hadoop.hdfs.TestFileCreationEmpty |
| Timed out junit tests | org.apache.hadoop.hdfs.TestDatanodeRegistration |
|   | org.apache.hadoop.hdfs.TestDFSClientFailover |
|   | org.apache.hadoop.hdfs.TestSetrepIncreasing |
|   | org.apache.hadoop.hdfs.TestDatanodeDeath |
|   | org.apache.hadoop.hdfs.TestDFSClientRetries |
|   | org.apache.hadoop.hdfs.TestDFSFinalize |
|   | org.apache.hadoop.hdfs.TestHDFSFileSystemContract |
|   | org.apache.hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs |
|   | org.apache.hadoop.hdfs.TestDecommission |
|   | org.apache.hadoop.hdfs.TestSeekBug |
|   | org.apache.hadoop.hdfs.TestDatanodeReport |
|   | org.apache.hadoop.hdfs.web.TestWebHDFS |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSXAttr |
|   | org.apache.hadoop.hdfs.TestDFSRollback |
|   | org.apache.hadoop.hdfs.TestMiniDFSCluster |
|   | org.apache.hadoop.hdfs.TestDistributedFileSystem |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSForHA |
|   | org.apache.hadoop.hdfs.TestDFSClientExcludedNodes |
|   | org.apache.hadoop.hdfs.TestBalancerBandwidth |
|   | org.apache.hadoop.hdfs.TestSetTimes |
|   | org.apache.hadoop.hdfs.TestDFSShell |
|   | org.apache.hadoop.hdfs.TestDataTransferProtocol |
|   | org.apache.hadoop.hdfs.web.TestWebHDFSAcl |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ea57d10 |
| JIRA Issue | HDFS-12749 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12911516/HDFS-12749-branch-2.7.002.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux bb5dee397f8e 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2.7 / 1f2ab8b |
| maven | version: Apache Maven 3.0.5 |
| Default Java | 1.7.0_151 |
| findbugs | v3.0.0 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23150/artifact/out/whitespace-eol.txt
 |
| Unreaped Processes Log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23150/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-reaper.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23150/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23150/testReport/ |
| Max. process+thread count | 5397 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23150/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DN may not send block report to NN after NN restart
> ---------------------------------------------------
>
>                 Key: HDFS-12749
>                 URL: https://issues.apache.org/jira/browse/HDFS-12749
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.7.1, 2.8.3, 2.7.5, 3.0.0, 2.9.1
>            Reporter: TanYuxin
>            Priority: Major
>         Attachments: HDFS-12749-branch-2.7.002.patch, HDFS-12749.001.patch
>
>
> Now our cluster have thousands of DN, millions of files and blocks. When NN 
> restart, NN's load is very high.
> After NN restart´╝îDN will call BPServiceActor#reRegister method to register. 
> But register RPC will get a IOException since NN is busy dealing with Block 
> Report.  The exception is caught at BPServiceActor#processCommand.
> Next is the caught IOException:
> {code:java}
> WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing 
> datanode Command
> java.io.IOException: Failed on local exception: java.io.IOException: 
> java.net.SocketTimeoutException: 60000 millis timeout while waiting for 
> channel to be ready for read. ch : java.nio.channels.SocketChannel[connected 
> local=/DataNode_IP:Port remote=NameNode_Host/IP:Port]; Host Details : local 
> host is: "DataNode_Host/Datanode_IP"; destination host is: 
> "NameNode_Host":Port;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1474)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1407)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
>         at com.sun.proxy.$Proxy13.registerDatanode(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.registerDatanode(DatanodeProtocolClientSideTranslatorPB.java:126)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.register(BPServiceActor.java:793)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.reRegister(BPServiceActor.java:926)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:604)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:898)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:711)
>         at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:864)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> The un-catched IOException breaks BPServiceActor#register, and the Block 
> Report can not be sent immediately. 
> {code}
>   /**
>    * Register one bp with the corresponding NameNode
>    * <p>
>    * The bpDatanode needs to register with the namenode on startup in order
>    * 1) to report which storage it is serving now and 
>    * 2) to receive a registrationID
>    *  
>    * issued by the namenode to recognize registered datanodes.
>    * 
>    * @param nsInfo current NamespaceInfo
>    * @see FSNamesystem#registerDatanode(DatanodeRegistration)
>    * @throws IOException
>    */
>   void register(NamespaceInfo nsInfo) throws IOException {
>     // The handshake() phase loaded the block pool storage
>     // off disk - so update the bpRegistration object from that info
>     DatanodeRegistration newBpRegistration = bpos.createRegistration();
>     LOG.info(this + " beginning handshake with NN");
>     while (shouldRun()) {
>       try {
>         // Use returned registration from namenode with updated fields
>         newBpRegistration = bpNamenode.registerDatanode(newBpRegistration);
>         newBpRegistration.setNamespaceInfo(nsInfo);
>         bpRegistration = newBpRegistration;
>         break;
>       } catch(EOFException e) {  // namenode might have just restarted
>         LOG.info("Problem connecting to server: " + nnAddr + " :"
>             + e.getLocalizedMessage());
>         sleepAndLogInterrupts(1000, "connecting to server");
>       } catch(SocketTimeoutException e) {  // namenode is busy
>         LOG.info("Problem connecting to server: " + nnAddr);
>         sleepAndLogInterrupts(1000, "connecting to server");
>       }
>     }
>     
>     LOG.info("Block pool " + this + " successfully registered with NN");
>     bpos.registrationSucceeded(this, bpRegistration);
>     // random short delay - helps scatter the BR from all DNs
>     scheduler.scheduleBlockReport(dnConf.initialBlockReportDelay);
>   }
> {code}
> But NameNode has processed registerDatanode successfully, so it won't ask DN 
> to re-register again



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

Reply via email to