[ https://issues.apache.org/jira/browse/HDFS-10730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-10730:
------------------------------
    Description: 
In HDFS-10723, [~kihwal] suggested that 
{quote}
it is not a good idea to hard-code or reuse the same port number in unit tests. 
Because the jenkins slave can run multiple jobs at the same time.
{quote}
I then looked through recent Jenkins builds for tests that failed for this reason and found two: 
{{TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1}} (https://builds.apache.org/job/PreCommit-HDFS-Build/16301/testReport/) 
and 
{{TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup}} (https://builds.apache.org/job/PreCommit-HDFS-Build/16257/testReport/).

The stack traces:
{code}
java.net.BindException: Problem binding to [localhost:57241] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.apache.hadoop.ipc.Server.bind(Server.java:538)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:811)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:2611)
        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:958)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:562)
        at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:537)
        at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initIpcServer(DataNode.java:953)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1361)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
        at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
        at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2298)
        at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2278)
        at org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:482)
        at org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1(TestFileChecksum.java:182)
{code}

{code}
java.net.BindException: Problem binding to [localhost:54191] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.apache.hadoop.ipc.Server.bind(Server.java:530)
        at org.apache.hadoop.ipc.Server.bind(Server.java:519)
        at org.apache.hadoop.hdfs.net.TcpPeerServer.<init>(TcpPeerServer.java:52)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:1082)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1348)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
        at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
        at org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup(TestDecommissionWithStriped.java:255)
{code}

We can fix these tests by changing the value passed for the {{keepPort}} parameter from
{code}
cluster.restartDataNode(dnp, true);
{code}
to
{code}
cluster.restartDataNode(dnp, false);
{code}
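
For illustration only (this is not taken from the patch), here is a minimal sketch of why rebinding a fixed port can fail on a shared build host while an OS-assigned port does not. It uses only {{java.net.ServerSocket}}; the class and method names ({{PortReuseSketch}}, {{listen}}, {{isPortTaken}}) are hypothetical, and it assumes that passing {{keepPort=false}} lets the restarted DataNode come up on a different port instead of the one it held before.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration; none of these names exist in Hadoop.
public class PortReuseSketch {

    /** Opens a listener on the loopback address; port 0 asks the OS for any free port. */
    static ServerSocket listen(int port) throws IOException {
        ServerSocket socket = new ServerSocket();
        socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        return socket;
    }

    /** Returns true if opening another listener on the given port fails with BindException. */
    static boolean isPortTaken(int port) throws IOException {
        try (ServerSocket second = listen(port)) {
            return false;
        } catch (BindException e) {
            // Same failure mode as in the stack traces: "Address already in use".
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for another job on the same host that already holds a port.
        try (ServerSocket other = listen(0)) {
            int taken = other.getLocalPort();

            // Insisting on a specific port number fails while it is in use.
            System.out.println("fixed port in use: " + isPortTaken(taken));

            // Letting the OS choose the port avoids the collision.
            try (ServerSocket fresh = listen(0)) {
                System.out.println("ephemeral port differs: " + (fresh.getLocalPort() != taken));
            }
        }
    }
}
```

The trade-off is that a test can no longer assume the DataNode's address is unchanged after the restart, so anything that cached the old port would have to look it up again.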

  was:
In HDFS-10723, [~kihwal] suggested that 
{quote}
it is not a good idea to hard-core or reuse the same port number in unit tests. 
Because the jenkins slave can run multiple jobs at the same time.
{quote}
Then I collected some tests which failed by this reason in recent jenkin 
buildings.
Finally I found these two failed test 
{{TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1}}(https://builds.apache.org/job/PreCommit-HDFS-Build/16301/testReport/)
 and 
{{TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup}}(https://builds.apache.org/job/PreCommit-HDFS-Build/16257/testReport/).

The stack infos:
{code}
java.net.BindException: Problem binding to [localhost:57241] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.apache.hadoop.ipc.Server.bind(Server.java:538)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:811)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:2611)
        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:958)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:562)
        at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:537)
        at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:800)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initIpcServer(DataNode.java:953)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1361)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
        at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
        at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2298)
        at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2278)
        at org.apache.hadoop.hdfs.TestFileChecksum.getFileChecksum(TestFileChecksum.java:482)
        at org.apache.hadoop.hdfs.TestFileChecksum.testStripedFileChecksumWithMissedDataBlocks1(TestFileChecksum.java:182)
{code}

{code}
java.net.BindException: Problem binding to [localhost:54191] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:433)
        at sun.nio.ch.Net.bind(Net.java:425)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.apache.hadoop.ipc.Server.bind(Server.java:530)
        at org.apache.hadoop.ipc.Server.bind(Server.java:519)
        at org.apache.hadoop.hdfs.net.TcpPeerServer.<init>(TcpPeerServer.java:52)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initDataXceiver(DataNode.java:1082)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1348)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:488)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2658)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2546)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2593)
        at org.apache.hadoop.hdfs.MiniDFSCluster.restartDataNode(MiniDFSCluster.java:2259)
        at org.apache.hadoop.hdfs.TestDecommissionWithStriped.testDecommissionWithURBlockForSameBlockGroup(TestDecommissionWithStriped.java:255)
{code}

We can make a change to update the param value for {{keepPort}} from
{code}
cluster.restartDataNode(dnp, true);
{code}
to
{code}
cluster.restartDataNode(dnp, false);
{code}


> Fix some failed tests due to BindException
> ------------------------------------------
>
>                 Key: HDFS-10730
>                 URL: https://issues.apache.org/jira/browse/HDFS-10730
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: HDFS-10730.001.patch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
