[jira] [Updated] (HDFS-15414) java.net.SocketException: Original Exception : java.io.IOException: Broken pipe

2020-06-15 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated HDFS-15414:
-
Description: 
We observed this exception in a DataNode's log even though we were not shutting 
down any nodes in the cluster. Specifically, we have a cluster with 3 DataNodes 
(DN1, DN2, DN3) and 2 NameNodes (NN1, NN2). At some point, this exception 
occurs in DN3's log:
{noformat}
2020-06-08 21:53:03,373 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(127.0.0.1:9666, 
datanodeUuid=4408ff04-e406-4ccc-bd5c-8516ad57ec21, infoPort=9664, 
infoSecurePort=0, ipcPort=9667, 
storageInfo=lv=-57;cid=CID-c816c4ea-a559-4fd5-9b3a-b5994dc3a5fa;nsid=34747155;c=1591653120007)
 Starting thread to transfer 
BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 to 127.0.0.1:9766
2020-06-08 21:53:03,373 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(127.0.0.1:9666, 
datanodeUuid=4408ff04-e406-4ccc-bd5c-8516ad57ec21, infoPort=9664, 
infoSecurePort=0, ipcPort=9667, 
storageInfo=lv=-57;cid=CID-c816c4ea-a559-4fd5-9b3a-b5994dc3a5fa;nsid=34747155;c=1591653120007)
 Starting thread to transfer 
BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 to 127.0.0.1:9766
2020-06-08 21:53:03,381 INFO 
org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker: 
Scheduling a check for /app/dn3/current
2020-06-08 21:53:03,383 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(127.0.0.1:9666, 
datanodeUuid=4408ff04-e406-4ccc-bd5c-8516ad57ec21, infoPort=9664, 
infoSecurePort=0, ipcPort=9667, 
storageInfo=lv=-57;cid=CID-c816c4ea-a559-4fd5-9b3a-b5994dc3a5fa;nsid=34747155;c=1591653120007):Failed
 to transfer BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 to 
127.0.0.1:9766 got
java.net.SocketException: Original Exception : java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at 
sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:605)
at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
 
at 
org.apache.hadoop.hdfs.server.datanode.FileIoProvider.transferToSocketFully(FileIoProvider.java:280)
 
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:620)
 
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:804)
 
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:751)
 
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2469)
 
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Broken pipe 
... 11 more{noformat}
Port 9766 is DN2's address. 

Around the same time, we observe the following exceptions in DN2's log:
{noformat}
2020-06-08 21:53:03,379 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 src: 
/127.0.0.1:47618 dest: /127.0.0.1:9766
2020-06-08 21:53:03,379 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 received 
exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: 
Block BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 already exists 
in state FINALIZED and thus cannot be created.
2020-06-08 21:53:03,379 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
007e9b383989:9766:DataXceiver error processing WRITE_BLOCK operation  src: 
/127.0.0.1:47618 dst: /127.0.0.1:9766; 
org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 already exists in 
state FINALIZED and thus cannot be created.{noformat}
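
For context, a "broken pipe" on the sending side generally means the remote peer 
closed the connection while the sender still had data in flight. The following 
is a minimal, self-contained sketch of that socket behavior in plain Java; it is 
not HDFS code, and the class name and sizes are arbitrary choices for 
illustration.
{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class BrokenPipeDemo {
  public static void main(String[] args) throws Exception {
    ServerSocket server = new ServerSocket(0);

    // "Receiver": accept one connection, read a little, then close right away,
    // roughly what a peer does when it rejects an incoming transfer.
    Thread receiver = new Thread(() -> {
      try (Socket s = server.accept()) {
        s.getInputStream().read(new byte[1024]);
      } catch (IOException ignored) {
        // The demo only cares about the sender's view.
      }
    });
    receiver.start();

    // "Sender": keep writing after the peer has closed. The OS eventually
    // reports the failure, which Java surfaces as an IOException such as
    // "Broken pipe" or "Connection reset".
    try (Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
      OutputStream out = client.getOutputStream();
      byte[] chunk = new byte[64 * 1024];
      for (int i = 0; i < 1024; i++) {
        out.write(chunk);
      }
    } catch (IOException e) {
      System.out.println("Sender saw: " + e);
    } finally {
      receiver.join();
      server.close();
    }
  }
}
{code}
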
However, this exception doesn't look like the cause of the broken pipe: DN2 has 
an earlier occurrence of ReplicaAlreadyExistsException as well, yet DN3 has only 
one occurrence of the broken pipe. Here's the earlier occurrence of 
ReplicaAlreadyExistsException on DN2:
{noformat}
2020-06-08 21:52:54,438 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1001 src: 
/127.0.0.1:47462 dest: /127.0.0.1:9766
2020-06-08 21:52:54,438 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1001 received 
exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: 
Block BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1001 already exists 
in state FINALIZED and thus cannot be created.
2020-06-08 21:52:54,448 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
007e9b383989:9766:DataXceiver error processing WRITE_BLOCK operation  

[jira] [Created] (HDFS-15414) java.net.SocketException: Original Exception : java.io.IOException: Broken pipe

2020-06-15 Thread YCozy (Jira)
YCozy created HDFS-15414:


 Summary: java.net.SocketException: Original Exception : 
java.io.IOException: Broken pipe
 Key: HDFS-15414
 URL: https://issues.apache.org/jira/browse/HDFS-15414
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.10.0
Reporter: YCozy


We observed this exception in a DataNode's log even though we were not shutting 
down any nodes in the cluster. Specifically, we have a cluster with 3 DataNodes 
(DN1, DN2, DN3) and 2 NameNodes (NN1, NN2). At some point, this exception 
occurs in DN3's log:
{noformat}
2020-06-08 21:53:03,373 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(127.0.0.1:9666, 
datanodeUuid=4408ff04-e406-4ccc-bd5c-8516ad57ec21, infoPort=9664, 
infoSecurePort=0, ipcPort=9667, 
storageInfo=lv=-57;cid=CID-c816c4ea-a559-4fd5-9b3a-b5994dc3a5fa;nsid=34747155;c=1591653120007)
 Starting thread to transfer 
BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 to 127.0.0.1:9766
2020-06-08 21:53:03,373 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(127.0.0.1:9666, 
datanodeUuid=4408ff04-e406-4ccc-bd5c-8516ad57ec21, infoPort=9664, 
infoSecurePort=0, ipcPort=9667, 
storageInfo=lv=-57;cid=CID-c816c4ea-a559-4fd5-9b3a-b5994dc3a5fa;nsid=34747155;c=1591653120007)
 Starting thread to transfer 
BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 to 127.0.0.1:9766
2020-06-08 21:53:03,381 INFO 
org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker: 
Scheduling a check for /app/dn3/current
2020-06-08 21:53:03,383 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(127.0.0.1:9666, 
datanodeUuid=4408ff04-e406-4ccc-bd5c-8516ad57ec21, infoPort=9664, 
infoSecurePort=0, ipcPort=9667, 
storageInfo=lv=-57;cid=CID-c816c4ea-a559-4fd5-9b3a-b5994dc3a5fa;nsid=34747155;c=1591653120007):Failed
 to transfer BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 to 
127.0.0.1:9766 got
java.net.SocketException: Original Exception : java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at 
sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:605)
at 
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:223)
 
at 
org.apache.hadoop.hdfs.server.datanode.FileIoProvider.transferToSocketFully(FileIoProvider.java:280)
 
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:620)
 
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:804)
 
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:751)
 
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2469)
 
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Broken pipe 
... 11 more{noformat}
Port 9766 is DN2's address. 

Around the same time, we observe the following exceptions in DN2's log:
{noformat}
2020-06-08 21:53:03,379 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 src: 
/127.0.0.1:47618 dest: /127.0.0.1:9766
2020-06-08 21:53:03,379 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 received 
exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: 
Block BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 already exists 
in state FINALIZED and thus cannot be created.
2020-06-08 21:53:03,379 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
007e9b383989:9766:DataXceiver error processing WRITE_BLOCK operation  src: 
/127.0.0.1:47618 dst: /127.0.0.1:9766; 
org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1002 already exists in 
state FINALIZED and thus cannot be created.{noformat}
However, this exception doesn't look like the cause of the broken pipe: DN2 has 
an earlier occurrence of ReplicaAlreadyExistsException as well, yet DN3 has only 
one occurrence of the broken pipe. Here's the earlier occurrence of 
ReplicaAlreadyExistsException on DN2:
{noformat}
2020-06-08 21:52:54,438 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Receiving BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1001 src: 
/127.0.0.1:47462 dest: /127.0.0.1:9766
2020-06-08 21:52:54,438 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-553302063-172.17.0.3-1591653120007:blk_1073741825_1001 received 
exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: 
Block BP-553302063-172.17.0.3-    

[jira] [Created] (HDFS-15367) Fail to get file checksum even if there's an available replica.

2020-05-20 Thread YCozy (Jira)
YCozy created HDFS-15367:


 Summary: Fail to get file checksum even if there's an available 
replica.
 Key: HDFS-15367
 URL: https://issues.apache.org/jira/browse/HDFS-15367
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: dfsclient, namenode
Affects Versions: 2.10.0
Reporter: YCozy


DFSClient can fail to get a file checksum even when there's an available 
replica. One possible sequence that triggers the bug is as follows:
 * Start a cluster with three DNs (DN1, DN2, DN3). The default replication 
factor is set to 2.
 * Both DN1 and DN3 register with NN, as can be seen from NN's log (DN1 uses 
port 9866 while DN3 uses port 9666):

{noformat}
2020-05-21 01:24:57,196 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/127.0.0.1:9866
2020-05-21 01:25:06,155 INFO org.apache.hadoop.net.NetworkTopology: Adding a 
new node: /default-rack/127.0.0.1:9666{noformat}
 * DN1 sends block report to NN, as can be seen from NN's log:

{noformat}
2020-05-21 01:24:57,336 INFO BlockStateChange: BLOCK* processReport 
0x3ae7e5805f2e704e: from storage DS-638ee5ae-e435-4d82-ae4f-9066bc7eb850 node 
DatanodeRegistration(127.0.0.1:9866, 
datanodeUuid=b0702574-968f-4817-a660-42ec1c475606, infoPort=9864, 
infoSecurePort=0, ipcPort=9867, 
storageInfo=lv=-57;cid=CID-75860997-47d0-4957-a4e6-4edbd79d64b8;nsid=49920454;c=1590024277030),
 blocks: 0, hasStaleStorage: false, processing time: 3 msecs, 
invalidatedBlocks: 0{noformat}
 * DN3 fails to send its block report to NN because of a network partition: we 
inject a network partition to fail DN3's blockReport RPC. Accordingly, NN's log 
does not contain a "processReport" entry for DN3.
 * DFSClient uploads a file. NN chooses DN1 and DN3 to host the replicas. By 
then the network partition on DN3 has been lifted, so the file is uploaded 
successfully, as NN's log shows:

{noformat}
2020-05-21 01:25:13,644 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocate blk_1073741825_1001, replicas=127.0.0.1:9666, 127.0.0.1:9866 for 
/dir1/file1._COPYING_{noformat}
 * Stop DN1, as can be seen from DN1's log:

{noformat}
2020-05-21 01:25:21,114 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
SHUTDOWN_MSG:{noformat}
 * DFSClient tries to get the file checksum. It fails to connect to DN1 and 
gives up. The bug is triggered.

{noformat}
20/05/21 01:25:34 INFO hdfs.DFSClient: Connecting to datanode 127.0.0.1:9866
20/05/21 01:25:34 WARN hdfs.DFSClient: src=/dir1/file1, 
datanodes[0]=DatanodeInfoWithStorage[127.0.0.1:9866,DS-638ee5ae-e435-4d82-ae4f-9066bc7eb850,DISK]
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:714)
        at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
        at org.apache.hadoop.hdfs.DFSClient.connectToDN(DFSClient.java:1925)
        at org.apache.hadoop.hdfs.DFSClient.getFileChecksum(DFSClient.java:1798)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$33.doCall(DistributedFileSystem.java:1638)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$33.doCall(DistributedFileSystem.java:1635)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1646)
        at 
org.apache.hadoop.fs.shell.Display$Checksum.processPath(Display.java:199)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:327)
        at 
org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:299)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:281)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:265)
        at 
org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:119)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:175)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:317)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:380)
checksum: Fail to get block MD5 for 
BP-2092781073-172.17.0.4-1590024277030:blk_1073741825_1001{noformat}
Since DN3 also has a replica of the file, DFSClient should try to contact DN3 
to get the checksum.

To verify that DFSClient didn't connect to DN3, we changed the DEBUG log in 
DFSClient.connectToDN() to an INFO log. From the error messages above we can 
see that DFSClient only tried to connect to DN1.
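
For reference, the failing call path can be driven through the public 
FileSystem API, which goes through DFSClient.getFileChecksum() internally. The 
sketch below is illustrative only; the path and the configuration source are 
assumptions matching the scenario above, not part of the attached logs.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ChecksumProbe {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS in the loaded configuration points at the NameNode
    // of the test cluster.
    Configuration conf = new Configuration();
    try (FileSystem fs = FileSystem.get(conf)) {
      // Internally this goes through DFSClient.getFileChecksum(), which calls
      // connectToDN() for the DataNodes hosting the block replicas.
      FileChecksum checksum = fs.getFileChecksum(new Path("/dir1/file1"));
      System.out.println("checksum: " + checksum);
    }
  }
}
{code}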

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover kills the NameNode

2020-04-01 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072805#comment-17072805
 ] 

YCozy commented on HDFS-15235:
--

Hello [~weichiu], thanks for looking into this!

To trigger this bug, the most critical step is to prevent NN2 from sending the 
response back. We have a tool that can reliably trigger it by simulating 
network partitions in a real cluster (i.e., not via unit tests).

On the other hand, to fix this bug, I slightly modified the failover code to 
make it work as specified in the 
[documentation|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html#Architecture].
 Previously, when failing *+back+* from NN2 to NN1, the code blindly fenced NN2 
even if NN2 had successfully become standby. Since the fencing functionality 
exists to ensure that we have at most one active NameNode at any time, killing 
NN2 in this case is unnecessary. Moreover, killing a running NameNode reduces 
the robustness of the cluster. After this patch, the code fences NN2 +*only 
when NN2 fails to become standby*+. In this way, we still make sure that there 
is at most one active NameNode at any time.
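
A minimal sketch of the intended control flow follows. The class and method 
names are placeholders chosen to illustrate the idea ("fence the old active 
only when it cannot be verified to have become standby"); they are not the 
actual classes or methods touched by the patch.
{code:java}
public class FailoverSketch {

  interface NameNodeHandle {
    boolean tryTransitionToStandby(); // false on timeout / lost RPC response
    void transitionToActive();
    void fence();                     // e.g. via the configured fencing method
  }

  static void failover(NameNodeHandle oldActive, NameNodeHandle newActive) {
    // Ask the old active to step down first.
    boolean becameStandby = oldActive.tryTransitionToStandby();
    if (!becameStandby) {
      // Fence only when we cannot verify that the old active relinquished its
      // state; this is what guarantees at most one active NameNode.
      oldActive.fence();
    }
    // Before the change described above, NN2 was fenced unconditionally during
    // a fail-back, even when it had already become standby.
    newActive.transitionToActive();
  }
}
{code}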

> Transient network failure during NameNode failover kills the NameNode
> -
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name 

[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover kills the NameNode

2020-03-31 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071727#comment-17071727
 ] 

YCozy commented on HDFS-15235:
--

Hello [~weichiu], would you please help review the patch? Thanks!

> Transient network failure during NameNode failover kills the NameNode
> -
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at 

[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover kills the NameNode

2020-03-30 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071006#comment-17071006
 ] 

YCozy commented on HDFS-15235:
--

Hello [~ayushtkn], I've made the title more accurate. Could you please help 
review the patch? Thanks!

> Transient network failure during NameNode failover kills the NameNode
> -
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at 

[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover kills the NameNode

2020-03-28 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069462#comment-17069462
 ] 

YCozy commented on HDFS-15235:
--

Also, NN2 shouldn't be killed, because fencing should be invoked only when we 
can't transition the previous active NN to standby, according to the following 
sentence in the 
[documentation|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html#Architecture]:
{quote}During a failover, if it cannot be verified that the previous Active 
node has relinquished its Active state, the fencing process is responsible for 
cutting off the previous Active’s access to the shared edits storage.
{quote}

> Transient network failure during NameNode failover kills the NameNode
> -
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> 

[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover kills the NameNode

2020-03-28 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069461#comment-17069461
 ] 

YCozy commented on HDFS-15235:
--

Thank you [~ayushtkn]! Upon further analysis, we found that NN1 did become 
active again. However, NN2 shouldn't be killed, so that another failover can be 
easily launched. Since we didn't want NN1 to stay active in the first place, 
leaving NN1 as the only running NN in the cluster doesn't really help.

I've made the title more accurate.

> Transient network failure during NameNode failover kills the NameNode
> -
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> 

[jira] [Updated] (HDFS-15235) Transient network failure during NameNode failover kills the NameNode

2020-03-28 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated HDFS-15235:
-
Summary: Transient network failure during NameNode failover kills the 
NameNode  (was: Transient network failure during NameNode failover makes 
cluster unavailable)

> Transient network failure during NameNode failover kills the NameNode
> -
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>        

[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-28 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17069442#comment-17069442
 ] 

YCozy commented on HDFS-15235:
--

[~hemanthboyina], [~elgoiri], would you be so kind as to help review the patch? 
Thanks!

> Transient network failure during NameNode failover makes cluster unavailable
> 
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at 

[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-26 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17067649#comment-17067649
 ] 

YCozy commented on HDFS-15235:
--

[~ayushtkn], could you please take a look at the patch? Thanks!

> Transient network failure during NameNode failover makes cluster unavailable
> 
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Assignee: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at 

[jira] [Updated] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-25 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated HDFS-15235:
-
Attachment: HDFS-15235.001.patch
Status: Patch Available  (was: Open)

Attaching a patch that contains both a unit test (UT) to reproduce the bug and 
the fix for it.
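
For reference, the test follows the usual MiniDFSCluster HA pattern. The sketch 
below is only an illustration of that setup, not the attached 
HDFS-15235.001.patch, and it omits the RPC fault injection that makes the 
transitionToActive response get lost:
{code:java}
// Illustrative sketch only (not HDFS-15235.001.patch): a two-NameNode
// MiniDFSCluster that performs a failover and a failback, mirroring the
// haadmin sequence described in this issue. The real test additionally
// injects an RPC failure so that the new active's transitionToActive
// response is lost before reaching haadmin.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.MiniDFSNNTopology;

public class HAFailoverSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new HdfsConfiguration();
    MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
        .nnTopology(MiniDFSNNTopology.simpleHATopology()) // two NNs in one nameservice
        .numDataNodes(1)
        .build();
    try {
      cluster.waitActive();
      cluster.transitionToActive(0);   // NN index 0 plays the role of NN1 in the report
      // Fail over to the second NameNode (NN2 in the report)...
      cluster.transitionToStandby(0);
      cluster.transitionToActive(1);
      // ...and fail back, which is what haadmin attempts when the failover
      // RPC response never arrives.
      cluster.transitionToStandby(1);
      cluster.transitionToActive(0);
    } finally {
      cluster.shutdown();
    }
  }
}
{code}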

> Transient network failure during NameNode failover makes cluster unavailable
> 
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Priority: Major
> Attachments: HDFS-15235.001.patch
>
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at 

[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-24 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066043#comment-17066043
 ] 

YCozy commented on HDFS-15235:
--

[~ayushtkn] Thanks for looking at this! I'll try to upload a UT and a fix.

> Transient network failure during NameNode failover makes cluster unavailable
> 
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Priority: Major
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> '/app/ha-name-dir-shared/current/edits_001-019'
>  to transaction ID 1
> 2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Loaded 1 edits file(s) (the last named 
> /app/ha-name-dir-shared/current/edits_001-019)
>  of total size 1305.0, total edits 19.0, total load time 109.0 ms
> 2020-03-24 00:24:56,567 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
> datanodes as stale
> 2020-03-24 00:24:56,568 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
> messages from DataNodes that were previously queued during standby state
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
> and invalidation queues
> 2020-03-24 00:24:56,569 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
> replication queues
> 2020-03-24 00:24:56,570 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
> edit logs at txnid 20
> 2020-03-24 00:24:56,571 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 20
> 2020-03-24 00:24:56,812 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
> thread(s)
> 2020-03-24 00:24:56,819 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
> completed in 6 milliseconds name space=3 storage space=24690 storage 
> types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
> 2020-03-24 00:24:56,827 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: 
> Starting CacheReplicationMonitor with interval 3 milliseconds
> {code}
> But NN2 fails to send back the RPC response because of temporary network 
> partitioning.
> {code:java}
> java.io.EOFException: End of File Exception between local host is: 
> "24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
> java.io.EOFException; For more details see:  
> http://wiki.apache.org/hadoop/EOFException
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
>         at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>         at 

[jira] [Commented] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-24 Thread YCozy (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066042#comment-17066042
 ] 

YCozy commented on HDFS-15235:
--

A bit more info:

After NN2 fails to send back a response, haadmin first tries to fail back to 
NN1 before it fences NN2. The failback succeeds, i.e., NN2 returns to standby 
and NN1 returns to active. Here's a log snippet from NN2 (the following log is 
from a different run, so please ignore the timestamps):

 
{code:java}
2020-03-24 17:17:27,254 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started 
for active state
2020-03-24 17:17:27,255 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Ending log segment 19, 19
2020-03-24 17:17:27,255 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: LazyPersistFileScrubber 
was interrupted, exiting
2020-03-24 17:17:27,256 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: NameNodeEditLogRoller was 
interrupted, exiting
2020-03-24 17:17:27,257 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Number of transactions: 2 Total time for transactions(ms): 5 Number of 
transactions batched in Syncs: 18 Number of syncs: 3 SyncTimes(ms): 4 6 
2020-03-24 17:17:27,259 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file /app/ha-name-dir-shared/current/edits_inprogress_019 -> 
/app/ha-name-dir-shared/current/edits_019-020
2020-03-24 17:17:27,285 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits 
file /app/nn2/name/current/edits_inprogress_019 -> 
/app/nn2/name/current/edits_019-020
2020-03-24 17:17:27,286 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
FSEditLogAsync was interrupted, exiting
2020-03-24 17:17:27,290 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Shutting 
down CacheReplicationMonitor
2020-03-24 17:17:27,290 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required 
for standby state
2020-03-24 17:17:27,294 INFO 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Will roll logs on 
active node every 120 seconds.
2020-03-24 17:17:27,299 INFO 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Starting standby 
checkpoint thread...
{code}
 

Since the purpose of fencing is only to make sure that "only one NameNode be in 
the Active state at any given time" ([from the description of the fencing 
config|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html#Configuration_details]),
 and the successful failback has already achieved this, we shouldn't kill NN2.
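
To illustrate the direction I have in mind (a rough sketch only, not the actual 
change in the patch), the failover logic could re-check the target NameNode's 
HA state before fencing and skip the fence when the node already reports 
standby, using the standard org.apache.hadoop.ha APIs; the helper name below is 
hypothetical:
{code:java}
// Hypothetical helper, not the code in HDFS-15235.001.patch: decide whether
// fencing the failover target is still required by asking it for its current
// HA state. If it already reports STANDBY, the "single active" invariant
// holds and the node does not need to be killed.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ha.HAServiceProtocol;
import org.apache.hadoop.ha.HAServiceProtocol.HAServiceState;
import org.apache.hadoop.ha.HAServiceTarget;

public final class FencingCheck {
  /** Returns true only if the target may still be active and must be fenced. */
  static boolean fencingRequired(HAServiceTarget target, Configuration conf,
      int rpcTimeoutMs) {
    try {
      HAServiceProtocol proxy = target.getProxy(conf, rpcTimeoutMs);
      HAServiceState state = proxy.getServiceStatus().getState();
      return state != HAServiceState.STANDBY;
    } catch (IOException e) {
      // State is unknown (e.g. the node is still partitioned away), so err on
      // the side of fencing.
      return true;
    }
  }
}
{code}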

> Transient network failure during NameNode failover makes cluster unavailable
> 
>
> Key: HDFS-15235
> URL: https://issues.apache.org/jira/browse/HDFS-15235
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: YCozy
>Priority: Major
>
> We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
> some point, NN1 becomes unhealthy and the admin tries to manually failover to 
> NN2 by running command
> {code:java}
> $ hdfs haadmin -failover NN1 NN2
> {code}
> NN2 receives the request and becomes active:
> {code:java}
> 2020-03-24 00:24:56,412 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services 
> started for standby state
> 2020-03-24 00:24:56,413 WARN 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
> interrupted: sleep interrupted
> 2020-03-24 00:24:56,415 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services 
> required for active state
> 2020-03-24 00:24:56,417 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/ha-name-dir-shared/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
> unfinalized segments in /app/nn2/name/current
> 2020-03-24 00:24:56,419 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
> edits from old active before taking over writer role in edits logs
> 2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
> expecting start txid #1
> 2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
> Start loading edits file 
> /app/ha-name-dir-shared/current/edits_001-019 
> maxTxnsToRead = 9223372036854775807
> 2020-03-24 00:24:56,441 INFO 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
> Fast-forwarding stream 
> 

[jira] [Updated] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-24 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated HDFS-15235:
-
Description: 
We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
some point, NN1 becomes unhealthy and the admin tries to manually failover to 
NN2 by running command
{code:java}
$ hdfs haadmin -failover NN1 NN2
{code}
NN2 receives the request and becomes active:
{code:java}
2020-03-24 00:24:56,412 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started 
for standby state
2020-03-24 00:24:56,413 WARN 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
interrupted: sleep interrupted
2020-03-24 00:24:56,415 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required 
for active state
2020-03-24 00:24:56,417 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
unfinalized segments in /app/ha-name-dir-shared/current
2020-03-24 00:24:56,419 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
unfinalized segments in /app/nn2/name/current
2020-03-24 00:24:56,419 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
edits from old active before taking over writer role in edits logs
2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
expecting start txid #1
2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Start loading edits file 
/app/ha-name-dir-shared/current/edits_001-019 
maxTxnsToRead = 9223372036854775807
2020-03-24 00:24:56,441 INFO 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
Fast-forwarding stream 
'/app/ha-name-dir-shared/current/edits_001-019' 
to transaction ID 1
2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Loaded 1 edits file(s) (the last named 
/app/ha-name-dir-shared/current/edits_001-019) 
of total size 1305.0, total edits 19.0, total load time 109.0 ms
2020-03-24 00:24:56,567 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
datanodes as stale
2020-03-24 00:24:56,568 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
messages from DataNodes that were previously queued during standby state
2020-03-24 00:24:56,569 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
and invalidation queues
2020-03-24 00:24:56,569 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
replication queues
2020-03-24 00:24:56,570 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
edit logs at txnid 20
2020-03-24 00:24:56,571 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Starting log segment at 20
2020-03-24 00:24:56,812 INFO 
org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
thread(s)
2020-03-24 00:24:56,819 INFO 
org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
completed in 6 milliseconds name space=3 storage space=24690 storage 
types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
2020-03-24 00:24:56,827 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Starting 
CacheReplicationMonitor with interval 3 milliseconds
{code}
But NN2 fails to send back the RPC response because of temporary network 
partitioning.
{code:java}
java.io.EOFException: End of File Exception between local host is: 
"24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
java.io.EOFException; For more details see:  
http://wiki.apache.org/hadoop/EOFException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:791)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1597)
        at org.apache.hadoop.ipc.Client.call(Client.java:1539)
        at org.apache.hadoop.ipc.Client.call(Client.java:1436)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy8.transitionToActive(Unknown Source)
        at 

[jira] [Updated] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-23 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated HDFS-15235:
-
Description: 
We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
some point, NN1 becomes unhealthy and the admin tries to manually failover to 
NN2 by running command
{code:java}
$ hdfs haadmin -failover NN1 NN2
{code}
NN2 receives the request and becomes active:
{code:java}
2020-03-24 00:24:56,412 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started 
for standby state
2020-03-24 00:24:56,413 WARN 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
interrupted: sleep interrupted
2020-03-24 00:24:56,415 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required 
for active state
2020-03-24 00:24:56,417 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
unfinalized segments in /app/ha-name-dir-shared/current
2020-03-24 00:24:56,419 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
unfinalized segments in /app/nn2/name/current
2020-03-24 00:24:56,419 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
edits from old active before taking over writer role in edits logs
2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
expecting start txid #1
2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Start loading edits file 
/app/ha-name-dir-shared/current/edits_001-019 
maxTxnsToRead = 9223372036854775807
2020-03-24 00:24:56,441 INFO 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
Fast-forwarding stream 
'/app/ha-name-dir-shared/current/edits_001-019' 
to transaction ID 1
2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Loaded 1 edits file(s) (the last named 
/app/ha-name-dir-shared/current/edits_001-019) 
of total size 1305.0, total edits 19.0, total load time 109.0 ms
2020-03-24 00:24:56,567 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
datanodes as stale
2020-03-24 00:24:56,568 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
messages from DataNodes that were previously queued during standby state
2020-03-24 00:24:56,569 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
and invalidation queues
2020-03-24 00:24:56,569 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
replication queues
2020-03-24 00:24:56,570 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
edit logs at txnid 20
2020-03-24 00:24:56,571 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Starting log segment at 20
2020-03-24 00:24:56,812 INFO 
org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
thread(s)
2020-03-24 00:24:56,819 INFO 
org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
completed in 6 milliseconds name space=3 storage space=24690 storage 
types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
2020-03-24 00:24:56,827 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Starting 
CacheReplicationMonitor with interval 3 milliseconds
{code}
But NN2 fails to send back the RPC response because of temporary network 
partitioning.
{code:java}
java.io.EOFException: End of File Exception between local host is: 
"24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
java.io.EOFException; For more details see:  
http://wiki.apache.org/hadoop/EOFException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:791)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1597)
        at org.apache.hadoop.ipc.Client.call(Client.java:1539)
        at org.apache.hadoop.ipc.Client.call(Client.java:1436)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy8.transitionToActive(Unknown Source)
        at 

[jira] [Updated] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-23 Thread YCozy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YCozy updated HDFS-15235:
-
Description: 
We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
some point, NN1 becomes unhealthy and the admin tries to manually failover to 
NN2 by running command
{code:java}
$ hdfs haadmin -failover NN1 NN2
{code}
NN2 receives the request and becomes active:
{code:java}
2020-03-24 00:24:56,412 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started 
for standby state
2020-03-24 00:24:56,413 WARN 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
interrupted: sleep interrupted
2020-03-24 00:24:56,415 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required 
for active state
2020-03-24 00:24:56,417 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
unfinalized segments in /app/ha-name-dir-shared/current
2020-03-24 00:24:56,419 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
unfinalized segments in /app/nn2/name/current
2020-03-24 00:24:56,419 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
edits from old active before taking over writer role in edits logs
2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
expecting start txid #1
2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Start loading edits file 
/app/ha-name-dir-shared/current/edits_001-019 
maxTxnsToRead = 9223372036854775807
2020-03-24 00:24:56,441 INFO 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
Fast-forwarding stream 
'/app/ha-name-dir-shared/current/edits_001-019' 
to transaction ID 1
2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Loaded 1 edits file(s) (the last named 
/app/ha-name-dir-shared/current/edits_001-019) 
of total size 1305.0, total edits 19.0, total load time 109.0 ms
2020-03-24 00:24:56,567 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
datanodes as stale
2020-03-24 00:24:56,568 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
messages from DataNodes that were previously queued during standby state
2020-03-24 00:24:56,569 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
and invalidation queues
2020-03-24 00:24:56,569 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
replication queues
2020-03-24 00:24:56,570 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
edit logs at txnid 20
2020-03-24 00:24:56,571 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Starting log segment at 20
2020-03-24 00:24:56,812 INFO 
org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
thread(s)
2020-03-24 00:24:56,819 INFO 
org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
completed in 6 milliseconds name space=3 storage space=24690 storage 
types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
2020-03-24 00:24:56,827 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Starting 
CacheReplicationMonitor with interval 3 milliseconds
{code}
But NN2 fails to send back the RPC response because of temporary network 
partitioning.
{code:java}
java.io.EOFException: End of File Exception between local host is: 
"24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
java.io.EOFException; For more details see:  
http://wiki.apache.org/hadoop/EOFException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:791)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1597)
        at org.apache.hadoop.ipc.Client.call(Client.java:1539)
        at org.apache.hadoop.ipc.Client.call(Client.java:1436)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
        at com.sun.proxy.$Proxy8.transitionToActive(Unknown Source)
        at 

[jira] [Created] (HDFS-15235) Transient network failure during NameNode failover makes cluster unavailable

2020-03-23 Thread YCozy (Jira)
YCozy created HDFS-15235:


 Summary: Transient network failure during NameNode failover makes 
cluster unavailable
 Key: HDFS-15235
 URL: https://issues.apache.org/jira/browse/HDFS-15235
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.3.0
Reporter: YCozy


We have an HA cluster with two NameNodes: an active NN1 and a standby NN2. At 
some point, NN1 becomes unhealthy and the admin tries to manually failover to 
NN2 by running command

 
{code:java}
$ hdfs haadmin -failover NN1 NN2
{code}
 

NN2 receives the request and becomes active:

 
{code:java}
2020-03-24 00:24:56,412 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Stopping services started 
for standby state
2020-03-24 00:24:56,413 WARN 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Edit log tailer 
interrupted: sleep interrupted
2020-03-24 00:24:56,415 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required 
for active state
2020-03-24 00:24:56,417 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
unfinalized segments in /app/ha-name-dir-shared/current
2020-03-24 00:24:56,419 INFO 
org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Recovering 
unfinalized segments in /app/nn2/name/current
2020-03-24 00:24:56,419 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Catching up to latest 
edits from old active before taking over writer role in edits logs
2020-03-24 00:24:56,435 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@7c3095fa 
expecting start txid #1
2020-03-24 00:24:56,436 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Start loading edits file 
/app/ha-name-dir-shared/current/edits_001-019 
maxTxnsToRead = 9223372036854775807
2020-03-24 00:24:56,441 INFO 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream: 
Fast-forwarding stream 
'/app/ha-name-dir-shared/current/edits_001-019' 
to transaction ID 1
2020-03-24 00:24:56,567 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: 
Loaded 1 edits file(s) (the last named 
/app/ha-name-dir-shared/current/edits_001-019) 
of total size 1305.0, total edits 19.0, total load time 109.0 ms
2020-03-24 00:24:56,567 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Marking all 
datanodes as stale
2020-03-24 00:24:56,568 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Processing 4 
messages from DataNodes that were previously queued during standby state
2020-03-24 00:24:56,569 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Reprocessing replication 
and invalidation queues
2020-03-24 00:24:56,569 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: initializing 
replication queues
2020-03-24 00:24:56,570 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing 
edit logs at txnid 20
2020-03-24 00:24:56,571 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: 
Starting log segment at 20
2020-03-24 00:24:56,812 INFO 
org.apache.hadoop.hdfs.server.namenode.FSDirectory: Initializing quota with 4 
thread(s)
2020-03-24 00:24:56,819 INFO 
org.apache.hadoop.hdfs.server.namenode.FSDirectory: Quota initialization 
completed in 6 milliseconds name space=3 storage space=24690 storage 
types=RAM_DISK=0, SSD=0, DISK=0, ARCHIVE=0, PROVIDED=0
2020-03-24 00:24:56,827 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.CacheReplicationMonitor: Starting 
CacheReplicationMonitor with interval 3 milliseconds
{code}
 

But NN2 fails to send back the RPC response because of temporary network 
partitioning.

 
{code:java}
java.io.EOFException: End of File Exception between local host is: 
"24e7b5a52e85/172.17.0.2"; destination host is: "127.0.0.3":8180; : 
java.io.EOFException; For more details see:  
http://wiki.apache.org/hadoop/EOFException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:791)
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1597)
        at org.apache.hadoop.ipc.Client.call(Client.java:1539)
        at org.apache.hadoop.ipc.Client.call(Client.java:1436)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)