[ https://issues.apache.org/jira/browse/HDFS-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247501#comment-17247501 ]
Stephen O'Donnell commented on HDFS-15725:
------------------------------------------

Yes, it is probably the shutdown hook that is ultimately causing this, along with perhaps Kafka Connect not shutting down cleanly (I have not tried to confirm that).

NN logs:
{code}
2020-12-09 08:00:01,773 DEBUG org.apache.hadoop.hdfs.StateChange: persistBlocks: /redacted/file with 1 blocks is peristed to the file system
2020-12-09 08:00:01,831 DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.complete: /redacted/file fileId=3288929 for DFSClient_NONMAPREDUCE_374148706_65
2020-12-09 08:00:01,831 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: /redacted/file for DFSClient_NONMAPREDUCE_374148706_65
2020-12-09 08:00:01,831 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* checkFileProgress: blk_1075927944_2211851{blockUCState=COMMITTED, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[[DISK]DS-ab35a9f9-9439-4b05-9ea1-9d68e0c5f11f:NORMAL:10.32.136.71:1044|RBW], ReplicaUnderConstruction[[DISK]DS-d07f48a7-258c-484a-af96-010557f70fa8:NORMAL:10.32.136.62:1044|RBW]]} has not reached minimal replication 1
<nothing further until next client tries to write it>
{code}

DN logs (all 3 DNs are basically the same):
{code}
{"type":"log","host":"host_name","category":"HDFS-hdfs-DATANODE-BASE","level":"DEBUG","system":"HEKODLK","time": "2020-12-09 08:00:01,822","logger":"datanode.DataNode","timezone":"EST","log":{"message":"PacketResponder: BP-1714193774-10.32.136.20-1601039015398:blk_1075927944_2211851, type=LAST_IN_PIPELINE, downstreams=0:[]: seqno=-2 waiting for local datanode to finish write."}}
{"type":"log","host":"host_name","category":"HDFS-hdfs-DATANODE-BASE","level":"INFO","system":"HEKODLK","time": "2020-12-09 08:00:01,822","logger":"datanode.DataNode","timezone":"EST","log":{"message":"Exception for BP-1714193774-10.32.136.20-1601039015398:blk_1075927944_2211851"}}
{"type":"log","host":"host_name","category":"HDFS-hdfs-DATANODE-BASE","level":"INFO","system":"HEKODLK","time": "2020-12-09 08:00:01,823","logger":"datanode.DataNode","timezone":"EST","log":{"message":"PacketResponder: BP-1714193774-10.32.136.20-1601039015398:blk_1075927944_2211851, type=LAST_IN_PIPELINE, downstreams=0:[]: Thread is interrupted."}}
{"type":"log","host":"host_name","category":"HDFS-hdfs-DATANODE-BASE","level":"INFO","system":"HEKODLK","time": "2020-12-09 08:00:01,823","logger":"datanode.DataNode","timezone":"EST","log":{"message":"PacketResponder: BP-1714193774-10.32.136.20-1601039015398:blk_1075927944_2211851, type=LAST_IN_PIPELINE, downstreams=0:[] terminating"}}
{"type":"log","host":"host_name","category":"HDFS-hdfs-DATANODE-BASE","level":"INFO","system":"HEKODLK","time": "2020-12-09 08:00:01,823","logger":"datanode.DataNode","timezone":"EST","log":{"message":"opWriteBlock BP-1714193774-10.32.136.20-1601039015398:blk_1075927944_2211851 received exception java.io.IOException: Connection reset by peer"}}
{code}

Notice the DN gets the Connection Reset shortly after the NN messages above, so this is probably the shutdown hook on the client. The client was killed over and over until this happened.
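For anyone trying to reproduce this, a minimal sketch of the abrupt-kill scenario is below. It is a sketch only, not a confirmed reproducer: it assumes a running cluster and the standard Hadoop client API on the classpath, the path is a hypothetical test location, and hitting the exact window (the NN has processed complete() but the DN connections were reset before the replicas left RBW) takes many repeated runs, as noted above. A file stuck this way should show up in {{hdfs fsck <path> -openforwrite}}, and {{hdfs debug recoverLease -path <path>}} will kick off the (never-completing) lease recovery described in this issue.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged reproduction sketch: write a file, then halt the JVM while
// close() is in flight. Repeated runs may land in the window where the
// NN has already marked the block COMMITTED but the DNs were reset
// before finalizing their RBW replicas.
public class AbruptClientKill {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Hypothetical test path; any writable location works.
    Path path = new Path("/tmp/hdfs-15725-repro");
    FSDataOutputStream out = fs.create(path, true);
    out.write(new byte[64 * 1024]);
    out.hflush(); // data now sits in RBW replicas on the pipeline DNs

    // Run close() on another thread and halt() a moment later. halt()
    // skips shutdown hooks and all cleanup, mimicking a hard kill of
    // the client mid-completion.
    Thread closer = new Thread(() -> {
      try {
        out.close();
      } catch (Exception ignored) {
        // the JVM is dying anyway
      }
    });
    closer.start();
    Thread.sleep(1); // tune or randomize; the window is very narrow
    Runtime.getRuntime().halt(1);
  }
}
{code}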
> Lease Recovery never completes for a committed block which the DNs never
> finalize
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-15725
>                 URL: https://issues.apache.org/jira/browse/HDFS-15725
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.4.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>          Priority: Major
>       Attachments: HDFS-15725.001.patch, lease_recovery_2_10.patch
>
> In a very rare condition, the HDFS client process can get killed right at the
> time it is completing a block / file.
> The client sends the "complete" call to the namenode, moving the block into a
> committed state, but it dies before it can send the final packet to the
> Datanodes telling them to finalize the block.
> This means the blocks are stuck on the datanodes in RBW state and nothing
> will ever tell them to move out of that state.
> The namenode / lease manager will retry forever to close the file, but it
> will always complain it is waiting for blocks to reach minimal replication.
> I have a simple test and patch to fix this, but I think it warrants some
> discussion on whether this is the correct thing to do, or if I need to put
> the fix behind a config switch.
> My idea is that if lease recovery occurs, and the block is still waiting on
> "minimal replication", just put the file back to UNDER_CONSTRUCTION so that
> on the next lease recovery attempt, BLOCK RECOVERY will happen, close the
> file, and move the replicas to FINALIZED.
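To make the state transition in the proposal above concrete, here is a self-contained toy model of the idea. This is a sketch only, not the attached patch or actual NameNode code: UCState, ToyBlock and tryRecoverLease are simplified stand-ins for BlockUCState, BlockInfo and the internalReleaseLease() logic.
{code:java}
// Toy model of the proposed lease-recovery change. Today (per this
// issue) a COMMITTED block below minimal replication loops forever,
// because nothing ever tells the DNs to finalize. The proposal demotes
// it to UNDER_CONSTRUCTION so the next attempt runs block recovery.
enum UCState { UNDER_CONSTRUCTION, UNDER_RECOVERY, COMMITTED, COMPLETE }

final class ToyBlock {
  UCState state = UCState.UNDER_CONSTRUCTION;
  int finalizedReplicas = 0; // replicas the DNs have reported FINALIZED
  int minReplication = 1;    // dfs.namenode.replication.min
}

public class LeaseRecoveryModel {
  static boolean tryRecoverLease(ToyBlock b) {
    switch (b.state) {
      case COMMITTED:
        if (b.finalizedReplicas >= b.minReplication) {
          b.state = UCState.COMPLETE; // normal path: close the file
          return true;
        }
        b.state = UCState.UNDER_CONSTRUCTION; // the proposed change
        return false;                         // retry on next attempt
      case UNDER_CONSTRUCTION:
      case UNDER_RECOVERY:
        // Block recovery syncs the replicas to a consistent length and
        // finalizes them on the DNs, so the block can then complete.
        b.state = UCState.UNDER_RECOVERY;
        b.finalizedReplicas = b.minReplication;
        b.state = UCState.COMMITTED;
        return tryRecoverLease(b);
      default:
        return true; // already COMPLETE
    }
  }

  public static void main(String[] args) {
    ToyBlock stuck = new ToyBlock();
    stuck.state = UCState.COMMITTED; // client died: committed on NN, RBW on DNs
    boolean closed = tryRecoverLease(stuck); // demotes; does not close yet
    closed = tryRecoverLease(stuck);         // recovery runs, file closes
    System.out.println("closed=" + closed + " state=" + stuck.state);
  }
}
{code}
The design question in the description maps onto the COMMITTED branch: demoting to UNDER_CONSTRUCTION seems safe only because block recovery will sync the replicas to a consistent length before finalizing them, which is presumably why a config switch is being considered.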