[ https://issues.apache.org/jira/browse/HDFS-14800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919501#comment-16919501 ]
hemanthboyina commented on HDFS-14800:
--------------------------------------

Hi [~LiJinglun], when the file creation is completed, the BlockUnderConstructionFeature becomes null.

> Data race between block report and recoverLease()
> -------------------------------------------------
>
>                 Key: HDFS-14800
>                 URL: https://issues.apache.org/jira/browse/HDFS-14800
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>         Attachments: PreCommit-HDFS-Build #27717 test - testUpgradeFromRel1BBWImage [Jenkins].pdf
>
> I thought I fixed it in HDFS-10240, but I am seeing a similar race condition in a precommit test again.
> https://builds.apache.org/job/PreCommit-HDFS-Build/27717/testReport/org.apache.hadoop.hdfs/TestDFSUpgradeFromImage/testUpgradeFromRel1BBWImage/
> {noformat}
> 2019-08-29 13:34:31,788 [IPC Server handler 9 on default port 39637] INFO namenode.FSNamesystem (FSNamesystem.java:recoverLeaseInternal(2682)) - recoverLease: [Lease. Holder: DFSClient_8256078, pending creates: 13], src=/1kb-multiple-checksum-blocks-64-16 from client DFSClient_8256078
> 2019-08-29 13:34:31,788 [IPC Server handler 9 on default port 39637] INFO namenode.FSNamesystem (FSNamesystem.java:internalReleaseLease(3424)) - Recovering [Lease. Holder: DFSClient_8256078, pending creates: 13], src=/1kb-multiple-checksum-blocks-64-16
> 2019-08-29 13:34:31,789 [IPC Server handler 9 on default port 39637] WARN BlockStateChange (BlockUnderConstructionFeature.java:initializeBlockRecovery(238)) - BLOCK* BlockUnderConstructionFeature.initializeBlockRecovery: No blocks found, lease removed.
> 2019-08-29 13:34:31,790 [IPC Server handler 9 on default port 39637] WARN hdfs.StateChange (FSNamesystem.java:internalReleaseLease(3550)) - DIR* NameSystem.internalReleaseLease: File /1kb-multiple-checksum-blocks-64-16 has not been closed. Lease recovery is in progress. RecoveryId = 1031 for block blk_7162739548153522810_1020
> 2019-08-29 13:34:31,791 [Block report processor] INFO BlockStateChange (BlockManager.java:processReport(2645)) - BLOCK* processReport 0x8f0bcadff51597e8: Processing first storage report for DS-dd914776-8c9f-4f0d-9bd3-91a552bdc351 from datanode 187b2e09-75e8-4bd0-ab46-90e32839a21d
> 2019-08-29 13:34:31,793 [Block report processor] INFO BlockStateChange (BlockManager.java:processReport(2674)) - BLOCK* processReport 0x8f0bcadff51597e8: from storage DS-dd914776-8c9f-4f0d-9bd3-91a552bdc351 node DatanodeRegistration(127.0.0.1:32987, datanodeUuid=187b2e09-75e8-4bd0-ab46-90e32839a21d, infoPort=42569, infoSecurePort=0, ipcPort=39147, storageInfo=lv=-57;cid=testClusterId;nsid=889473900;c=1567085670457), blocks: 13, hasStaleStorage: false, processing time: 2 msecs, invalidatedBlocks: 0
> 2019-08-29 13:34:31,808 [BP-268013932-172.17.0.2-1567085670423 heartbeating to localhost/127.0.0.1:39637] INFO datanode.DataNode (BPServiceActor.java:blockReport(432)) - Successfully sent block report 0x8f0bcadff51597e8, containing 1 storage report(s), of which we sent 1. The reports had 13 total blocks and used 1 RPC(s). This took 4 msecs to generate and 33 msecs for RPC and NN processing. Got back no commands.
> 2019-08-29 13:34:32,795 [IPC Server handler 8 on default port 39637] INFO namenode.FSNamesystem (FSNamesystem.java:recoverLeaseInternal(2682)) - recoverLease: [Lease. Holder: DFSClient_NONMAPREDUCE_1352689944_1, pending creates: 1], src=/1kb-multiple-checksum-blocks-64-16 from client DFSClient_NONMAPREDUCE_1352689944_1
> 2019-08-29 13:34:32,795 [IPC Server handler 8 on default port 39637] INFO namenode.FSNamesystem (FSNamesystem.java:internalReleaseLease(3424)) - Recovering [Lease. Holder: DFSClient_NONMAPREDUCE_1352689944_1, pending creates: 1], src=/1kb-multiple-checksum-blocks-64-16
> 2019-08-29 13:34:32,796 [IPC Server handler 8 on default port 39637] INFO blockmanagement.BlockManager (PendingRecoveryBlocks.java:add(80)) - Block recovery attempt for blk_7162739548153522810_1020 rejected, as the previous attempt times out in 88 seconds.
> {noformat}
> Looks like if a client calls recoverLease() on a file before the NameNode receives a block report from the DataNodes, it fails to recover the lease, and the file remains unclosed.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
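The analysis above implies that a single recoverLease() call may return before recovery has finished, so clients typically poll until the NameNode reports the file closed. Below is a minimal, self-contained sketch of that retry loop; the BooleanSupplier stands in for the real recoverLease() RPC, and the helper name, attempt count, and simulated NameNode are illustrative assumptions, not HDFS code.

```java
import java.util.function.BooleanSupplier;

/**
 * Sketch of the usual client-side workaround: recoverLease() returns false
 * while lease recovery is still in progress, so callers poll until it
 * returns true (file closed) or they give up.
 */
public class LeaseRecoveryRetry {

    /** Polls recoverLease until it reports the file closed, up to maxAttempts. */
    static boolean waitForLeaseRecovery(BooleanSupplier recoverLease,
                                        int maxAttempts, long sleepMs) {
        for (int i = 0; i < maxAttempts; i++) {
            if (recoverLease.getAsBoolean()) {
                return true;                    // lease recovered, file is closed
            }
            try {
                Thread.sleep(sleepMs);          // wait before retrying
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;                           // gave up; file still open
    }

    public static void main(String[] args) {
        // Simulated NameNode that succeeds on the third attempt, e.g. once
        // the first block report has been processed.
        int[] calls = {0};
        boolean closed = waitForLeaseRecovery(() -> ++calls[0] >= 3, 10, 1);
        System.out.println("closed=" + closed + " attempts=" + calls[0]);
    }
}
```

Against a real cluster the supplier would presumably wrap the actual RPC, e.g. `() -> fs.recoverLease(path)` on a DistributedFileSystem, with a sleep on the order of seconds rather than milliseconds.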