[ 
https://issues.apache.org/jira/browse/HDFS-15725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247455#comment-17247455
 ] 

Kihwal Lee edited comment on HDFS-15725 at 12/10/20, 7:43 PM:
--------------------------------------------------------------

We have seen this more often with later versions of the client.  The namenode, 
regardless of its version, cannot recover the lease in this case.  The condition 
is triggered by a client that commits the block without finalizing it. We did 
not have that problem with the 2.8 client.  While we need to harden the 
namenode side, we can also fix the client side.

bq. The client sends the "complete" call to the namenode, moving the block into 
a committed state, but it dies before it can send the final packet to the 
Datanodes telling them to finalize the block.

The client should never call {{completeFile()}} if it has not received the ack 
for the last packet. Older clients do not behave this way.
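
As a rough illustration of that ordering rule, here is a minimal sketch. It is a toy model, not the actual HDFS client code: the class {{LastPacketGate}} and its methods are hypothetical names made up for this example, and the real "complete" RPC is represented only by a flag.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Toy model only: none of these names exist in the HDFS client.
class LastPacketGate {
    // Released once the ack for the last packet has been seen.
    private final CountDownLatch lastPacketAcked = new CountDownLatch(1);
    private volatile boolean completed = false;

    // Called by the (hypothetical) ack-processing thread when the
    // ack for the last packet of the block arrives.
    void onLastPacketAck() {
        lastPacketAcked.countDown();
    }

    // Stand-in for the close path: refuse to complete the file
    // unless the last packet has been acked within the timeout.
    boolean tryComplete(long timeoutMillis) {
        try {
            if (!lastPacketAcked.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // no ack: do not commit the block on the namenode
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // interrupted while waiting: also do not commit
        }
        completed = true; // a real client would issue the "complete" call here
        return true;
    }

    boolean isCompleted() {
        return completed;
    }
}
```

With this ordering, a client that dies before the ack arrives leaves the block uncommitted, so ordinary block recovery can still run during lease recovery.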

{code}
       NameNode.stateChangeLog.warn(message);
+      // If the block is still not minimally replicated when lease recovery
+      // happens, it means the hard limit has passed, and it is unlikely to get
+      // minimally replicated, or another client is trying to recover the lease
+      // on the file. In both cases, it makes sense to move the file back to
+      // UNDER_CONSTRUCTION so BLOCK RECOVERY can happen.
+      lastBlock.convertToBlockUnderConstruction(BlockUCState.UNDER_CONSTRUCTION,
+          lastBlock.getUnderConstructionFeature().getExpectedStorageLocations());
{code}
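
To make the intended effect of that change concrete, here is a toy state machine. It is only a sketch under stated assumptions: {{ToyLastBlock}}, its enums and the {{revertCommitted}} flag are hypothetical and merely mirror the state names used in this issue, and block recovery is collapsed into a single synchronous step, which it is not in the real namenode.

```java
// Toy model only: this is not the real BlockInfo / LeaseManager code.
class ToyLastBlock {
    enum State { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }
    enum ReplicaState { RBW, FINALIZED }

    State state = State.COMMITTED;            // client called "complete", then died
    ReplicaState replica = ReplicaState.RBW;  // datanodes never saw the final packet

    // One lease recovery attempt. Returns true if the file could be closed.
    // revertCommitted models the proposed change (hypothetical switch).
    boolean recoverLease(boolean revertCommitted) {
        switch (state) {
            case COMMITTED:
                if (replica == ReplicaState.FINALIZED) {
                    state = State.COMPLETE;
                    return true;
                }
                if (revertCommitted) {
                    // Proposed behavior: go back so block recovery can run next time.
                    state = State.UNDER_CONSTRUCTION;
                }
                return false; // still waiting for minimal replication
            case UNDER_CONSTRUCTION:
                // Simplification: block recovery finalizes the replicas at once.
                replica = ReplicaState.FINALIZED;
                state = State.COMPLETE;
                return true;
            default:
                return true; // already COMPLETE
        }
    }
}
```

In this model, repeated calls with {{revertCommitted}} set to false never close the file, while with it set to true the first attempt reverts the block and the second attempt closes the file.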

I am not sure whether uncommitting the block is the best approach.  The NN is 
capable of doing block recovery without it. [~daryn] wrote a patch internally 
for 2.10, and we were about to push it out to the community.  I am attaching 
it as [^lease_recovery_2_10.patch]. Please take a look at it and let us know 
what you think.


> Lease Recovery never completes for a committed block which the DNs never 
> finalize
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-15725
>                 URL: https://issues.apache.org/jira/browse/HDFS-15725
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.4.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-15725.001.patch, lease_recovery_2_10.patch
>
>
> In a very rare condition, the HDFS client process can get killed right at the 
> time it is completing a block / file.
> The client sends the "complete" call to the namenode, moving the block into a 
> committed state, but it dies before it can send the final packet to the 
> Datanodes telling them to finalize the block.
> This means the blocks are stuck on the datanodes in RBW state and nothing 
> will ever tell them to move out of that state.
> The namenode / lease manager will retry forever to close the file, but it 
> will always complain it is waiting for blocks to reach minimal replication.
> I have a simple test and patch to fix this, but I think it warrants some 
> discussion on whether this is the correct thing to do, or if I need to put 
> the fix behind a config switch.
> My idea is that if lease recovery occurs, and the block is still waiting on 
> "minimal replication", just put the file back to UNDER_CONSTRUCTION so that 
> on the next lease recovery attempt, BLOCK RECOVERY will happen, close the 
> file and move the replicas to FINALIZED.


