Junping Du updated HDFS-12747:
    Target Version/s: 2.8.4  (was: 2.8.3)

> Lease monitor may infinitely loop on the same lease
> ---------------------------------------------------
>                 Key: HDFS-12747
>                 URL: https://issues.apache.org/jira/browse/HDFS-12747
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.8.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
> Lease recovery incorrectly handles UC (under-construction) files if the last 
> block is complete but the penultimate block is committed.  "Incorrectly 
> handles" is a euphemism for "infinitely loops for days and leaves all 
> abandoned streams open until customers complain".
> The problem may manifest when:
> # Block1 is committed but seemingly never completed
> # Block2 is allocated
> # Lease recovery is initiated for block2
> # Commit block synchronization invokes {{FSNamesystem#closeFileCommitBlocks}}, 
> causing:
> #* {{commitOrCompleteLastBlock}} to mark block2 as complete
> #* {{finalizeINodeFileUnderConstruction}}/{{INodeFile.assertAllBlocksComplete}} 
> to throw {{IllegalStateException}} because the penultimate block1 is 
> committed but not complete
> # The next lease recovery results in an infinite loop.
> The {{LeaseManager}} expects that {{FSNamesystem#internalReleaseLease}} will 
> either init recovery and renew the lease, or remove the lease.  In the 
> described state it does neither.  The switch case will break out if the last 
> block is complete.  (The case statement ironically contains an assert).  
> Since nothing changed, the lease is still the “next” lease to be processed.  
> The lease monitor loops for 25ms on the same lease, sleeps for 2s, loops on 
> it again.
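
The expectation described above can be sketched as a small toy model. This is not the Hadoop source: the class LeaseMonitorSketch, the Outcome enum, the monitorPass method, and the attempt budget (a deterministic stand-in for the 25ms time budget) are hypothetical names introduced only for illustration. The sketch assumes the release call has three possible results: it starts recovery and renews the lease, it removes the lease, or, in the state described above, it does neither.

```java
import java.util.Deque;
import java.util.List;
import java.util.Map;

/** Toy model of the lease-monitor loop; hypothetical, not Hadoop code. */
class LeaseMonitorSketch {
    enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    /** What the monitor expects back, plus the unexpected third case. */
    enum Outcome { RECOVERY_STARTED, LEASE_REMOVED, NOTHING_CHANGED }

    /** Simplified stand-in for the release logic discussed in the description. */
    static Outcome internalReleaseLease(List<BlockState> blocks) {
        boolean allComplete = blocks.stream().allMatch(b -> b == BlockState.COMPLETE);
        if (allComplete) {
            return Outcome.LEASE_REMOVED;      // file can be closed
        }
        BlockState last = blocks.get(blocks.size() - 1);
        if (last == BlockState.COMPLETE) {
            // Last block complete, an earlier block is not: the switch just
            // breaks out, so the lease is neither renewed nor removed.
            return Outcome.NOTHING_CHANGED;
        }
        return Outcome.RECOVERY_STARTED;       // recovery begins, lease renewed
    }

    /**
     * One monitor pass over expired leases, oldest first.
     * Returns the number of release attempts made within the budget.
     */
    static int monitorPass(Deque<String> expired,
                           Map<String, List<BlockState>> files,
                           int budget) {
        int attempts = 0;
        while (!expired.isEmpty() && attempts < budget) {
            String holder = expired.peekFirst();   // always the "next" lease
            attempts++;
            switch (internalReleaseLease(files.get(holder))) {
                case LEASE_REMOVED:
                case RECOVERY_STARTED:
                    expired.pollFirst();           // lease is no longer "next"
                    break;
                case NOTHING_CHANGED:
                    break;                         // same lease stays at the head
            }
        }
        return attempts;
    }
}
```

In this model, a file whose blocks are [COMMITTED, COMPLETE] at the head of the queue consumes the whole budget on every pass and also blocks every lease queued behind it, which matches the "loops, sleeps, loops again" behavior reported above.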

