[jira] [Updated] (HDFS-12747) Lease monitor may infinitely loop on the same lease

2018-09-08 Thread Junping Du (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-12747:
--
Target Version/s: 2.8.6  (was: 2.8.5)

> Lease monitor may infinitely loop on the same lease
> ---
>
> Key: HDFS-12747
> URL: https://issues.apache.org/jira/browse/HDFS-12747
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.8.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Critical
>
> Lease recovery incorrectly handles under-construction (UC) files if the last 
> block is complete but the penultimate block is only committed.  "Incorrectly 
> handles" is a euphemism for "infinitely loops for days and leaves all 
> abandoned streams open until customers complain".
> The problem may manifest when:
> # Block1 is committed but seemingly never completed
> # Block2 is allocated
> # Lease recovery is initiated for block2
> # Commit block synchronization invokes {{FSNamesystem#closeFileCommitBlocks}}, 
> causing:
> #* {{commitOrCompleteLastBlock}} to mark block2 as complete
> #* {{finalizeINodeFileUnderConstruction}}/{{INodeFile.assertAllBlocksComplete}} 
> to throw {{IllegalStateException}} because the penultimate block1 is 
> "COMMITTED but not COMPLETE"
> # The next lease recovery results in an infinite loop.
> The {{LeaseManager}} expects that {{FSNamesystem#internalReleaseLease}} will 
> either initiate recovery and renew the lease, or remove the lease.  In the 
> described state it does neither: the switch case simply breaks out if the 
> last block is complete.  (Ironically, the case statement contains an assert.)  
> Since nothing changed, the lease is still the “next” lease to be processed.  
> The lease monitor loops for 25ms on the same lease, sleeps for 2s, then loops 
> on it again.
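
The failure mode described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the Hadoop source: the names LeaseLoopSketch, UcFile, Outcome, releaseLease and monitorPass are hypothetical, and the maxIterations parameter merely stands in for the monitor's 25ms time budget. It shows that when the release step neither renews nor removes the lease, the monitor keeps revisiting the same lease until its budget runs out.

```java
import java.util.ArrayList;
import java.util.List;

public class LeaseLoopSketch {
    // Simplified block states; the issue only distinguishes COMMITTED from COMPLETE.
    enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    // Hypothetical stand-in for a file under construction: an ordered list of block states.
    static final class UcFile {
        final List<BlockState> blocks = new ArrayList<>();

        UcFile(BlockState... states) {
            for (BlockState s : states) {
                blocks.add(s);
            }
        }

        BlockState last() {
            return blocks.get(blocks.size() - 1);
        }

        boolean allComplete() {
            return blocks.stream().allMatch(s -> s == BlockState.COMPLETE);
        }
    }

    // What the monitor relies on: every release attempt should renew or remove the lease.
    enum Outcome { RENEWED, REMOVED, UNCHANGED }

    // Toy model (an assumption, not the real method) of the release step.
    static Outcome releaseLease(UcFile file) {
        if (file.allComplete()) {
            return Outcome.REMOVED; // every block is complete, so the file can be closed
        }
        switch (file.last()) {
            case COMPLETE:
                // Last block is complete but an earlier block is not:
                // the case breaks out without renewing or removing the lease.
                return Outcome.UNCHANGED;
            default:
                // Recovery is started for the last block and the lease is renewed.
                return Outcome.RENEWED;
        }
    }

    // Returns how many times one monitor pass visits the same lease.
    // maxIterations is a hypothetical substitute for the 25ms budget.
    static int monitorPass(UcFile file, int maxIterations) {
        int visits = 0;
        while (visits < maxIterations) {
            visits++;
            if (releaseLease(file) != Outcome.UNCHANGED) {
                break; // progress was made, move on to the next lease
            }
        }
        return visits;
    }

    public static void main(String[] args) {
        UcFile stuck = new UcFile(BlockState.COMMITTED, BlockState.COMPLETE);
        UcFile healthy = new UcFile(BlockState.COMPLETE, BlockState.UNDER_CONSTRUCTION);
        System.out.println("stuck visits:   " + monitorPass(stuck, 1000));
        System.out.println("healthy visits: " + monitorPass(healthy, 1000));
    }
}
```

In this model the "stuck" file (block1 COMMITTED, block2 COMPLETE) consumes the whole iteration budget on every pass, while the "healthy" file is handled in a single visit. Any real fix would need the release step to always change the lease's state; the sketch takes no position on how that should be done.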



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12747) Lease monitor may infinitely loop on the same lease

2018-04-13 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-12747:
--
Target Version/s: 2.8.5  (was: 2.8.4)




[jira] [Updated] (HDFS-12747) Lease monitor may infinitely loop on the same lease

2018-04-06 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-12747:
--
Target Version/s: 2.8.4  (was: 2.8.3)




[jira] [Updated] (HDFS-12747) Lease monitor may infinitely loop on the same lease

2017-10-31 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-12747:
---
Description: 
Lease recovery incorrectly handles under-construction (UC) files if the last 
block is complete but the penultimate block is only committed.  "Incorrectly 
handles" is a euphemism for "infinitely loops for days and leaves all abandoned 
streams open until customers complain".

The problem may manifest when:
# Block1 is committed but seemingly never completed
# Block2 is allocated
# Lease recovery is initiated for block2
# Commit block synchronization invokes {{FSNamesystem#closeFileCommitBlocks}}, 
causing:
#* {{commitOrCompleteLastBlock}} to mark block2 as complete
#* {{finalizeINodeFileUnderConstruction}}/{{INodeFile.assertAllBlocksComplete}} 
to throw {{IllegalStateException}} because the penultimate block1 is "COMMITTED 
but not COMPLETE"
# The next lease recovery results in an infinite loop.

The {{LeaseManager}} expects that {{FSNamesystem#internalReleaseLease}} will 
either initiate recovery and renew the lease, or remove the lease.  In the 
described state it does neither: the switch case simply breaks out if the last 
block is complete.  (Ironically, the case statement contains an assert.)  Since 
nothing changed, the lease is still the “next” lease to be processed.  The 
lease monitor loops for 25ms on the same lease, sleeps for 2s, then loops on it 
again.

  was:
Lease recovery incorrectly handles under-construction (UC) files if the last 
block is complete but the penultimate block is only committed.  "Incorrectly 
handles" is a euphemism for "infinitely loops for days and leaves all abandoned 
streams open until customers complain".

The problem may manifest when:
# Block1 is committed but seemingly never committed
# Block2 is allocated
# Lease recovery is initiated for block2
# Commit block synchronization invokes {{FSNamesystem#closeFileCommitBlocks}}, 
causing:
#* {{commitOrCompleteLastBlock}} to mark block2 as complete
#* {{finalizeINodeFileUnderConstruction}}/{{INodeFile.assertAllBlocksComplete}} 
to throw {{IllegalStateException}} because the penultimate block1 is "COMMITTED 
but not COMPLETE"
# The next lease recovery results in an infinite loop.

The {{LeaseManager}} expects that {{FSNamesystem#internalReleaseLease}} will 
either initiate recovery and renew the lease, or remove the lease.  In the 
described state it does neither: the switch case simply breaks out if the last 
block is complete.  (Ironically, the case statement contains an assert.)  Since 
nothing changed, the lease is still the “next” lease to be processed.  The 
lease monitor loops for 25ms on the same lease, sleeps for 2s, then loops on it 
again.


> Lease monitor may infinitely loop on the same lease
> ---
>
> Key: HDFS-12747
> URL: https://issues.apache.org/jira/browse/HDFS-12747
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.8.0
> Reporter: Daryn Sharp
> Priority: Critical



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
