[ 
https://issues.apache.org/jira/browse/HDFS-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836647#comment-13836647
 ] 

Kihwal Lee commented on HDFS-5558:
----------------------------------

bq. My understanding is that we can only get into this situation if there is 
another bug (such as HDFS-5557) causing an internal inconsistency.

Faulty or busy data nodes might delay the incremental block report for the 
penultimate block of a file, or crash before sending it. We have also seen a 
name node get overwhelmed with RPC calls and fall behind in processing 
incremental block reports. I think that was caused by a user setting a small 
block size and creating far too many blocks (before the min block size fix). 

The block and replica state updates are asynchronous, so we cannot say it won't 
happen. In fact, we even have the close retry logic for this reason. Since it 
should be rare, how about making it WARN?
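
To illustrate the suggestion, here is a minimal, self-contained sketch of 
"warn and retry later" instead of failing. It is not the actual FSNamesystem 
or LeaseManager code, nor the attached patch: the class LeaseReleaseSketch, 
the BlockState enum, and both methods are hypothetical names made up for this 
example, and a file is modeled simply as a list of block states.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical model, NOT real HDFS classes: a file is just a list of block states.
public class LeaseReleaseSketch {
    enum BlockState { COMPLETE, COMMITTED, UNDER_CONSTRUCTION }

    static final Logger LOG = Logger.getLogger("LeaseReleaseSketch");

    /** Returns the index of the first non-COMPLETE block before the last one, or -1. */
    static int firstIncompleteBeforeLast(List<BlockState> blocks) {
        for (int i = 0; i < blocks.size() - 1; i++) {
            if (blocks.get(i) != BlockState.COMPLETE) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns true if the lease could be released now. Returns false, after
     * logging at WARN level, if an earlier block is still not complete, so
     * the caller can retry later instead of crashing the monitor thread.
     */
    static boolean tryReleaseLease(String path, List<BlockState> blocks) {
        int idx = firstIncompleteBeforeLast(blocks);
        if (idx >= 0) {
            LOG.warning(String.format(
                "Cannot release lease on %s yet: block %d of %d is %s",
                path, idx, blocks.size(), blocks.get(idx)));
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Last block complete, penultimate block still waiting for its report.
        List<BlockState> stuck = Arrays.asList(
            BlockState.COMPLETE, BlockState.COMMITTED, BlockState.COMPLETE);
        System.out.println(tryReleaseLease("/example/file", stuck));
    }
}
```

The point of the sketch is only that the state "last block complete, earlier 
block not" is treated as an expected, if rare, condition that is reported and 
retried, rather than as an internal inconsistency.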

> LeaseManager monitor thread can crash if the last block is complete but 
> another block is not.
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5558
>                 URL: https://issues.apache.org/jira/browse/HDFS-5558
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.23.9, 2.3.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-5558.branch-023.patch, HDFS-5558.patch
>
>
> As mentioned in HDFS-5557, if a file has its last and penultimate block not 
> completed and the file is being closed, the last block may be completed but 
> the penultimate one might not. If this condition lasts long and the file is 
> abandoned, LeaseManager will try to recover the lease and the block. But 
> {{internalReleaseLease()}} will fail with an invalid cast exception for this 
> kind of file.



