[
https://issues.apache.org/jira/browse/HDFS-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13669636#comment-13669636
]
Kihwal Lee commented on HDFS-4504:
----------------------------------
If the datanodes didn't get the last packet of the last block, or they died
before reporting the block's completion to the NN, completeFile() may never
succeed. I can think of several ways to handle this:
* Delete the incomplete file, which also removes the lease. This violates the
data durability semantics, so it's not feasible to do it in DFSClient. Apps may
still choose to do this if close() throws an exception.
* Introduce a new ClientProtocol method, releaseLease(), which triggers
immediate block recovery if necessary. This is an incompatible change, so less
desirable.
* Extend complete() by adding an optional boolean arg, "force". This keeps
things compatible. If a new client is talking to an old NN, the file may not
get completed right away, but this is no worse than the current behavior. The
client (lease renewer) can keep retrying periodically, probably less often than
it renews the lease. We may want to allow this only when lastBlock is present,
since the acked block length will reduce the risk of truncating valid data.
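
To make the third option concrete, here is a rough sketch of the client-side
fallback. Everything in it is hypothetical: NameNode is a stand-in interface,
not the real ClientProtocol, and the four-argument complete() with a "force"
flag does not exist today. The attempt counts stand in for the periodic retry
in the lease renewer, so there is no sleeping or scheduling here.

```java
/** Hypothetical sketch; none of these types are real HDFS classes. */
final class ForceCompleteSketch {

    /** Stand-in for the NN side of the proposed call (assumed signature). */
    interface NameNode {
        // Returns true once the file is complete and the lease is released.
        boolean complete(String src, String clientName,
                         Long lastBlockLength, boolean force);
    }

    /**
     * Tries a normal complete() first, then falls back to force=true,
     * but only when the client knows the acked length of the last block.
     */
    static boolean completeWithFallback(NameNode nn, String src,
                                        String clientName, Long lastBlockLength,
                                        int normalAttempts, int forcedAttempts) {
        for (int i = 0; i < normalAttempts; i++) {
            if (nn.complete(src, clientName, lastBlockLength, false)) {
                return true;
            }
        }
        if (lastBlockLength == null) {
            // No acked length to bound recovery, so never force.
            return false;
        }
        for (int i = 0; i < forcedAttempts; i++) {
            if (nn.complete(src, clientName, lastBlockLength, true)) {
                return true;
            }
        }
        return false;
    }
}
```

An old NN would simply ignore the flag, so in this sketch it behaves like a
NameNode whose complete() keeps returning false: the call fails the same way
it does now, and the caller can try again on the next cycle.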
> DFSOutputStream#close doesn't always release resources (such as leases)
> -----------------------------------------------------------------------
>
> Key: HDFS-4504
> URL: https://issues.apache.org/jira/browse/HDFS-4504
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-4504.001.patch, HDFS-4504.002.patch
>
>
> {{DFSOutputStream#close}} can throw an {{IOException}} in some cases. One
> example is if there is a pipeline error and then pipeline recovery fails.
> Unfortunately, in this case, some of the resources used by the
> {{DFSOutputStream}} are leaked. One particularly important leaked resource is
> the file lease.
> So it's possible for a long-lived HDFS client, such as Flume, to write many
> blocks to a file, but then fail to close it. Unfortunately, the
> {{LeaseRenewerThread}} inside the client will continue to renew the lease for
> the "undead" file. Future attempts to close the file will just rethrow the
> previous exception, and no progress can be made by the client.