[ https://issues.apache.org/jira/browse/HDFS-14694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chen Zhang updated HDFS-14694:
------------------------------
    Description: 
HDFS uses file leases to manage open files. When a file is not closed
normally, the NN recovers the lease automatically once the hard limit is
exceeded. But for a long-running service (e.g. HBase), the hdfs-client never
dies, so the NN never gets a chance to recover the file.

Usually the client program needs to handle exceptions itself to avoid this
condition (e.g. HBase automatically calls recoverLease for files that were not
closed normally), but in our experience most services (in our company) don't
handle this condition properly, which leaves many files in an abnormal state
or even causes data loss.

This Jira proposes adding a feature that calls recoverLease automatically
when DFSOutputStream close encounters an exception. It should be disabled by
default, but anyone who builds a long-running service on HDFS can enable it.

We added this feature to our internal Hadoop distribution more than 3 years
ago, and it has been quite useful in our experience.
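
The proposed behavior could look roughly like the sketch below. It is not the
attached patch: LeaseRecoverer, RecoveringCloser and the
recoverLeaseOnCloseException flag are hypothetical names used only for
illustration, and the String path stands in for an HDFS Path. In a real client
the recoverer role would be played by DistributedFileSystem.recoverLease(Path).

```java
import java.io.Closeable;
import java.io.IOException;

/**
 * Hypothetical stand-in for the NameNode call. In HDFS this role is played
 * by DistributedFileSystem.recoverLease(Path).
 */
interface LeaseRecoverer {
    boolean recoverLease(String path) throws IOException;
}

/** Hypothetical helper; not part of Hadoop. */
final class RecoveringCloser {
    private RecoveringCloser() {}

    /**
     * Closes the stream. If close() throws and the (assumed) option is
     * enabled, asks the recoverer to start lease recovery for the file,
     * then rethrows the original close exception to the caller.
     */
    static void close(Closeable out, String path, LeaseRecoverer recoverer,
                      boolean recoverLeaseOnCloseException) throws IOException {
        try {
            out.close();
        } catch (IOException closeFailure) {
            if (recoverLeaseOnCloseException) {
                try {
                    recoverer.recoverLease(path);
                } catch (IOException recoveryFailure) {
                    // Keep the close failure as the primary error.
                    closeFailure.addSuppressed(recoveryFailure);
                }
            }
            throw closeFailure;
        }
    }
}
```

The caller still sees the close exception, so existing error handling keeps
working. The only difference is that the lease is released without waiting for
the hard limit.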

  was:
HDFS uses file leases to manage open files. When a file is not closed
normally, the NN recovers the lease automatically once the hard limit is
exceeded. But for a long-running service (e.g. HBase), the hdfs-client never
dies, so the NN never gets a chance to recover the file.

Usually the client program needs to process exceptions itself to avoid this
condition (e.g. HBase automatically calls recoverLease for files that were not
closed normally), but in our experience most services (in our company) don't
process this condition properly, which leaves many files in an abnormal state
or even causes data loss.

This Jira proposes adding a feature that calls recoverLease automatically
when DFSOutputStream close encounters an exception. It should be disabled by
default, but anyone who builds a long-running service on HDFS can enable it.

We added this feature to our internal Hadoop distribution more than 3 years
ago, and it has been quite useful in our experience.


> Call recoverLease on DFSOutputStream close exception
> ----------------------------------------------------
>
>                 Key: HDFS-14694
>                 URL: https://issues.apache.org/jira/browse/HDFS-14694
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Chen Zhang
>            Assignee: Chen Zhang
>            Priority: Major
>         Attachments: HDFS-14694.001.patch
>



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
