[ 
https://issues.apache.org/jira/browse/HADOOP-89?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12528925
 ] 

Konstantin Shvachko commented on HADOOP-89:
-------------------------------------------

Dhruba convinced me that it is worth committing the patch with the tail 
functionality in it.
There are 2 types of failures that lead to the loss of an entire file's data:
# the name-node failure, and 
# the client failure.

If the name-node dies, the length of an incomplete file will be set to 0, which 
corresponds to the current behavior, where we just lose the entire file.
If the client dies, the name-node automatically closes all files created by the 
client once it detects that the client's lease has expired.
Client failure is the most common case, and the code provides protection 
against losing data in that case.
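
To make the two cases concrete, here is a rough sketch of the behavior described 
above. It is a toy model and not the code in the patch: the class LeaseSketch, 
its methods, and the millisecond timestamps are all hypothetical names chosen 
for illustration, and the real name-node bookkeeping is more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of the two failure cases; NOT the actual HDFS code.
class LeaseSketch {
    // One file under construction: its writer and the length known so far.
    static final class OpenFile {
        final String client;
        long visibleLength;
        boolean closed;
        OpenFile(String client) { this.client = client; }
    }

    private final long leaseTimeoutMs;
    private final Map<String, OpenFile> files = new HashMap<>();
    private final Map<String, Long> lastRenewal = new HashMap<>();

    LeaseSketch(long leaseTimeoutMs) { this.leaseTimeoutMs = leaseTimeoutMs; }

    // A client creates a file and thereby takes (or renews) its lease.
    void create(String path, String client, long nowMs) {
        files.put(path, new OpenFile(client));
        lastRenewal.put(client, nowMs);
    }

    // The client reports written bytes; this also counts as a lease renewal.
    void reportWritten(String path, long bytes, long nowMs) {
        OpenFile f = files.get(path);
        f.visibleLength += bytes;
        lastRenewal.put(f.client, nowMs);
    }

    // Client failure: once a lease has expired, close that client's files.
    // The data written so far is kept.
    void checkLeases(long nowMs) {
        for (OpenFile f : files.values()) {
            Long renewed = lastRenewal.get(f.client);
            if (!f.closed && renewed != null && nowMs - renewed > leaseTimeoutMs) {
                f.closed = true;
            }
        }
    }

    // Name-node failure: incomplete files come back with length 0,
    // which matches the old behavior of losing the whole file.
    void restartNameNode() {
        for (OpenFile f : files.values()) {
            if (!f.closed) {
                f.visibleLength = 0;
            }
        }
        lastRenewal.clear();
    }

    long length(String path) { return files.get(path).visibleLength; }

    boolean isClosed(String path) { return files.get(path).closed; }
}
```

In this model, a file whose writer stops renewing its lease is closed by 
checkLeases() with its length intact, while a file that is still open when 
restartNameNode() runs ends up with length 0.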

> files are not visible until they are closed
> -------------------------------------------
>
>                 Key: HADOOP-89
>                 URL: https://issues.apache.org/jira/browse/HADOOP-89
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.1.0
>            Reporter: Yoram Arnon
>            Assignee: dhruba borthakur
>            Priority: Critical
>             Fix For: 0.15.0
>
>         Attachments: tail.patch, tail3.patch, tail4.patch
>
>
> the current behaviour, whereby a file is not visible until it is closed, has 
> several flaws, including:
> 1. no practical way to know if a file/job is progressing
> 2. no way to implement files that never close, such as log files
> 3. failure to close a file results in loss of the file
> The part of the file that's written should be visible.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
