[
https://issues.apache.org/jira/browse/HADOOP-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764758#action_12764758
]
Tsz Wo (Nicholas), SZE commented on HADOOP-6307:
------------------------------------------------
> Not sure why this issue only hits SequenceFile. The problem applies equally
> to TFile (although this was pushed to the caller).
This problem applies to any implementation that obtains the un-closed file length
by calling fs.getFileStatus(file).getLen(). (By "problem", I mean that the
reader may not see all hflushed bytes; it sees only a prefix of the file. This is
the same behavior as before append.) I did not check TFile before. TFile does
not have this problem if the caller manages to obtain the correct length and passes
it to the TFile.Reader constructor.
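The workaround described above can be sketched as follows. This is a minimal, hypothetical illustration, not committed code: it assumes the existing TFile.Reader(FSDataInputStream, long, Configuration) constructor, and the helper getHFlushedLength() stands in for whatever mechanism (e.g. the visible-length query discussed in HDFS-570) the caller uses to learn the true hflushed length from a datanode.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.file.tfile.TFile;

public class UnclosedTFileRead {
  /**
   * Open a TFile that may still be open for writing.  Instead of
   * trusting fs.getFileStatus(file).getLen() -- which can miss
   * hflushed bytes in the last, still-being-written block -- the
   * caller supplies the correct length to the TFile.Reader
   * constructor.
   */
  public static TFile.Reader openUnclosed(FileSystem fs, Path file,
      Configuration conf) throws IOException {
    FSDataInputStream in = fs.open(file);
    // Hypothetical helper: obtain the hflushed length, e.g. by
    // asking a datanode for the length of the last block (HDFS-570).
    long length = getHFlushedLength(in);
    return new TFile.Reader(in, length, conf);
  }

  private static long getHFlushedLength(FSDataInputStream in)
      throws IOException {
    // Placeholder: the mechanism for querying the visible length of
    // an un-closed file is what HDFS-570 proposes; it is not shown here.
    throw new UnsupportedOperationException("see HDFS-570");
  }
}
{code}

SequenceFile.Reader, by contrast, has no such constructor taking an explicit length, which is why it is affected while TFile pushes the problem to the caller.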
> Support reading on un-closed SequenceFile
> -----------------------------------------
>
> Key: HADOOP-6307
> URL: https://issues.apache.org/jira/browse/HADOOP-6307
> Project: Hadoop Common
> Issue Type: Improvement
> Components: io
> Reporter: Tsz Wo (Nicholas), SZE
>
> When a SequenceFile.Reader is constructed, it calls
> fs.getFileStatus(file).getLen(). However, fs.getFileStatus(file).getLen()
> does not return the hflushed length for an un-closed file, since the Namenode
> does not know the hflushed length. DFSClient has to ask a datanode for the
> length of the last block, which is being written; see also HDFS-570.