[
https://issues.apache.org/jira/browse/HADOOP-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667952#action_12667952
]
dhruba borthakur commented on HADOOP-4379:
------------------------------------------
@Doug: waiting for lease recovery to reclaim the length of the jar file
will work only if the original writer has died and a new reader now wants to
read all the data that the writer wrote before it died. This was the
use case for Jim Kellerman.
In your case, you have a FileSystem object that was used to write to the file.
Your application then tried to reopen the same file by invoking fs.append() on
the same FileSystem object. The namenode logs clearly show that the same
dfsclient (i.e. FileSystem object) was used to try appending to the file. In
this case, the namenode rejects the append call because it knows that the
original writer is alive and still writing to the file. If your aim is to have
a reader running concurrently with a writer, then the best that this patch
can do is to let the concurrent reader see the new file length only when
the block is full. On the other hand, if you can make your application not
depend on the file length, then you can see all the data in the file. Another
alternative would be to implement a new call, FileSystem.length(), that can
retrieve the latest length from the datanode, but that can be done as part of a
separate JIRA. Please let us know how difficult it is for you to change your
app to not depend on the file length and just read until end-of-file.
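To illustrate what "read till end-of-file" could look like on the reader side,
here is a minimal sketch. It is not code from the patch. It assumes the reader
holds a plain java.io.InputStream (in HDFS this would be the FSDataInputStream
returned by FileSystem.open(), which is a java.io.InputStream subclass) and it
uses a ByteArrayInputStream as a stand-in so the example runs without a
cluster. The class and method names (ReadToEof, readUntilEof) are made up for
this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: the name and shape are assumptions for illustration only.
final class ReadToEof {

    /**
     * Drains the stream until read() reports end-of-stream (-1).
     * The loop never asks for the file length, so it does not depend on
     * whatever length the namenode currently reports for the file.
     */
    static byte[] readUntilEof(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for data a writer has already synced; with HDFS the stream
        // would come from FileSystem.open(path) instead (assumption).
        byte[] written = "record-1\nrecord-2\n".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(written)) {
            byte[] seen = readUntilEof(in);
            System.out.println("read " + seen.length + " bytes");
        }
    }
}
```

The point of the sketch is only the loop condition: the reader stops when the
stream says there is no more data, not when a previously fetched length has
been reached.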
> In HDFS, sync() not yet guarantees data available to the new readers
> --------------------------------------------------------------------
>
> Key: HADOOP-4379
> URL: https://issues.apache.org/jira/browse/HADOOP-4379
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs
> Reporter: Tsz Wo (Nicholas), SZE
> Assignee: dhruba borthakur
> Fix For: 0.19.1
>
> Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt,
> fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch,
> hypertable-namenode.log.gz, Reader.java, Reader.java, Writer.java, Writer.java
>
>
> In the append design doc
> (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it
> says
> * A reader is guaranteed to be able to read data that was 'flushed' before
> the reader opened the file
> However, this feature is not yet implemented. Note that the operation
> 'flushed' is now called "sync".