hadoop 0.20 append - some clarifications

Gokulakannan M Thu, 10 Feb 2011 07:12:09 -0800

Hi All,

I have been running the Hadoop 0.20 append branch. Can someone please clarify the
following behavior?


Suppose a writer is writing a file but has neither flushed the data nor closed the
file. Can a concurrent reader read this partial file?

For example,

1. A writer is writing a 10 MB file (block size 2 MB).

2. It has written the file up to 5 MB (2 finalized blocks + 1 blockBeingWritten). Note
that the writer is not calling FSDataOutputStream.sync() at all.

3. Now a reader tries to read the above partially written file.
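
To make the block arithmetic in step 2 concrete, here is a minimal sketch in plain Java. It makes no Hadoop API calls; the class and method names are hypothetical and exist only for illustration. It assumes 1 MB = 1024 * 1024 bytes and that a block counts as finalized once it has been filled completely.

```java
// Hypothetical helper, not part of Hadoop: it only reproduces the
// "2 finalized blocks + 1 blockBeingWritten" arithmetic from the example.
class BlockLayout {
    // Assumption: 1 MB is 1024 * 1024 bytes.
    static final long MB = 1024L * 1024L;

    /** Number of completely filled blocks for the bytes written so far. */
    static long finalizedBlocks(long bytesWritten, long blockSize) {
        return bytesWritten / blockSize;
    }

    /** Bytes in the last, still-open block (0 if the write ended on a block boundary). */
    static long bytesInOpenBlock(long bytesWritten, long blockSize) {
        return bytesWritten % blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 2 * MB; // block size from the example
        long written = 5 * MB;   // data written so far, no sync() and no close()

        System.out.println("finalized blocks: " + finalizedBlocks(written, blockSize));
        System.out.println("bytes in open block: " + bytesInOpenBlock(written, blockSize));
    }
}
```

With the numbers above, the first two blocks (4 MB) are full and the remaining 1 MB sits in the block that is still being written. That last 1 MB is the part whose visibility to the reader is in question.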

I can see that the reader is able to read the partially written 5 MB of data,
but I expected the reader to see the data only after the writer calls the
sync() API.

Is this the correct behavior, or is my understanding wrong?

 

 Thanks,

  Gokul
