On 1/25/10 2:20 AM, MOHAMMED IRFANULLA S wrote:
> Hi,
>
> I'm using hadoop 0.20.1. I would appreciate any help on the following
> issue in HDFS.
>
> User1 has created a file file1.txt and started writing to this
> file (Writer thread).
> User2 and User3 try to read from this file, but cannot read anything
> until at least one of the blocks is complete, and they cannot read any
> block under development. (Reader threads)
>
> Is it possible to block/prevent User2 and User3 from reading
> file1.txt completely until the Writer thread calls close()?
> If possible, how to achieve it?

Mohammed:

This is the documented behavior of HDFS with regard to data visibility.
Currently, there is no way to prevent block access from users 2 and 3 in
your scenario until user 1 finishes writing; you'd have to implement it
at a layer higher up than HDFS.

In theory, one can force a sync by calling FSDataOutputStream#sync(),
but I think that's still buggy (slated for fix in 0.21.x? - see
HDFS-200[1]). This would trade performance for visibility. I think the
alternative of forcing the readers to block until the writer triggers
some event (after the file is completely written) is a better plan,
though.

Hope this helps.

[1] - https://issues.apache.org/jira/browse/HDFS-200

--
Eric Sammer
e...@lifeless.net
http://esammer.blogspot.com
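
To illustrate the "readers block until the writer signals" idea, here is a minimal sketch. It is not HDFS code: it assumes the writer and the readers are threads in the same JVM, and it uses a local file via java.nio as a stand-in for file1.txt so that it runs without a cluster. The class name BlockReadersUntilClose and the helper writeThenRead are hypothetical names made up for this example. The event is a CountDownLatch that the writer releases only after close() has returned.

```java
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockReadersUntilClose {

    // Hypothetical helper: starts readerCount reader threads and one writer
    // thread, and returns what each reader saw. The local file stands in for
    // an HDFS file; this is an assumption made to keep the sketch runnable.
    public static List<String> writeThenRead(Path file, String payload, int readerCount)
            throws Exception {
        // The "event": released by the writer only after close() has returned.
        CountDownLatch closed = new CountDownLatch(1);
        // One thread per reader plus one for the writer, so that blocked
        // readers can never starve the writer of a thread.
        ExecutorService pool = Executors.newFixedThreadPool(readerCount + 1);
        try {
            Callable<String> readerTask = () -> {
                closed.await(); // block until the writer signals completion
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            };
            List<Future<String>> readers = new ArrayList<>();
            for (int i = 0; i < readerCount; i++) {
                readers.add(pool.submit(readerTask));
            }

            Callable<Void> writerTask = () -> {
                // try-with-resources calls close() when the block exits.
                try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                    out.write(payload);
                }
                closed.countDown(); // reached only if the write and close() succeeded
                return null;
            };
            // If the writer fails, get() throws and the finally block below
            // interrupts the readers that are still waiting on the latch.
            pool.submit(writerTask).get();

            List<String> seen = new ArrayList<>();
            for (Future<String> reader : readers) {
                seen.add(reader.get());
            }
            return seen;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("file1", ".txt");
        System.out.println(writeThenRead(file, "complete contents", 2));
        Files.delete(file);
    }
}
```

A latch only coordinates threads inside one JVM. If the users are separate processes, the same pattern would need some signal they can all observe, which this sketch does not cover.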