No such API, as far as I know. copyFromLocal is one such API (it copies a complete local file into HDFS), but that may not fit your scenario, I guess.
--Laxman

-----Original Message-----
From: Meghana [mailto:meghana.mara...@germinait.com]
Sent: Thursday, July 28, 2011 4:32 PM
To: hdfs-user@hadoop.apache.org; lakshman...@huawei.com
Cc: common-u...@hadoop.apache.org
Subject: Re: Reader/Writer problem in HDFS

Thanks Laxman! That would definitely help things. :)

Is there a better FileSystem (or other) method call to create a file in one
go (i.e. atomically, I guess?), without having to call create() and then
write to the stream?

..meghana

On 28 July 2011 16:12, Laxman <lakshman...@huawei.com> wrote:
> One approach can be to use a ".tmp" extension while writing. Once the
> write is completed, rename back to the original file name. Also, the
> reader has to filter out ".tmp" files.
>
> This will ensure the reader will not pick up partial files.
>
> We had a similar scenario where the above-mentioned approach resolved
> the issue.
>
> -----Original Message-----
> From: Meghana [mailto:meghana.mara...@germinait.com]
> Sent: Thursday, July 28, 2011 1:38 PM
> To: common-user; hdfs-user@hadoop.apache.org
> Subject: Reader/Writer problem in HDFS
>
> Hi,
>
> We have a job where the map tasks are given the path to an output
> folder. Each map task writes a single file to that folder. There is no
> reduce phase. There is another thread, which constantly looks for new
> files in the output folder. If found, it persists the contents to the
> index, and deletes the file.
>
> We use this code in the map task:
>
> OutputStream oStream = null;
> try {
>     oStream = fileSystem.create(path);
>     IOUtils.write("xyz", oStream);
> } finally {
>     IOUtils.closeQuietly(oStream);
> }
>
> The problem: Sometimes the reader thread sees & tries to read a file
> which is not yet fully written to HDFS (or the checksum is not written
> yet, etc.), and throws an error. Is it possible to write an HDFS file
> in such a way that it won't be visible until it is fully written?
>
> We use Hadoop 0.20.203.
>
> Thanks,
>
> Meghana
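For illustration, here is a minimal sketch of the ".tmp"-then-rename approach
described above. The class and method names (AtomicWriteSketch, writeAtomically,
listCompleted) and the string payload are assumptions for the example, not from
the thread; production code would need better error handling.

    import java.io.IOException;
    import java.io.OutputStream;
    import org.apache.commons.io.IOUtils;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.PathFilter;

    public class AtomicWriteSketch {

        // Write to a ".tmp" path first, then rename to the final name.
        // The file appears under its final name only after all data has
        // been written and the stream closed.
        static void writeAtomically(FileSystem fs, Path path, String content)
                throws IOException {
            Path tmp = new Path(path.getParent(), path.getName() + ".tmp");
            OutputStream out = null;
            try {
                out = fs.create(tmp);
                IOUtils.write(content, out);
            } finally {
                IOUtils.closeQuietly(out);
            }
            if (!fs.rename(tmp, path)) {
                throw new IOException("rename failed: " + tmp + " -> " + path);
            }
        }

        // The reader lists only completed files by skipping ".tmp" names.
        static FileStatus[] listCompleted(FileSystem fs, Path dir)
                throws IOException {
            return fs.listStatus(dir, new PathFilter() {
                public boolean accept(Path p) {
                    return !p.getName().endsWith(".tmp");
                }
            });
        }
    }

Since a rename within the HDFS namespace is atomic, the reader either sees the
".tmp" name (which it filters out) or the final, fully written file; it never
sees a partial file under the final name.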