Cool - btw, it might be easier to mark files that are still uploading with a .tmp or .uploading extension instead of putting them in a temp folder. It's the usual approach... You can check out how Firefox handles downloads if you want to cover all the corner cases.
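
A minimal sketch of the suffix-plus-rename approach, assuming the `hadoop fs` command-line client is on the PATH. The `.uploading` suffix, the function names and the injectable `run` parameter are illustrative and not part of Hadoop. It also assumes, as this thread does, that the rename is the step readers can rely on.

```python
import subprocess

# Hypothetical marker for in-progress uploads; readers must skip such names.
TMP_SUFFIX = ".uploading"


def upload_then_rename(local_path, final_path, run=subprocess.check_call):
    """Copy a local file to HDFS under a temporary name, then rename it."""
    tmp_path = final_path + TMP_SUFFIX
    # Step 1: upload under the temporary name; a partial file may be visible here.
    run(["hadoop", "fs", "-copyFromLocal", local_path, tmp_path])
    # Step 2: rename to the final name only after the upload has finished.
    run(["hadoop", "fs", "-mv", tmp_path, final_path])


def is_complete(path):
    """Reader-side filter: ignore files still carrying the upload suffix."""
    return not path.endswith(TMP_SUFFIX)
```

For example, `upload_then_rename("data.csv", "/incoming/data.csv")` uploads to `/incoming/data.csv.uploading` first. If the copy fails, `subprocess.check_call` raises and the rename is skipped, so a leftover `.uploading` file is the sign of an interrupted upload that needs cleaning up.
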

Take care,
-stu

-----Original Message-----
From: Ishaaq Chandy <ish...@gmail.com>
Date: Wed, 2 Mar 2011 08:16:08
To: <hdfs-user@hadoop.apache.org>; <stu24m...@yahoo.com>
Reply-To: hdfs-user@hadoop.apache.org
Subject: Re: atomicity of copyFromLocal

Thanks Stu,
That is what I suspected but was hoping was not the case. The rename fix is simple enough, even if a little ugly.

Regards,
Ishaaq

On 1 March 2011 15:51, <stu24m...@yahoo.com> wrote:

> Pretty sure it's not atomic. I can read files I write via Thrift well
> before they're done.
> Rename has always worked for me...
>
> Take care,
> -stu
> ------------------------------
> From: Ishaaq Chandy <ish...@gmail.com>
> Date: Tue, 1 Mar 2011 15:22:24 +1100
> To: <hdfs-user@hadoop.apache.org>
> Reply-To: hdfs-user@hadoop.apache.org
> Subject: atomicity of copyFromLocal
>
> Hi all,
> How "atomic" is the copyFromLocal call? i.e. if one process is in the midst of
> uploading a file to HDFS, is it possible for another process to start reading
> it before the upload is complete?
>
> I am currently safeguarding my code from this possibility by uploading it
> to a temporary directory and then renaming it to its final destination (the
> assumption being that a rename is "more atomic" than copyFromLocal), but I'd
> like to avoid doing this in two steps if possible.
>
> Regards,
> Ishaaq