Hi Friso,

Thank you very much for your answer. I guess I will assume that it's atomic, like you did, at least for now.
Again thank you,
JP

On Thu, Aug 26, 2010 at 8:51 AM, Friso van Vollenhoven <fvanvollenho...@xebia.com> wrote:

> Hi JP,
>
> I don't actually know the answer to your question, but we do a lot of
> things using files and directories on HDFS and use renames to move files out
> of directories which are periodically scanned by other processes. All I can
> say: it has never gone wrong. We are happily living with the assumption
> that the rename is atomic. Our directory scanning job runs every couple of
> seconds and has done so without any error for months.
>
> Short answer: I don't know, but it appears to be that way (ignorance is a
> blessing).
>
> Friso
>
> On 25 aug 2010, at 02:21, Jean-Pierre OCALAN wrote:
>
> Hi,
>
> I would like to know if the *rename* operation (i.e. renaming a directory
> or a single file) can be considered an atomic operation in HDFS.
>
> Basically, what I am trying to achieve is having one process that
> continuously adds new files into HDFS and another process that starts,
> every 15 minutes, a map/reduce flow on the files that were newly added to
> HDFS.
>
> In other words, a process A continuously reads a *local directory "A/in"*,
> into which new files are continuously moved, and puts each file in an
> *"A/tmp" directory on HDFS*. When A finishes putting one file in *"A/tmp"*,
> it will *move/rename that file into a "B/in" directory*. At the same
> time, a process B will, every 15 minutes, push all the files present in
> "B/in" to a map/reduce flow.
>
> Regards,
>
> -- JP

-- 
jean-pierre ocalan
jpoca...@gmail.com
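
For anyone following the thread, here is a rough sketch of the stage-then-rename pattern described above. It is only an illustration under some loud assumptions: it runs against the local filesystem (where os.rename within one filesystem is atomic on POSIX) rather than HDFS, and the names publish, scan, demo, A_in, A_tmp and B_in are made up for the example. On HDFS the corresponding step would be a rename through the Hadoop client (for example FileSystem.rename(Path, Path) in the Java API); whether that rename can be treated as atomic is exactly the assumption discussed in this thread.

```python
import os
import tempfile


def publish(local_path, tmp_dir, in_dir):
    """Copy a file into tmp_dir, then rename it into in_dir in one step."""
    name = os.path.basename(local_path)
    staged = os.path.join(tmp_dir, name)
    final = os.path.join(in_dir, name)
    # Slow part: the copy happens in a directory the consumer never scans.
    with open(local_path, "rb") as src, open(staged, "wb") as dst:
        dst.write(src.read())
    # Fast part: a rename within one filesystem is atomic on POSIX, so the
    # consumer sees either no file or the complete file, never a partial one.
    os.rename(staged, final)
    return final


def scan(directory):
    """List what a periodic job (process B) would pick up on its next run."""
    return sorted(os.listdir(directory))


def demo():
    # Hypothetical layout standing in for A/in (local), A/tmp and B/in.
    root = tempfile.mkdtemp()
    local_in = os.path.join(root, "A_in")
    tmp_dir = os.path.join(root, "A_tmp")
    in_dir = os.path.join(root, "B_in")
    for d in (local_in, tmp_dir, in_dir):
        os.makedirs(d)
    src = os.path.join(local_in, "events.log")
    with open(src, "w") as f:
        f.write("line 1\nline 2\n")
    publish(src, tmp_dir, in_dir)
    return scan(in_dir), scan(tmp_dir)
```

The point of the design is that process B only ever lists the "in" directory, so a file that is still being written stays invisible to it until the rename happens. Keeping the staging directory and the target directory on the same filesystem matters, since a move across filesystems usually turns into a copy and loses that property.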