So let's consider a case where I copy the file from local to a temporary HDFS 
directory and then, once the copy completes, move it to some Input dir. The 
move takes a fraction of a second, but let's assume my job is running on that 
Input folder at the moment the file is being moved, and it tries to access the 
half-moved file.

Now what happens? Does HDFS throw an IOException, or will it leave the file 
unprocessed until the next job runs?

-----Original Message-----
From: Harsh J [mailto:ha...@cloudera.com] 
Sent: Tuesday, May 01, 2012 6:11 PM
To: hdfs-user@hadoop.apache.org
Subject: Re: File Integrity in HDFS

Yes, renames/moves are merely metadata changes, like on your local filesystem 
(unless you move across partitions/disks, a concept that wouldn't apply to a 
DFS).
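
A minimal sketch of the stage-then-rename approach discussed in this thread. 
It uses Python and the local filesystem as a stand-in for HDFS, which is an 
assumption made purely for illustration; against a real cluster the same steps 
would go through an HDFS client instead. The names stage_then_publish and 
list_ready, and the ".tmp" suffix, are hypothetical and do not come from 
Hadoop.

```python
import os
import tempfile

# Hypothetical marker for files that are still being written.
TMP_SUFFIX = ".tmp"


def stage_then_publish(data, staging_dir, input_dir, name):
    """Write the whole file into staging_dir, then rename it into input_dir.

    Assumes both directories live on the same filesystem, so the rename is
    a metadata-only operation rather than a copy.
    """
    staged = os.path.join(staging_dir, name)
    with open(staged, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before publishing

    final = os.path.join(input_dir, name)
    os.rename(staged, final)  # the file appears in input_dir only when complete
    return final


def list_ready(input_dir):
    """Return input files a job may pick up, skipping in-progress names."""
    return sorted(n for n in os.listdir(input_dir) if not n.endswith(TMP_SUFFIX))


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    staging = os.path.join(root, "staging")
    inputs = os.path.join(root, "input")
    os.mkdir(staging)
    os.mkdir(inputs)

    stage_then_publish(b"some records\n", staging, inputs, "part-0001")
    print(list_ready(inputs))
```

Because the slow part (the copy) happens entirely in the staging directory, a 
job that lists the input directory should only ever find files that were 
already closed before the rename. The suffix filter covers the other option 
mentioned in the thread: copying into the input directory under a temporary 
name and renaming to a pick-up-able name afterwards.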

On Tue, May 1, 2012 at 5:53 PM, Stuti Awasthi <stutiawas...@hcl.com> wrote:
> Thanks Harsh,
> I also noticed that copying from local to HDFS, or from HDFS to HDFS, takes 
> considerable time depending on file size, but a move within HDFS completes 
> instantly.
> So internally, does HDFS just rename the file and update its metadata?
>
> -----Original Message-----
> From: Harsh J [mailto:ha...@cloudera.com]
> Sent: Tuesday, May 01, 2012 5:22 PM
> To: hdfs-user@hadoop.apache.org
> Subject: Re: File Integrity in HDFS
>
> The easiest way out would be to rename files to a pick-up-able name upon 
> successful copy, or to load into a different directory and rename/move the 
> file into the job input directory once it has been successfully closed.
>
> On Tue, May 1, 2012 at 3:22 PM, Stuti Awasthi <stutiawas...@hcl.com> wrote:
>> Hi All,
>>
>> I have a scenario in which input files are copied to HDFS and MR jobs 
>> run on the input directory.
>>
>> It can happen that a file is still being copied to HDFS when an MR job 
>> starts. In that case I do not want the job to pick up files whose copy 
>> is not yet complete.
>>
>> Is there any way/API to find out whether a file has been completely 
>> written to HDFS?
>>
>> Regards,
>>
>> Stuti Awasthi
>>
>> HCL Comnet Systems and Services Ltd
>>
>> F-8/9 Basement, Sec-3,Noida.
>
>
>
> --
> Harsh J



--
Harsh J
