Hi,

> The -update behavior is by design.

If I understand correctly, -update is meant to overwrite a file at the
destination if it is already there. In this case, however, it writes the
folder as a file at the destination, which seems to be a bug.

 

> 

> Could you provide the command line, and the directory structure before

> and after issuing the copy? -C

 

Cmd is: hadoop distcp -update 'hftp://<srchost>:50070/user/<user>/distcpsrc' distcp_dest

 

hadoop dfs -lsr distcpsrc          

/user/<user>/distcpsrc/1 <dir>           2008-07-24 05:53

/user/<user>/distcpsrc/1/t       <r 3>   4       2008-07-22 06:12

 

hadoop dfs -lsr  distcp_dest

/user/<user>/distcp_dest/1       <r 3>   4       2008-07-24 06:03
<< expected /user/<user>/distcp_dest/1/t; the file is copied as '1'
instead of '1/t'
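
For illustration only, here is a minimal Python sketch of the path mapping expected in this case: each source file should keep its path relative to the source root when placed under the destination root. This is not distcp's actual code; the function name expected_dest is hypothetical, and the paths are the placeholder paths from the listings above.

```python
import posixpath


def expected_dest(src_root, src_path, dest_root):
    """Map src_path to the destination, preserving its path relative to src_root.

    Hypothetical helper for illustration; not part of distcp.
    """
    rel = posixpath.relpath(src_path, src_root)
    if rel == ".." or rel.startswith("../"):
        raise ValueError("src_path is not under src_root")
    return posixpath.normpath(posixpath.join(dest_root, rel))


src_root = "/user/<user>/distcpsrc"
dest_root = "/user/<user>/distcp_dest"

# Where the file 't' inside folder '1' is expected to land.
expected = expected_dest(src_root, src_root + "/1/t", dest_root)

# Where the file actually landed when run with -update (from the listing).
observed = dest_root + "/1"

print("expected:", expected)
print("observed:", observed)
print("match:", expected == observed)
```

The observed path is one component short: the file name is dropped and the file takes the place of its parent folder.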

 

If I run without '-update', destination dir is:

hadoop dfs -lsr  distcp_dest_noupdate

/user/<user>/distcp_dest_noupdate/1      <dir>           2008-07-24 06:08
<< file 't' is not copied and '1' is a directory

 

Thanks,

Murali

 

> 

> On Jul 22, 2008, at 9:46 PM, Murali Krishna wrote:

> 

> > Hi,

> >   I am using 0.15.3 and the destination is empty. One more
> > behavior that I am seeing is that if I pass the '-update' option, it
> > writes the content of file '2' into '1' (making the folder '1' a
> > file in the destination). So it looks like it is treating the
> > destination for file distcpsrc/1/2 as distcpdest/1.

> >

> > Thanks,

> > Murali

> >

> >> -----Original Message-----

> >> From: Chris Douglas [mailto:[EMAIL PROTECTED]

> >> Sent: Wednesday, July 23, 2008 1:13 AM

> >> To: [email protected]

> >> Subject: Re: distcp skipping the file

> >>

> >> There were many fixes and improvements to distcp in 0.16, but most of
> >> the critical fixes made it into 0.15.2 and 0.15.3. Is the destination
> >> empty? Anything already existing at the destination is skipped. -C

> >>

> >> On Jul 22, 2008, at 4:39 AM, Murali Krishna wrote:

> >>

> >>> Hi,

> >>>

> >>> My source folder has a single folder and a single file inside that.

> >>>

> >>> /user/<user>/distcpsrc/1/2 <r 3>   4       2008-07-22 04:22

> >>>

> >>> In the destination, it is creating the folder '1' but not the file

> >>> '2'.

> >>>

> >>> The counters show 1 file has been skipped.

> >>>

> >>> 08/07/22 04:22:36 INFO mapred.JobClient:     Files skipped=1

> >>>

> >>>

> >>>

> >>> If I create one more file in any directory under the distcpsrc
> >>> folder, it copies both the files properly. Is this a bug?

> >>>

> >>> [I am using 15.3]

> >>>

> >>>

> >>>

> >>> Thanks,

> >>>

> >>> Murali

> >>>

> >

 
