Hi,

I am a bit confused. What is the difference if I use "hadoop distcp" to 
upload files instead? I assume "hadoop distcp" uses multiple task trackers 
to upload the files in parallel.
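
(For reference, I mean an invocation like the one below; the paths are 
made-up examples:)

    hadoop distcp file:///data/incoming hdfs://namenode:9000/data/incoming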

Thanks,

Rui

----- Original Message ----
From: Ted Dunning <[EMAIL PROTECTED]>
To: [email protected]
Sent: Thursday, December 20, 2007 6:01:50 PM
Subject: Re: DFS Block Allocation

On 12/20/07 5:52 PM, "C G" <[EMAIL PROTECTED]> wrote:

>   Ted, when you say "copy in the distro" do you need to include the
> configuration files from the running grid?  You don't need to actually
> start HDFS on this node, do you?

You are correct.  You only need the config files (and the hadoop script
helps make things easier).
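
For example, something like this should be all the setup the box needs 
(the release version, host, and paths here are invented for illustration):

    # unpack the same Hadoop release the grid runs; no daemons get started
    tar xzf hadoop-0.15.1.tar.gz
    cd hadoop-0.15.1
    # pull the config from a node that is already part of the grid
    scp grid-master:/opt/hadoop/conf/hadoop-site.xml conf/
    # sanity check: the client should now see the running DFS
    bin/hadoop dfs -ls /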

>   If I'm following this approach correctly, I would want to have an "xfer
> server" whose job it is to essentially run dfs -copyFromLocal on all
> inbound-to-HDFS data. Once I'm certain that my data has copied correctly,
> I can delete the local files on the xfer server.

Yes.
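
Roughly like this, untested and with invented paths (and you probably want 
a stronger integrity check than the exit status, e.g. comparing file sizes):

    # copy each inbound file into DFS; remove the local copy only if
    # copyFromLocal exits cleanly
    for f in /incoming/*; do
      bin/hadoop dfs -copyFromLocal "$f" /data/inbound/ && rm "$f"
    done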

>   This is great news, as my current system wastes a lot of time copying
> data from data acquisition servers to the master node. If I can copy to
> HDFS directly from my acquisition servers then I am a happy guy....

You are a happy guy.

That is, if your acquisition systems can see all of your datanodes.
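
A quick way to check that from one of the acquisition boxes (50010 is the 
default datanode transfer port; check your grid's config if it differs):

    telnet datanode-01 50010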
