Steve Loughran wrote:
[EMAIL PROTECTED] wrote:
Thanks for the replies. So it looks like replication might be the real
overhead when compared to scp.

Makes sense, but there's no reason why the first node you copy the data up to couldn't carry on and pass that data to the other nodes.

Replication cannot account for a 50% slowdown. When the data is written, the writes to the replicas are pipelined, so essentially the data is written to the replicas in parallel.

Raghu.
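
To put rough numbers on the pipelining argument above, here is a minimal back-of-the-envelope sketch. It is a toy timing model, not HDFS code: the function names, the 64 KB packet size, the 100 MB/s link speed and the assumption that every hop has the same bandwidth are all illustrative assumptions, and disk, checksum and acknowledgement costs are ignored.

```python
import math


def sequential_write_time(size_mb, bandwidth_mb_s, replicas):
    """Hypothetical baseline: the client sends one full copy to each
    replica, one after another."""
    return replicas * size_mb / bandwidth_mb_s


def pipelined_write_time(size_mb, bandwidth_mb_s, replicas, packet_mb=0.0625):
    """Toy pipeline model: the data is cut into packets, and each node
    forwards a packet downstream while it receives the next one.

    packet_mb (64 KB here) is an assumed value for illustration.
    """
    packet_time = packet_mb / bandwidth_mb_s
    packets = math.ceil(size_mb / packet_mb)
    # The last packet leaves the client after `packets` time slots and
    # then needs (replicas - 1) more hops to reach the final replica.
    return (packets + replicas - 1) * packet_time


if __name__ == "__main__":
    size_mb, bandwidth = 64, 100.0  # one 64 MB block over an assumed 100 MB/s link
    print("single copy :", size_mb / bandwidth)
    print("sequential  :", sequential_write_time(size_mb, bandwidth, 3))
    print("pipelined   :", pipelined_write_time(size_mb, bandwidth, 3))
```

Under these assumptions the pipelined write to three replicas takes only a fraction of a percent longer than a single copy, while three back-to-back copies take three times as long. That is consistent with the point that replication alone should not explain a 50% slowdown.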

If it's in the same rack, you save on backbone bandwidth, and if it is in a different rack, well, the client operation still finishes faster. A feature for someone to implement, perhaps?


Also, dfs put writes multiple replicas, unlike scp.

Lohit

On Sep 17, 2008, at 6:03 AM, "明" <[EMAIL PROTECTED]> wrote:

Actually, no.
As you said, "dfs -put" breaks the data into blocks and then copies them to
the datanodes, but scp does not break the data into blocks; it just copies
the data to the namenode!


2008/9/17, Prasad Pingali <[EMAIL PROTECTED]>:

Hello,
 I observe that scp of data to the namenode is faster than actually putting
it into dfs (all nodes are on the same switch and have the same ethernet
cards; the nodes are homogeneous). I understand that "dfs -put" breaks the
data into blocks and then copies them to datanodes, but shouldn't that be at
least as fast as copying data to the namenode from a single machine, if not
faster?

thanks and regards,
Prasad Pingali,
IIIT Hyderabad.
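
For readers following the thread, the block splitting mentioned in the question can be illustrated with a small sketch. This is only an illustration of the arithmetic, not what "dfs -put" actually executes: the function name is hypothetical, and the 64 MB block size is an assumed setting for the example.

```python
def split_into_blocks(file_size, block_size=64 * 1024 * 1024):
    """Return (offset, length) pairs that cover file_size bytes.

    block_size defaults to 64 MB, an assumed value for this example.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    return [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]


if __name__ == "__main__":
    mb = 1024 * 1024
    # A 150 MB file needs two full blocks plus one partial block.
    for offset, length in split_into_blocks(150 * mb):
        print(offset // mb, "MB ->", length // mb, "MB")
```

Each block can be sent to a different set of datanodes, whereas scp streams the whole file to a single destination host.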





--
Sorry for my English!!  明
Please help me to correct my English expression and syntax errors








