On Thursday 18 September 2008 04:12:13 pm Steve Loughran wrote:
> [EMAIL PROTECTED] wrote:
> > Thanks for the replies. So it looks like replication might be the real
> > overhead when compared to scp.
>
> Makes sense, but there's no reason why you couldn't have the first node you
> copy the data up to continue and pass that data to the other nodes. If
> it's in the same rack, you save on backbone bandwidth, and if it is in a
> different rack, well, the client operation still finishes faster. A
> feature for someone to implement, perhaps?

Yeah, I was also wondering what the implications of such a feature would be
in terms of failures or block corruption at the first node. If that is a
non-issue, this seems to be something that could improve performance.
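
As a rough back-of-the-envelope sketch of why replication could dominate
(Python; it assumes one fixed link speed, a replication factor of 3, and
ignores disk, protocol, and pipeline-latency overhead. The function names
and numbers are made up for illustration and are not taken from Hadoop):

```python
def client_upload_seconds(size_bytes, link_bytes_per_s, copies_sent_by_client=1):
    """Time the client spends pushing data if it sends each copy itself."""
    return copies_sent_by_client * size_bytes / link_bytes_per_s


def total_network_bytes(size_bytes, replication):
    """Bytes crossing the network in total, regardless of who sends them."""
    return replication * size_bytes


size = 1_000_000_000   # 1 GB file (assumed)
link = 100_000_000     # 100 MB/s per link (assumed)
replication = 3        # assumed replication factor

# scp: the client sends a single copy.
scp_time = client_upload_seconds(size, link)

# Client sends every replica itself, one after another.
naive_put_time = client_upload_seconds(size, link, replication)

# First node forwards the data on, so the client only sends one copy.
forwarded_put_time = client_upload_seconds(size, link)

print(scp_time, naive_put_time, forwarded_put_time)
print(total_network_bytes(size, replication))
```

Under these assumptions the client is done as quickly as with scp when the
first node forwards the data, although the total traffic on the network is
still three times the file size.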

- Prasad.

>
> >> Also, dfs put writes multiple replicas, unlike scp.
> >>
> >> Lohit
> >>
> >> On Sep 17, 2008, at 6:03 AM, "明" <[EMAIL PROTECTED]> wrote:
> >>
> >> Actually, no.
> >> As you said, I understand that "dfs -put" breaks the data into blocks and
> >> then copies them to the datanodes,
> >> but scp does not break the data into blocks; it just copies the data
> >> to the namenode!
> >>
> >>
> >> 2008/9/17, Prasad Pingali <[EMAIL PROTECTED]>:
> >>
> >> Hello,
> >>  I observe that scp of data to the namenode is faster than actually
> >> putting it
> >> into dfs (all nodes are on the same switch and have the same ethernet
> >> cards, homogeneous nodes). I understand that "dfs -put" breaks the data
> >> into blocks
> >> and then copies them to the datanodes, but shouldn't that be at least as
> >> fast as copying data to the namenode from a single machine, if not faster?
> >>
> >> thanks and regards,
> >> Prasad Pingali,
> >> IIIT Hyderabad.
> >>
> >>
> >>
> >>
> >>
> >> --
> >> Sorry for my english!!  明
> >> Please help me to correct my english expression and error in syntax
