Yes, the client was a namenode and also a datanode.

Thanks Raghu, I will try without running the datanode on it.

- Prasad.

On Thursday 18 September 2008 12:00:30 am Raghu Angadi wrote:
> pvvpr wrote:
> > The time seemed to be around double the time taken to scp. Didn't realize
> > it could be due to replication.
>
> Twice as slow is not expected. One possibility is that your client is also
> one of the datanodes (i.e. you are reading from and writing to the same
> disk).
>
> Raghu.
>
> > Regarding dfs being faster than scp, the statement came more out of
> > expectation (or wish list) than anything else. Since scp is the
> > most elementary way of copying files, I was wondering whether the network
> > topology of the cluster can be exploited in any way. My only intuition
> > was that there may be approaches faster than scp if concepts
> > from P2P file sharing are used here. Though I didn't fully explore P2P, I
> > thought there may be some new developments in that area which may be
> > useful here? After Napster's centralized way of copying, I think there
> > were quite a few improvements? Just thinking out loud.
> >
> > - Prasad.
> >
> >> How much slower is 'dfs -put' anyway? How large is the file you are
> >> copying?
> >>
> >>  >  but shouldn't that
> >>  > be at least as fast as copying data to namenode from a single machine,
> >>
> >> It would be "at most" as fast as scp, assuming you are not CPU bound. Why
> >> would you think dfs would be faster even if it is copying to a single replica?
> >>
> >> Raghu.
> >>
> >> Dennis Kubes wrote:
> >>> While an scp will copy data to the namenode machine, it does *not*
> >>> store the data in dfs; it simply copies the data to the namenode machine.
> >>> This is the same as copying data to any other machine.  The data isn't
> >>> in DFS and is not accessible from DFS.  If the box running the namenode
> >>> fails you lose your data.
> >>>
> >>> The reason put is slower is that the data is actually being stored into
> >>> the DFS on multiple machines in block format.  It is then accessible
> >>> from programs accessing the DFS such as MR jobs.
> >>>
> >>> Dennis
> >>>
> >>> Prasad Pingali wrote:
> >>>> Hello,
> >>>>    I observe that scp of data to the namenode is faster than actually
> >>>> putting it into dfs (all nodes are on the same switch and have the same
> >>>> ethernet cards, homogeneous nodes). I understand that "dfs -put" breaks
> >>>> the data into blocks and then copies them to datanodes, but shouldn't that
> >>>> be at least as fast as copying data to the namenode from a single machine,
> >>>> if not faster?
> >>>>
> >>>> thanks and regards,
> >>>> Prasad Pingali,
> >>>> IIIT Hyderabad.
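
A rough back-of-the-envelope sketch of Raghu's hypothesis in the thread above: a client that is also a datanode ends up reading from and writing to the same disk. The function name, the crude halving of disk bandwidth, and all the numbers are illustrative assumptions, not measurements or Hadoop internals taken from this thread.

```python
def transfer_seconds(size_mb, net_mb_s, disk_mb_s, local_replica=False):
    """Estimate copy time as size divided by the slowest stage's bandwidth.

    local_replica=True models a client that is also a datanode: the source
    disk is read and written at the same time, so we assume (crudely) that
    each stream gets half of the disk bandwidth.
    """
    effective_disk = disk_mb_s / 2 if local_replica else disk_mb_s
    return size_mb / min(net_mb_s, effective_disk)


# Hypothetical setup: 1 GB file, 100 MB/s network, 60 MB/s disk.
scp_time = transfer_seconds(1024, 100, 60)
put_time = transfer_seconds(1024, 100, 60, local_replica=True)
print(f"scp ~{scp_time:.1f}s, dfs -put from a datanode ~{put_time:.1f}s")
```

Under these assumptions the disk is the bottleneck in both cases, so halving its usable bandwidth roughly doubles the copy time, which is in line with the "around double" observation. If the network were the bottleneck instead, the same model would predict little or no slowdown.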




