Harsh,

Thanks for the response, bud. Appreciate it!

Thanks,
Ranjith

On May 21, 2012, at 11:09 PM, Harsh J <ha...@cloudera.com> wrote:

> Ranjith,
> 
> MapReduce and HDFS are two different things. MapReduce uses HDFS (and
> can use any other FS as well) to do some efficient work, but HDFS does
> not use MapReduce.
> 
> A simple HDFS transfer is done directly over the network. Yes, it's just a
> block-by-block copy/write to/from the relevant DataNodes, done over
> network sockets at each end.
> 
> On Tue, May 22, 2012 at 8:58 AM, Ranjith <ranjith.raghuna...@gmail.com> wrote:
>> Thanks, Harsh. So when it connects directly to the DataNodes, it does not
>> fire off any mappers. So how does it get the data over? Is it just a
>> block-by-block copy?
>> 
>> Thanks,
>> Ranjith
>> 
>> On May 21, 2012, at 9:22 PM, Harsh J <ha...@cloudera.com> wrote:
>> 
>>> Ranjith,
>>> 
>>> Are you speaking of DistCp?
>>> http://hadoop.apache.org/common/docs/current/distcp.html
>>> 
>>> An 'fs -copyFromLocal' otherwise just runs as a single program that
>>> connects to your DFS nodes and writes data from a single client
>>> thread, and is not distributed on its own.
>>> 
>>> On Tue, May 22, 2012 at 6:48 AM, Ranjith <ranjith.raghuna...@gmail.com> 
>>> wrote:
>>>> 
>>>> I have always wondered about this and am not sure about the phenomenon. When
>>>> I fire a MapReduce job to copy data over in a distributed fashion, I would
>>>> expect to see mappers executing the copy. What happens with a copy command
>>>> from hadoop fs?
>>>> 
>>>> Thanks,
>>>> Ranjith
>>> 
>>> 
>>> 
>>> --
>>> Harsh J
> 
> 
> 
> -- 
> Harsh J
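
For illustration, the single-client, block-by-block write that Harsh describes in the thread can be sketched as a toy model. This is only a rough sketch under stated assumptions: the names DataNode, copy_from_local, read_back and BLOCK_SIZE are hypothetical and not part of any Hadoop API, the round-robin placement is an assumption made for the example, and the sketch leaves out the NameNode and replication entirely. The only point it makes is that one client loop moves every block, with no mappers involved.

```python
# Toy model only: DataNode, copy_from_local, read_back and BLOCK_SIZE are
# hypothetical names for illustration, not part of any Hadoop API.
import io
from itertools import cycle

BLOCK_SIZE = 4  # bytes; tiny on purpose, real HDFS blocks are far larger


class DataNode:
    """Stand-in for a DataNode: stores blocks keyed by block index."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write_block(self, index, data):
        self.blocks[index] = data


def copy_from_local(source, datanodes, block_size=BLOCK_SIZE):
    """Copy a byte stream block by block from a single client loop.

    Returns a list of (block index, DataNode name) placements. No MapReduce
    job is started; one loop reads and writes every block in order.
    """
    placements = []
    targets = cycle(datanodes)  # round-robin placement is an assumption
    index = 0
    while True:
        block = source.read(block_size)
        if not block:
            break
        node = next(targets)
        node.write_block(index, block)
        placements.append((index, node.name))
        index += 1
    return placements


def read_back(placements, datanodes):
    """Reassemble the file by fetching each block from the node holding it."""
    by_name = {node.name: node for node in datanodes}
    return b"".join(by_name[name].blocks[i] for i, name in placements)


if __name__ == "__main__":
    nodes = [DataNode("dn1"), DataNode("dn2")]
    placed = copy_from_local(io.BytesIO(b"hello hdfs!"), nodes)
    print(placed)
    print(read_back(placed, nodes))
```

A distributed copy such as DistCp differs in that the work is split across map tasks, each of which performs this kind of client-side copy for its share of the files.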
