Harsh,

Thanks for the response, bud. Appreciate it!

Thanks,
Ranjith

On May 21, 2012, at 11:09 PM, Harsh J <ha...@cloudera.com> wrote:

> Ranjith,
>
> MapReduce and HDFS are two different things. MapReduce uses HDFS (and
> can use any other FS as well) to do some efficient work, but HDFS does
> not use MapReduce.
>
> A simple HDFS transfer is done directly over the network. Yes, it's just a
> block by block copy/write to/from the relevant DataNodes, done over
> network sockets at each end.
>
> On Tue, May 22, 2012 at 8:58 AM, Ranjith <ranjith.raghuna...@gmail.com> wrote:
>> Thanks Harsh. So when it connects directly to the data nodes it does not
>> fire off any mappers. So how does it get the data over? Is it just a block
>> by block copy?
>>
>> Thanks,
>> Ranjith
>>
>> On May 21, 2012, at 9:22 PM, Harsh J <ha...@cloudera.com> wrote:
>>
>>> Ranjith,
>>>
>>> Are you speaking of DistCp?
>>> http://hadoop.apache.org/common/docs/current/distcp.html
>>>
>>> An 'fs -copyFromLocal' otherwise just runs as a single program that
>>> connects to your DFS nodes and writes data from a single client
>>> thread, and is not distributed on its own.
>>>
>>> On Tue, May 22, 2012 at 6:48 AM, Ranjith <ranjith.raghuna...@gmail.com>
>>> wrote:
>>>>
>>>> I have always wondered about this and am not sure what actually happens.
>>>> When I fire a map reduce job to copy data over in a distributed fashion I
>>>> would expect to see mappers executing the copy. What happens with a copy
>>>> command from Hadoop fs?
>>>>
>>>> Thanks,
>>>> Ranjith
>>>
>>>
>>>
>>> --
>>> Harsh J
>
>
>
> --
> Harsh J
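
For anyone reading this thread later, the single-client, block-by-block copy described above can be pictured with a small Python sketch. This is only a local analogy, not Hadoop code: `copy_block_by_block` and `BLOCK_SIZE` are hypothetical names, in-memory streams stand in for the local file and the DataNode sockets, and the block size is tiny on purpose (real HDFS blocks are typically 64 MB or larger).

```python
import io

# Hypothetical, deliberately tiny block size for illustration only.
BLOCK_SIZE = 64


def copy_block_by_block(src, dst, block_size=BLOCK_SIZE):
    """Copy src to dst one block at a time from a single thread.

    Mirrors the idea of a plain 'fs' copy: one client program reads the
    data and writes it out sequentially, with no mappers involved.
    Returns the number of blocks written.
    """
    blocks = 0
    while True:
        block = src.read(block_size)
        if not block:  # end of input
            break
        dst.write(block)  # in HDFS this would go over a socket to a DataNode
        blocks += 1
    return blocks


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 200)  # stands in for a local file
    target = io.BytesIO()            # stands in for the remote side
    n = copy_block_by_block(source, target)
    print(n, len(target.getvalue()))
```

The point of the sketch is that the whole copy is one sequential loop in one process, which is why no mappers show up. DistCp, by contrast, submits a MapReduce job so that many such copies run in parallel.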