It is the TaskTracker, not the DataNode, that spawns and runs the compute tasks.
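
Since the TaskTracker is what runs the tasks, per-machine parallelism comes down to its map and reduce slot counts. A minimal sketch of the relevant mapred-site.xml entries follows. The values 6 and 2 are illustrative assumptions for an 8-core machine, not tuned recommendations; pick numbers that fit your cores, memory and job mix.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml on the TaskTracker host (MRv1 / Hadoop 1.x). -->
<configuration>
  <!-- Maximum number of map tasks this TaskTracker runs at once.
       6 is an assumed example value. -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>6</value>
  </property>
  <!-- Maximum number of reduce tasks this TaskTracker runs at once.
       2 is an assumed example value. -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
```

The TaskTracker reads these values at startup, so it needs a restart before new slot counts take effect.
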
If you want to increase MR parallelism on a single machine, you do not need
two DNs or two TTs, just higher slot capacities in your TT's mapred-site.xml,
set via the properties mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum.

On Mon, Jul 28, 2014 at 4:30 PM, sindhu hosamane <[email protected]> wrote:
> Hello ,
>
> i set up 2 datanodes on a single machine(ubuntu machine) accordingly
> mentioned in the thread
> http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201009.mbox/%3ca3ef3f6af24e204b812d1d24ccc8d71a03688...@mse16be2.mse16.exchange.ms%3E
>
> Ubuntu machine has 2 processors and 8 cores. Assuming that machine is
> powerful , i Setup 2 datanodes on that same machine.
>
> Now when i run jps on that multinode hadoop , i get
> Namenode
> Datanode
> Datanode
> Jobtracker
> Tasktracker
> Secondary Namenode
>
> The above result Shows 2 datanodes are up and running
>
> Also i have a single node on that ubuntu machine as well.
> Now when i check Performance on singlenode and multinode , both are almost
> same.So now ,
> How do i make sure load is being distributed on both datanodes or each
> datanode uses different cores of the ubuntu machine.
>
> (Note: i know multiple datanodes on same machine is not that advantageous ,
> but assuming my machine is powerful ..i set it up..)
>
> would appreciate any advices on this.
>
> Regards,
> Sindhu

--
Harsh J
