@Chris thanks a lot, that really helped.
On Mon, Apr 15, 2013 at 11:02 PM, Chris Nauroth <cnaur...@hortonworks.com> wrote:

> Hello Thoihen,
>
> I'm moving this discussion from common-dev (questions about developing
> Hadoop) to user (questions about using Hadoop).
>
> If you haven't already seen it, I recommend reading the cluster setup
> documentation. It differs somewhat depending on the version of the Hadoop
> code that you're deploying and running. You mentioned the JobTracker, so I
> expect you're using something from the 1.x line, but here are links to
> both the 1.x and 2.x docs just in case:
>
> 1.x: http://hadoop.apache.org/docs/r1.1.2/cluster_setup.html
> 2.x/trunk:
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html
>
> To address your specific questions:
>
> 1. You can run the hadoop jar command and submit MapReduce jobs from any
> machine that has the Hadoop software and configuration deployed and has
> network connectivity to the machines that make up the Hadoop cluster.
>
> 2. Yes, you can use a separate machine that is not a member of the cluster
> (meaning it does not run Hadoop daemons like the DataNode, TaskTracker, or
> NodeManager). This is your choice. I've found it valuable to isolate
> nodes like this to prevent MR job tasks from taking processing resources
> away from interactive user commands, but it does mean that the resources
> on that node can't be utilized by MR jobs during user idle times, so it
> causes a small hit to overall utilization.
>
> Hope this helps,
> --Chris
>
>
> On Mon, Apr 15, 2013 at 9:36 AM, Thoihen Maibam <thoihen...@gmail.com> wrote:
>
> > Hi All,
> >
> > I am really new to Hadoop and installed it on my local Ubuntu machine.
> > I also created a wordcount.jar and started Hadoop with start-all.sh,
> > which started all the Hadoop daemons (confirmed with jps). I then cd'd
> > to hadoop/bin, ran hadoop jar x.jar, and successfully ran the MapReduce
> > program.
> >
> > Now, can someone please help me with how I should run the hadoop jar
> > command in a clustered environment, say a cluster with 50 nodes? I know
> > one dedicated machine would be the NameNode, another the JobTracker,
> > and the others DataNodes and TaskTrackers.
> >
> > 1. From which machine should I run the hadoop jar command, considering
> > I have a MapReduce jar in hand? Is it the JobTracker machine, or can I
> > run the hadoop jar command from any machine in the cluster?
> >
> > 2. Can I run the MapReduce job from another machine that is not part of
> > the cluster? If yes, how should I do it?
> >
> > Please help me.
> >
> > Regards
> > thoihen
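For anyone landing on this thread later, Chris's two answers can be sketched concretely. This is a minimal, hypothetical sketch for the Hadoop 1.x line: the hostnames (namenode-host, jobtracker-host), ports, install path, and HDFS paths are placeholders you would replace with your own cluster's values, and it assumes the jar's main class is set in its manifest. The idea is that a client (or "edge") machine needs only the Hadoop software plus configuration pointing at the cluster; it does not need to run any daemons.

```shell
# Hypothetical sketch: submitting a MapReduce job from a client machine
# that is NOT part of the cluster (no DataNode/TaskTracker running here).
# All hostnames, ports, and paths below are placeholders.

# 1. Install the same Hadoop distribution the cluster runs.
export HADOOP_HOME=/opt/hadoop-1.1.2
export PATH="$PATH:$HADOOP_HOME/bin"

# 2. Point the client configuration at the cluster.
#    core-site.xml tells the client where HDFS (the NameNode) lives:
cat > "$HADOOP_HOME/conf/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode-host:9000</value>
  </property>
</configuration>
EOF

#    mapred-site.xml tells the client where the JobTracker lives:
cat > "$HADOOP_HOME/conf/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker-host:9001</value>
  </property>
</configuration>
EOF

# 3. Stage input in HDFS and submit the job exactly as on a single node;
#    the JobTracker schedules the tasks across the cluster's 50 nodes.
hadoop fs -mkdir /user/thoihen/input
hadoop fs -put local-input.txt /user/thoihen/input
hadoop jar wordcount.jar /user/thoihen/input /user/thoihen/output

# 4. Inspect the result written by the reducers.
hadoop fs -cat '/user/thoihen/output/part-*'
```

The same `hadoop jar` invocation also works from any machine inside the cluster (NameNode, JobTracker, or a slave), since those machines already have the software and configuration deployed; the dedicated edge node is simply the isolation choice Chris describes in his point 2.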