Hello Ashish,

> 2) Run the example again using the command
>    ./hadoop dfs wordCount.jar /opt/ApacheHadoop/temp/worker.log /opt/ApacheHadoop/out/

Unless that is a typo, the command should be

    ./hadoop jar wordCount.jar WordCount /opt/ApacheHadoop/temp/worker.log /opt/ApacheHadoop/out/

One more thing to try: stop the datanode process on 10.12.11.210 and run the job again.

On Wed, Jan 15, 2014 at 2:07 PM, Ashish Jain <[email protected]> wrote:

> Hello Sudhakara,
>
> Thanks for your suggestion. However, once I change the MapReduce framework to yarn, my MapReduce jobs do not get executed at all. They seem to be waiting on some thread indefinitely. Here is what I have done:
>
> 1) Set the MapReduce framework to yarn in mapred-site.xml:
>
>    <property>
>      <name>mapreduce.framework.name</name>
>      <value>yarn</value>
>    </property>
>
> 2) Run the example again using the command
>    ./hadoop dfs wordCount.jar /opt/ApacheHadoop/temp/worker.log /opt/ApacheHadoop/out/
>
> The jobs are just stuck and do not move further.
>
> I also tried the following, and it complains of a FileNotFoundException and some security exception:
>
>    ./hadoop dfs wordCount.jar file:///opt/ApacheHadoop/temp/worker.log file:///opt/ApacheHadoop/out/
>
> Below is the status of the job from the Hadoop application console. The progress bar does not move at all.
>
>    ID:               application_1389771586883_0002
>                      <http://10.12.11.210:8088/cluster/app/application_1389771586883_0002>
>    User:             root
>    Name:             wordcount
>    Application Type: MAPREDUCE
>    Queue:            default
>    StartTime:        Wed, 15 Jan 2014 07:52:04 GMT
>    FinishTime:       N/A
>    State:            ACCEPTED
>    FinalStatus:      UNDEFINED
>    Tracking UI:      UNASSIGNED <http://10.12.11.210:8088/cluster/apps#>
>
> Please advise what I should do.
>
> --Ashish
>
> On Tue, Jan 14, 2014 at 5:48 PM, sudhakara st <[email protected]> wrote:
>
>> Hello Ashish,
>>
>> It seems the job is running in the local job runner (LocalJobRunner) and reading the local file system. Can you try giving the full URI paths for the input and output, like:
>>
>>    $ hadoop jar program.jar ProgramName -Dmapreduce.framework.name=yarn file:///home/input/ file:///home/output/
>>
>> On Mon, Jan 13, 2014 at 3:02 PM, Ashish Jain <[email protected]> wrote:
>>
>>> German,
>>>
>>> This does not seem to be helping. I tried using the FairScheduler, but the behavior remains the same. I can see the FairScheduler log getting continuous heartbeats from both of the other nodes, but the work is still not being distributed to them. What I did next was to start 3 jobs simultaneously, hoping that at least part of one of the jobs would be distributed to the other nodes. However, still only one node is being used :(((. What is going wrong? Can someone help?
>>>
>>> Sample of the FairScheduler log:
>>> 2014-01-13 15:13:54,293 HEARTBEAT l1dev-211
>>> 2014-01-13 15:13:54,953 HEARTBEAT l1-dev06
>>> 2014-01-13 15:13:54,988 HEARTBEAT l1-DEV05
>>> 2014-01-13 15:13:55,295 HEARTBEAT l1dev-211
>>> 2014-01-13 15:13:55,956 HEARTBEAT l1-dev06
>>> 2014-01-13 15:13:55,993 HEARTBEAT l1-DEV05
>>> 2014-01-13 15:13:56,297 HEARTBEAT l1dev-211
>>> 2014-01-13 15:13:56,960 HEARTBEAT l1-dev06
>>> 2014-01-13 15:13:56,997 HEARTBEAT l1-DEV05
>>> 2014-01-13 15:13:57,299 HEARTBEAT l1dev-211
>>> 2014-01-13 15:13:57,964 HEARTBEAT l1-dev06
>>> 2014-01-13 15:13:58,001 HEARTBEAT l1-DEV05
>>>
>>> My data is distributed as blocks to the other nodes, but the host with IP 10.12.11.210 has all the data and is the one serving all the requests.
>>>
>>> Total number of blocks: 8
>>> 1073741866: 10.12.11.211:50010  10.12.11.210:50010
>>> 1073741867: 10.12.11.211:50010  10.12.11.210:50010
>>> 1073741868: 10.12.11.210:50010  10.12.11.209:50010
>>> 1073741869: 10.12.11.210:50010  10.12.11.209:50010
>>> 1073741870: 10.12.11.211:50010  10.12.11.210:50010
>>> 1073741871: 10.12.11.210:50010  10.12.11.209:50010
>>> 1073741872: 10.12.11.211:50010  10.12.11.210:50010
>>> 1073741873: 10.12.11.210:50010  10.12.11.209:50010
>>>
>>> Someone please advise on how to go about this.
>>>
>>> --Ashish
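Putting Sudhakara's corrected syntax from the top of this thread together with the yarn framework setting, a minimal sketch of the submission would look like the following (assuming WordCount really is the driver class packaged in wordCount.jar and that it parses generic options via ToolRunner; the paths are the ones used above):

    # Sketch only: driver class name and paths as used earlier in this thread.
    # Generic -D options must come before the input/output arguments.
    ./hadoop jar wordCount.jar WordCount \
        -Dmapreduce.framework.name=yarn \
        /opt/ApacheHadoop/temp/worker.log \
        /opt/ApacheHadoop/out/

Two side notes: if the driver does not go through ToolRunner/GenericOptionsParser, the -D option is silently ignored, which is one way a job can end up in the LocalJobRunner; and the output directory must not already exist in HDFS, otherwise the job fails up front complaining that the output path exists.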
>>> On Fri, Jan 10, 2014 at 12:58 PM, Ashish Jain <[email protected]> wrote:
>>>
>>>> Thanks for all these suggestions. Somehow I do not have access to the servers today; I will try the suggestions on Monday and will let you know how it goes.
>>>>
>>>> --Ashish
>>>>
>>>> On Thu, Jan 9, 2014 at 7:53 PM, German Florez-Larrahondo <[email protected]> wrote:
>>>>
>>>>> Ashish,
>>>>>
>>>>> Could this be related to the scheduler you are using and its settings?
>>>>>
>>>>> In lab environments, when running a single type of job, I often use the FairScheduler (the YARN default in 2.2.0 is the CapacityScheduler) and it does a good job of distributing the load.
>>>>>
>>>>> You could give that a try (https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html).
>>>>>
>>>>> I think just changing yarn-site.xml as follows could demonstrate this theory (note that how the jobs are scheduled depends on resources such as memory on the nodes, and you would need to set up yarn-site.xml accordingly):
>>>>>
>>>>>    <property>
>>>>>      <name>yarn.resourcemanager.scheduler.class</name>
>>>>>      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
>>>>>    </property>
>>>>>
>>>>> Regards
>>>>> ./g
>>>>>
>>>>> From: Ashish Jain [mailto:[email protected]]
>>>>> Sent: Thursday, January 09, 2014 6:46 AM
>>>>> To: [email protected]
>>>>> Subject: Re: Distributing the code to multiple nodes
>>>>>
>>>>> Another point to add here: 10.12.11.210 is the host which has everything running, including a slave datanode. Data was also distributed to this host, as was the jar file. The following are running on 10.12.11.210:
>>>>>
>>>>> 7966 DataNode
>>>>> 8480 NodeManager
>>>>> 8353 ResourceManager
>>>>> 8141 SecondaryNameNode
>>>>> 7834 NameNode
>>>>>
>>>>> On Thu, Jan 9, 2014 at 6:12 PM, Ashish Jain <[email protected]> wrote:
>>>>>
>>>>> The logs were updated only when I copied the data. Since copying the data there have been no updates to the log files.
>>>>>
>>>>> On Thu, Jan 9, 2014 at 5:08 PM, Chris Mawata <[email protected]> wrote:
>>>>>
>>>>> Do the logs on the three nodes contain anything interesting?
>>>>> Chris
>>>>>
>>>>> On Jan 9, 2014 3:47 AM, "Ashish Jain" <[email protected]> wrote:
>>>>>
>>>>> Here is the block info for the record I distributed. As can be seen, only 10.12.11.210 has all the data, and this is the node which is serving all the requests. Replicas are available on 209 as well as 210.
>>>>>
>>>>> 1073741857: 10.12.11.210:50010  10.12.11.209:50010
>>>>> 1073741858: 10.12.11.210:50010  10.12.11.211:50010
>>>>> 1073741859: 10.12.11.210:50010  10.12.11.209:50010
>>>>> 1073741860: 10.12.11.210:50010  10.12.11.211:50010
>>>>> 1073741861: 10.12.11.210:50010  10.12.11.209:50010
>>>>> 1073741862: 10.12.11.210:50010  10.12.11.209:50010
>>>>> 1073741863: 10.12.11.210:50010  10.12.11.209:50010
>>>>> 1073741864: 10.12.11.210:50010  10.12.11.209:50010
>>>>>
>>>>> --Ashish
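A minimal sketch of the yarn-site.xml change German suggests above, extended with the per-node resource settings he alludes to (the scheduler class is taken from his mail; the memory and vcore figures are purely illustrative assumptions and would need to match the actual hardware):

    <!-- Sketch only: FairScheduler as suggested above -->
    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>

    <!-- Memory each NodeManager offers to containers (example value) -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>4096</value>
    </property>

    <!-- Virtual cores each NodeManager offers to containers (example value) -->
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>4</value>
    </property>

If the 2.2.0 defaults are still in effect (roughly 8 GB offered per NodeManager and 1 GB requested per map task), all eight map tasks of this input can fit on a single NodeManager, which is consistent with one data-local node doing all the work, as Chris explains further down in the quoted thread.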
>>>>> On Thu, Jan 9, 2014 at 2:11 PM, Ashish Jain <[email protected]> wrote:
>>>>>
>>>>> Hello Chris,
>>>>>
>>>>> I now have a cluster with 3 nodes and a replication factor of 2. When I distribute a file I can see that there are replicas of the data available on the other nodes. However, when I run a MapReduce job, again only one node is serving all the requests :(. Can you or anyone please provide some more input?
>>>>>
>>>>> Thanks
>>>>> Ashish
>>>>>
>>>>> On Wed, Jan 8, 2014 at 7:16 PM, Chris Mawata <[email protected]> wrote:
>>>>>
>>>>> 2 nodes and a replication factor of 2 result in a replica of each block being present on each node. This allows the possibility that a single node does all the work and is still data-local. It will probably happen if that single node has the needed capacity. More nodes than the replication factor are needed to force distribution of the processing.
>>>>> Chris
>>>>>
>>>>> On Jan 8, 2014 7:35 AM, "Ashish Jain" <[email protected]> wrote:
>>>>>
>>>>> Guys,
>>>>>
>>>>> I am sure that only one node is being used. I just now ran the job again and could see CPU usage going high on only one server while the other server's CPU usage remains constant, which means the other node is not being used. Can someone help me debug this issue?
>>>>>
>>>>> ++Ashish
>>>>>
>>>>> On Wed, Jan 8, 2014 at 5:04 PM, Ashish Jain <[email protected]> wrote:
>>>>>
>>>>> Hello All,
>>>>>
>>>>> I have a 2-node Hadoop cluster running with a replication factor of 2. I have a file of around 1 GB which, when copied to HDFS, is replicated to both nodes. Looking at the block info, I can see the file has been subdivided into 8 blocks, each of size 128 MB. I use this file as input to run the word count program. Somehow I feel only one node is doing all the work and the code is not distributed to the other node. How can I make sure the code is distributed to both nodes? Also, is there a log or GUI which can be used to check this?
>>>>>
>>>>> Please note I am using the latest stable release, that is 2.2.0.
>>>>>
>>>>> ++Ashish
>>
>> --
>> Regards,
>> ...Sudhakara.st
>>

--
Regards,
...Sudhakara.st
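On the question of a log or GUI to check the distribution with: the ResourceManager web UI already referenced in this thread (http://10.12.11.210:8088/) lists the registered NodeManagers and the containers allocated on each of them, and roughly equivalent checks from the command line might look like the sketch below (run from the Hadoop bin directory; the HDFS path is the one used earlier in the thread, so treat the exact commands as illustrative):

    # List the NodeManagers registered with the ResourceManager;
    # every node that runs a NodeManager should show up as RUNNING.
    ./yarn node -list

    # Show which datanodes hold the blocks of the input file.
    ./hdfs fsck /opt/ApacheHadoop/temp/worker.log -files -blocks -locations

    # List submitted applications with their state and progress.
    ./yarn application -list

Even with all nodes registered, whether more than one NodeManager actually receives containers still depends on the points raised above: the scheduler in use, the per-node memory and vcore capacity, and having more nodes than the replication factor.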
