Dear all,

This is a copy of a conversation from the hadoop mailing list. I am forwarding it here because there may be interested parties on this list who are not on the hadoop list. Apologies for cross-posting. It describes a problem I had with jobs hanging in Hadoop and how I fixed it. (Essentially, the problem was caused by the Jetty web server not running properly.)

Regards,
Jon

-----Original Message-----
From: Jon Blower [mailto:[EMAIL PROTECTED]
Sent: 27 February 2006 21:20
To: [email protected]
Subject: RE: Running WordCount in pseudo-distributed configuration

Hi everyone,

After some investigation I've managed to fix my problem. In short, the problem is caused by the Jetty web server failing to run. This is part of the jobtracker component: if this isn't running then jobs will simply hang.

Jetty was not starting up properly because of two problems with the web.xml file that is generated when building hadoop:

1) web.xml is validated with a DTD that is supposed to be downloaded from the web. However, my server does not have a connection to the Internet.
2) The contents of web.xml do not seem to be valid anyway (they cause ClassNotFoundExceptions).

You need to make a few changes to ensure that the system sees a valid web.xml and doesn't try to validate it with a DTD. Here's how I did it. Basically, I created the webapps directory manually instead of allowing it to be built by the build script:

1) Edited build.xml:
   i) In the "init" target, removed the "<mkdir dir="${build.webapps}/WEB-INF"/>" and the "<copy todir="${build.webapps}" ..." portions. This stops the build file from automatically building the build/webapps directory.
   ii) In the "compile" target, removed the "jsp-compile" portion. This stops the system from building a new web.xml.
   iii) In the "jar" target, removed the "<zipfileset dir="webapps"..." portion. This stops the webapps directory from appearing in the hadoop jar file.
2) Created build/webapps and populated it:
   i) Moved src/webapps/index.html and src/webapps/mapred/*.jsp into build/webapps.
   ii) Created a directory called WEB-INF in build/webapps.
   iii) Inside WEB-INF, created a file called web.xml with the following contents:

       <?xml version="1.0" encoding="ISO-8859-1"?>
       <web-app>
       </web-app>

       The file doesn't need to contain any more information: the web application is made up of JSP files that do not need to be deployed.
3) Edited bin/hadoop: added the argument "-Dorg.mortbay.xml.XmlParser.NotValidating=true" to the final line in the file (the one that starts 'exec "$JAVA" ...'). This stops Jetty from trying to validate web.xml with a DTD.
4) In src/java/org/apache/hadoop/mapred/, changed the source files so that the TaskTrackerStatus, JobInProgress, JobProfile and JobStatus classes are public, not package-private. This is required for the JSP files to be able to use these classes.
5) Ran "ant jar" to build the hadoop library and ran "cp build/hadoop-0.1-dev.jar ." to copy the JAR into the hadoop home directory. Deleted the existing hadoop-nightly.jar file. This makes sure that the new JAR file is picked up.
6) Edited conf/hadoop-site.xml to have the following contents:

       <property>
         <name>fs.default.name</name>
         <value>localhost:9000</value>
       </property>
       <property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
       </property>
       <property>
         <name>dfs.replication</name>
         <value>1</value>
       </property>

7) Ran bin/start-all.sh. Checked the log files to make sure no exceptions were being thrown. Note that the build/webapps directory that I created in step (2) is automatically found and used as the base for the web application. It's important to follow step (1) above to prevent a conflicting webapps directory from appearing in the JAR file.
8) In a web browser, opened http://localhost:50030.
   This gave me a link to the Job Tracker, which I could follow to monitor the jobs as I submitted them.
9) Uploaded some input files (in a directory called "in") to the distributed file system: "bin/hadoop dfs -put in in".
10) Ran a test job: "bin/hadoop org.apache.hadoop.examples.WordCount in out". Monitored the job's progress using the web interface.
11) Downloaded the output files: "bin/hadoop dfs -get out out".

Success! I hope this is useful to others who have been struggling to get started, and for the developers.

There is one more thing that I needed to do on my system (Red Hat 9), because the version of ssh is older than is assumed by hadoop: in bin/slaves.sh, near the bottom, I removed the option "-o ConnectTimeout=1" from the call to ssh. The ConnectTimeout option is not understood by my version of ssh (OpenSSH 3.5p1).

Jon

> -----Original Message-----
> From: Jon Blower [mailto:[EMAIL PROTECTED]
> Sent: 27 February 2006 11:55
> To: Hadoop mailing list
> Subject: Running WordCount in pseudo-distributed configuration
>
> Hi all,
>
> I am having the same problem that Ramanan reported to this list (I
> haven't seen a reply to Ramanan's question). I am trying to run the
> WordCount example in pseudo-distributed mode, i.e. everything running
> on localhost with the following contents in hadoop-site.xml:
>
> <property>
>   <name>fs.default.name</name>
>   <value>localhost:9000</value>
> </property>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>localhost:9001</value>
> </property>
> <property>
>   <name>dfs.replication</name>
>   <value>1</value>
> </property>
>
> I can run bin/start-all.sh without problems and I can upload and
> download material to and from the DFS.
> However, when I run
>
> bin/hadoop org.apache.hadoop.examples.WordCount in out
>
> I get the following output:
>
> 060227 114628 parsing file:/users/resc/programs/hadoop-nightly/conf/hadoop-default.xml
> 060227 114628 parsing file:/users/resc/programs/hadoop-nightly/conf/mapred-default.xml
> 060227 114628 parsing file:/users/resc/programs/hadoop-nightly/conf/hadoop-site.xml
> 060227 114628 Client connection to 127.0.0.1:9001: starting
> 060227 114628 Client connection to 127.0.0.1:9000: starting
> 060227 114628 parsing file:/users/resc/programs/hadoop-nightly/conf/hadoop-default.xml
> 060227 114628 parsing file:/users/resc/programs/hadoop-nightly/conf/hadoop-site.xml
> 060227 114629 Running job: job_17c13e
> 060227 114630 map 0% reduce 0%
>
> ... and the program just hangs. The input directory "in" has been
> uploaded to the DFS. The WordCount program works fine in standalone
> mode.
>
> Does anyone know what's going wrong? I am using the nightly build
> from 27-02-2006 on Red Hat Linux 9.
>
> There is suspicious activity in the log files.
> The jobtracker log contains a lot of exceptions, including:
>
> 060227 114952 Web application not found /users/resc/programs/hadoop-nightly/file:/users/resc/programs/hadoop-nightly/hadoop-nightly.jar!/webapps
> 060227 114952 Configuration error on /users/resc/programs/hadoop-nightly/file:/users/resc/programs/hadoop-nightly/hadoop-nightly.jar!/webapps
> java.io.FileNotFoundException: /users/resc/programs/hadoop-nightly/file:/users/resc/programs/hadoop-nightly/hadoop-nightly.jar!/webapps
>
> Also:
>
> 060227 114953 Starting tracker
> java.io.IOException: Could not start HTTP server
>     at org.apache.hadoop.mapred.JobTrackerInfoServer.start(JobTrackerInfoServer.java:104)
>     at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:304)
>     at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:50)
>     at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:820)
> 060227 114954 parsing file:/users/resc/programs/hadoop-nightly/conf/hadoop-default.xml
> 060227 114954 parsing file:/users/resc/programs/hadoop-nightly/conf/mapred-default.xml
> 060227 114954 parsing file:/users/resc/programs/hadoop-nightly/conf/hadoop-site.xml
> 060227 114954 Starting tracker
> java.net.BindException: Address already in use
>     at java.net.PlainSocketImpl.socketBind(Native Method)
>
> There are a large number of BindExceptions, all of which follow a
> "Starting tracker" message. This happens for only one invocation of
> start-all.sh. I don't understand why it's apparently trying to start
> the tracker so many times.
>
> Thanks in advance,
> Jon
>
> --------------------------------------------------------------
> Dr Jon Blower              Tel: +44 118 378 5213 (direct line)
> Technical Director         Tel: +44 118 378 8741 (ESSC)
> Reading e-Science Centre   Fax: +44 118 378 6413
> ESSC                       Email: [EMAIL PROTECTED]
> University of Reading
> 3 Earley Gate
> Reading RG6 6AL, UK
> --------------------------------------------------------------

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
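
For anyone who would rather script step (2) of the fix than do it by hand, here is a minimal Python sketch. It assumes the source layout described in the message (src/webapps/index.html and src/webapps/mapred/*.jsp) and copies the files rather than moving them, so the source tree stays intact. The helper name create_webapps and its parameters are made up for illustration; they are not part of Hadoop.

```python
import glob
import os
import shutil
import xml.etree.ElementTree as ET

# Minimal deployment descriptor from step 2(iii): an empty <web-app> element.
MINIMAL_WEB_XML = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    "<web-app>\n"
    "</web-app>\n"
)


def create_webapps(src_webapps, build_webapps):
    """Populate build/webapps by hand, as in step 2 (hypothetical helper).

    Copies index.html and mapred/*.jsp into build_webapps and writes a
    minimal WEB-INF/web.xml. Returns the path of the web.xml written.
    """
    web_inf = os.path.join(build_webapps, "WEB-INF")
    os.makedirs(web_inf, exist_ok=True)

    # The pages named in the message: index.html plus the mapred JSPs.
    pages = [os.path.join(src_webapps, "index.html")]
    pages += sorted(glob.glob(os.path.join(src_webapps, "mapred", "*.jsp")))
    for page in pages:
        shutil.copy(page, build_webapps)

    web_xml = os.path.join(web_inf, "web.xml")
    with open(web_xml, "w", encoding="iso-8859-1") as handle:
        handle.write(MINIMAL_WEB_XML)

    # Sanity check: the file must at least be well-formed XML.
    ET.parse(web_xml)
    return web_xml
```

Run from the hadoop home directory, a call such as create_webapps("src/webapps", "build/webapps") would produce the layout from step (2). The build.xml edits in step (1) are still needed so that the build does not regenerate the directory.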

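
Since the hang described in this thread came down to the job tracker's web server not starting, a quick way to spot the same condition is to check which of the ports mentioned above are accepting connections. The following Python sketch does only a plain TCP connect; the function names are illustrative, and the port numbers (9000, 9001, 50030) are simply the ones quoted in the messages, so adjust them to your own configuration.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports):
    """Map each port to whether something is accepting connections on it."""
    return {port: is_listening(host, port) for port in ports}
```

For example, check_ports("localhost", [9000, 9001, 50030]) returns a dictionary of port to True/False. A result where 9001 is open but 50030 is not would be consistent with the symptom reported here, although a successful TCP connect alone does not prove that Jetty is serving pages correctly; the jobtracker log remains the place to confirm.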