That was just specifying the command line wrong. It starts the crawl now but just stalls:
060317 101821 parsing file:/nutch/search/conf/hadoop-site.xml
060317 101829 Running job: job_1ko8i3
060317 101830  map 0%  reduce 0%

I am seeing this a lot in the namenode log:

060317 102009 Zero targets found, forbidden1.size=1 forbidden2.size()=0
060317 102009 Zero targets found, forbidden1.size=1 forbidden2.size()=0

-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]
Sent: Friday, March 17, 2006 9:55 AM
To: [email protected]
Subject: RE: Help Setting Up Nutch 0.8 Distributed

Ok, the servers are starting now but when I try to do a crawl I am getting an error like below. I think that I am missing a configuration option, but I don't know which one. I have included my hadoop-site.xml as well.

error upon crawl:

060317 093312 Client connection to 127.0.0.1:9000: starting
060317 093312 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/hadoop-default.xml
060317 093312 parsing file:/nutch/search/conf/hadoop-site.xml
060317 093322 Running job: job_c78m3c
060317 093323  map 100%  reduce 100%
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:310)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:114)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:104)

job tracker log file:

060317 093322 parsing jar:file:/nutch/search/lib/hadoop-0.1-dev.jar!/mapred-default.xml
060317 093322 parsing /nutch/filesystem/mapreduce/local/job_c78m3c.xml/jobTracker
060317 093322 parsing file:/nutch/search/conf/hadoop-site.xml
060317 093322 job init failed
java.io.IOException: No input directories specified in: Configuration: defaults: hadoop-default.xml, mapred-default.xml, /nutch/filesystem/mapreduce/local/job_c78m3c.xml/jobTracker final: hadoop-site.xml
        at org.apache.hadoop.mapred.InputFormatBase.listFiles(InputFormatBase.java:84)
        at org.apache.hadoop.mapred.InputFormatBase.getSplits(InputFormatBase.java:94)
        at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:127)
        at org.apache.hadoop.mapred.JobTracker$JobInitThread.run(JobTracker.java:208)
        at java.lang.Thread.run(Thread.java:595)
Exception in thread "Thread-21" java.lang.NullPointerException
        at org.apache.hadoop.mapred.JobInProgress.kill(JobInProgress.java:437)
        at org.apache.hadoop.mapred.JobTracker$JobInitThread.run(JobTracker.java:212)
        at java.lang.Thread.run(Thread.java:595)
060317 093325 Server connection on port 9001 from 127.0.0.1: exiting

hadoop-site.xml:

<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>
    The host and port that the MapReduce job tracker runs at. If "local",
    then jobs are run in-process as a single map and reduce task.
  </description>
</property>

<property>
  <name>mapred.map.tasks</name>
  <value>2</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>2</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>

<property>
  <name>dfs.name.dir</name>
  <value>/nutch/filesystem/name</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/nutch/filesystem/data</value>
</property>

<property>
  <name>mapred.system.dir</name>
  <value>/nutch/filesystem/mapreduce/system</value>
</property>

<property>
  <name>mapred.local.dir</name>
  <value>/nutch/filesystem/mapreduce/local</value>
</property>

-----Original Message-----
From: Dennis Kubes [mailto:[EMAIL PROTECTED]
Sent: Friday, March 17, 2006 9:05 AM
To: [email protected]
Subject: RE: Help Setting Up Nutch 0.8 Distributed

I got one of the issues fixed. The output below is caused by the hadoop-env.sh file being in DOS format and not being executable. A dos2unix and chmod 700 fixed the "command not found" output. Still working on why the server won't start.

caused by hadoop-env.sh in dos format and not being executable:

: command not found line 2:
: command not found line 7:
: command not found line 10:
: command not found line 13:
: command not found line 16:
: command not found line 20:
: command not found line 23:
: command not found line 26:
: command not found line 29:
: command not found line 32:

Dennis

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 16, 2006 6:50 PM
To: [email protected]
Subject: Re: Help Setting Up Nutch 0.8 Distributed

Dennis Kubes wrote:
> : command not foundlaves.sh: line 29:
> : command not foundlaves.sh: line 32:
> localhost: ssh: \015: Name or service not known
> devcluster02: ssh: \015: Name or service not known
>
> And still getting this error:
>
> 060316 175355 parsing file:/nutch/search/conf/hadoop-site.xml
> Exception in thread "main" java.io.IOException: Cannot create file
> /tmp/hadoop/mapred/system/submit_mmuodk/job.jar on client
> DFSClient_-913777457
>         at org.apache.hadoop.ipc.Client.call(Client.java:301)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:141)
>         at org.apache.hadoop.dfs.$Proxy0.create(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:587)
>         at org
>
> My ssh version is:
>
> openssh-clients-3.6.1p2-33.30.3
> openssh-server-3.6.1p2-33.30.3
> openssh-askpass-gnome-3.6.1p2-33.30.3
> openssh-3.6.1p2-33.30.3
> openssh-askpass-3.6.1p2-33.30.3
>
> Is it something to do with my slaves file?

The \015 looks like the file has a CR where perhaps an LF is expected. What does 'od -c conf/slaves' print? What happens when you try something like 'bin/slaves uptime'?

Doug
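
Pulling the fixes discussed in this thread together as commands (a sketch only: the 'urls' seed-directory name and the crawl options come from the standard Nutch 0.8 tutorial, not from this thread, so adjust them to your own setup):

# hadoop-env.sh was in DOS format and not executable; the \015 in the ssh
# errors suggests conf/slaves has the same carriage-return problem.
dos2unix conf/hadoop-env.sh conf/slaves
chmod 700 conf/hadoop-env.sh

# Check that no carriage returns remain (od -c would print them as \r).
od -c conf/slaves

# The "No input directories specified" failure came from starting the crawl
# without pointing it at a seed-URL directory; a typical Nutch 0.8 call is:
bin/nutch crawl urls -dir crawled -depth 3

In a fully distributed setup the urls seed directory has to exist in the DFS that the jobs read from, not just on the local disk of the machine running the command.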
