I tried again with absolute paths, but it made no difference. All the local directories accessed by the java process are within the user's home directory, so write access is not an issue. As a test, I also reverted my nutch-site.xml to contain only the following, so that it would use the default directory locations (/tmp/nutch/...):

<property>
  <name>fs.default.name</name>
  <value>mapred01:10000</value>
  <description>The name of the default file system. Either the literal string "local" or a host:port for NDFS.</description>
</property>

<property>
  <name>mapred.job.tracker</name>
  <value>mapred01:11000</value>
  <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

Unfortunately, that made no difference either; I still get exactly the same error.
What I'm doing is very simple. I'm following the steps explained here:
http://wiki.media-style.com/display/nutchDocu/setup+a+map+reduce+multi+box+system
The only difference is that I'm using a .slaves file and running start-all.sh, to avoid having to log on to the slave machine and start the daemons manually.

--Flo

Stefan Groschupf wrote:
> Sounds strange. I had a similar problem, but it was related to
> different user names on different boxes.
> Please try to use an absolute path, something like
> bin/nutch fetch /Users/yourUser/segments/30000004344
> Also check that the users that run the java processes have write
> access to the local folders.
> :-?
>
> Stefan
>
>
> On 02.12.2005 at 02:34, Florent Gluck wrote:
>
>> Hi all,
>>
>> I just started experimenting with the mapred branch, but unfortunately
>> I'm not even able to get an entire crawl cycle to complete properly.
>> I'm using 2 machines:
>> mapred01: master that acts as a JobTracker only (doesn't crawl)
>> mapred02: slave that executes the tasks
>>
>> Both machines have exactly the same install and config files.
>> Here is what I put in nutch-site.xml:
>>
>> <property>
>>   <name>fs.default.name</name>
>>   <value>mapred01:10000</value>
>> </property>
>>
>> <property>
>>   <name>mapred.job.tracker</name>
>>   <value>mapred01:11000</value>
>> </property>
>>
>> <property>
>>   <name>ndfs.name.dir</name>
>>   <value>/home/epile/ndfs/name</value>
>> </property>
>>
>> <property>
>>   <name>ndfs.data.dir</name>
>>   <value>/home/epile/ndfs/data</value>
>> </property>
>>
>> <property>
>>   <name>mapred.local.dir</name>
>>   <value>/home/epile/mapred/local</value>
>> </property>
>>
>> <property>
>>   <name>mapred.system.dir</name>
>>   <value>/home/epile/mapred/system</value>
>> </property>
>>
>> <property>
>>   <name>mapred.temp.dir</name>
>>   <value>/home/epile/mapred/temp</value>
>> </property>
>>
>> Then, I do the following steps on the master:
>>
>> 1. echo mapred02 > .slaves
>> 2. start-all.sh
>> 3. mkdir seeds
>> 4. echo http://www.cnn.com/ > seeds/urls.txt
>>    ndfs -put seeds seeds
>> 5. inject crawldb seeds
>> 6. generate crawldb segments
>> 7. fetch segments/SEG_NAME (looked up using nutch ndfs -ls segments)
>> 8. invertlinks linkdb segments/SEG_NAME
>>
>> Up to step 7, everything completes properly.
>> However, step 8 always fails and I get this java exception:
>>
>> Exception in thread "main" java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml , /home/epile/mapred/local/jobTracker/job_gjrlvu.xml , nutch-site.xml
>>     at org.apache.nutch.ipc.Client.call(Client.java:294)
>>     at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
>>     at $Proxy0.submitJob(Unknown Source)
>>     at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
>>     at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
>>     at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:131)
>>     at org.apache.nutch.crawl.LinkDb.main(LinkDb.java:192)
>>
>> I saw a few messages on the dev mailing list regarding a similar "no input
>> directories specified" error, but I'm not really clear on what causes it.
>> Looking at /home/epile/mapred/local/jobTracker/job_gjrlvu.xml, I
>> didn't see any missing input.dir properties.
>> My configuration is very simple: both machines use exactly the same paths
>> and the same users. There is no distinction between them besides their
>> hostnames and their respective tasks.
>> Am I missing something, or am I doing something wrong?
>>
>> I included the whole output, as well as the jobtracker and namenode
>> logs, as attachments.
>>
>> Any help would be greatly appreciated.
>> Thanks,
>> --Flo
>>
>> mapred02: starting datanode, logging to /home/epile/log/nutch-epile-datanode-mapred02.blah.com.log
>> mapred02: 051202 010100 10 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> rsync from mapred01:/home/epile/nutch-mapred
>> starting namenode, logging to /home/epile/log/nutch-epile-namenode-mapred01.blah.com.log
>> 051201 210249 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> 051201 210250 parsing file:/home/epile/nutch-mapred/conf/nutch-site.xml
>> rsync from mapred01:/home/epile/nutch-mapred
>> starting jobtracker, logging to /home/epile/log/nutch-epile-jobtracker-mapred01.blah.com.log
>> 051201 210251 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> mapred02: starting tasktracker, logging to /home/epile/log/nutch-epile-tasktracker-mapred02.blah.com.log
>> mapred02: 051202 010104 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> 051201 210254 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> 051201 210255 parsing file:/home/epile/nutch-mapred/conf/nutch-site.xml
>> 051201 210255 No FS indicated, using default:mapred01:10000
>> 051201 210255 Client connection to 192.168.15.50:10000: starting
>> 051201 210256 Injector: starting
>> 051201 210256 Injector: crawlDb: crawldb
>> 051201 210256 Injector: urlDir: seeds
>> 051201 210256 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> 051201 210257 parsing file:/home/epile/nutch-mapred/conf/nutch-site.xml
>> 051201 210257 Injector: Converting injected urls to crawl db entries.
>> 051201 210257 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> 051201 210257 parsing file:/home/epile/nutch-mapred/conf/mapred-default.xml
>> 051201 210257 parsing file:/home/epile/nutch-mapred/conf/nutch-site.xml
>> 051201 210257 Client connection to 192.168.15.50:11000: starting
>> 051201 210257 Client connection to 192.168.15.50:10000: starting
>> 051201 210258 Running job: job_bby846
>> 051201 210259 map 0%
>> 051201 210302 map 100%
>> 051201 210309 reduce 100%
>> 051201 210309 Job complete: job_bby846
>> 051201 210309 Injector: Merging injected urls into crawl db.
>> 051201 210309 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> 051201 210309 parsing file:/home/epile/nutch-mapred/conf/mapred-default.xml
>> 051201 210309 parsing file:/home/epile/nutch-mapred/conf/nutch-site.xml
>> 051201 210310 Running job: job_haomlk
>> 051201 210311 map 0%
>> 051201 210314 map 100%
>> 051201 210318 reduce 100%
>> 051201 210321 Job complete: job_haomlk
>> 051201 210322 Injector: done
>> 051201 210323 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> 051201 210323 parsing file:/home/epile/nutch-mapred/conf/nutch-site.xml
>> 051201 210323 Generator: starting
>> 051201 210323 Generator: segment: segments/20051201210323
>> 051201 210323 Generator: Selecting most-linked urls due for fetch.
>> 051201 210323 parsing file:/home/epile/nutch-mapred/conf/nutch-default.xml
>> 051201 210323 parsing file:/home/epile/nutch-mapred/conf/mapred-default.xml
>> 051201 210323 parsing file:/home/epile/nutch-mapred/conf/nutch-site.xml
>> 051201 210323 Client connection to 192.168.15.50:11000: starting
>> 051201 210323 Client connection to 192.168.15.50:10000: starting
>> 051201 210324 Running job: job_vcjx1z
>> 051201 210325 map 0%
>> 051201 210330 map 100%
>> 051201 210333 reduce 100%
>> 051201 210333 Job complete: job_vcjx1z
>> 051201 210333 Generator: Partitioning selected urls by host, for politeness.
>> 051201 210333 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210333 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210333 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210334 Running job: job_oyzp06 >> 051201 210335 map 0% >> 051201 210339 map 100% >> 051201 210342 reduce 100% >> 051201 210342 Job complete: job_oyzp06 >> 051201 210342 Generator: done. >> 051201 210343 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210343 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210343 No FS indicated, using default:mapred01:10000 >> 051201 210344 Client connection to 192.168.15.50:10000: starting >> 051201 210345 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210345 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210345 Fetcher: starting >> 051201 210345 Fetcher: segment: segments/20051201210323 >> 051201 210345 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210345 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210345 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210345 Client connection to 192.168.15.50:11000: starting >> 051201 210345 Client connection to 192.168.15.50:10000: starting >> 051201 210346 Running job: job_r878fx >> 051201 210347 map 0% >> 051201 210354 map 100% >> 051201 210359 reduce 100% >> 051201 210359 Job complete: job_r878fx >> 051201 210359 Fetcher: done >> 051201 210400 LinkDb: starting >> 051201 210400 LinkDb: linkdb: linkdb >> 051201 210400 LinkDb: segments: segments/20051201210323 >> 051201 210401 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210401 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210401 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210401 parsing file:/home/epile/nutch-mapred/conf/nutch- >> 
default.xml >> 051201 210401 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210401 Client connection to 192.168.15.50:11000: starting >> 051201 210401 Client connection to 192.168.15.50:10000: starting >> Exception in thread "main" java.io.IOException: No input directories >> specified in: NutchConf: nutch-default.xml , mapred- default.xml , >> /home/epile/mapred/local/jobTracker/job_e336wf.xml , nutch-site.xml >> at org.apache.nutch.ipc.Client.call(Client.java:294) >> at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127) >> at $Proxy0.submitJob(Unknown Source) >> at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259) >> at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288) >> at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:131) >> at org.apache.nutch.crawl.LinkDb.main(LinkDb.java:192) >> 051201 210251 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210251 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210251 Client connection to 192.168.15.50:10000: starting >> 051201 210252 Server listener on port 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Property 'java.runtime.name' is Java(TM) 2 Runtime >> Environment, Standard Edition >> 051201 210252 Property 'sun.boot.library.path' is /usr/lib/j2sdk1.5- >> sun/jre/lib/i386 >> 051201 210252 Property 'java.vm.version' is 1.5.0_03-b07 >> 051201 210252 Property 'java.vm.vendor' is Sun Microsystems Inc. 
>> 051201 210252 Property 'java.vendor.url' is http://java.sun.com/ >> 051201 210252 Property 'path.separator' is : >> 051201 210252 Property 'java.vm.name' is Java HotSpot(TM) Client VM >> 051201 210252 Property 'file.encoding.pkg' is sun.io >> 051201 210252 Property 'user.country' is US >> 051201 210252 Property 'sun.os.patch.level' is unknown >> 051201 210252 Property 'java.vm.specification.name' is Java Virtual >> Machine Specification >> 051201 210252 Property 'user.dir' is /home/epile/nutch-mapred >> 051201 210252 Property 'java.runtime.version' is 1.5.0_03-b07 >> 051201 210252 Property 'java.awt.graphicsenv' is >> sun.awt.X11GraphicsEnvironment >> 051201 210252 Property 'java.endorsed.dirs' is /usr/lib/j2sdk1.5- >> sun/jre/lib/endorsed >> 051201 210252 Property 'os.arch' is i386 >> 051201 210252 Property 'java.io.tmpdir' is /tmp >> 051201 210252 Property 'line.separator' is >> >> 051201 210252 Property 'java.vm.specification.vendor' is Sun >> Microsystems Inc. >> 051201 210252 Property 'os.name' is Linux >> 051201 210252 Property 'sun.jnu.encoding' is ANSI_X3.4-1968 >> 051201 210252 Property 'java.library.path' is /usr/lib/j2sdk1.5-sun/ >> jre/lib/i386/client:/usr/lib/j2sdk1.5-sun/jre/lib/i386:/usr/lib/ >> j2sdk1.5-sun/jre/../lib/i386 >> 051201 210252 Property 'java.specification.name' is Java Platform >> API Specification >> 051201 210252 Property 'java.class.version' is 49.0 >> 051201 210252 Property 'sun.management.compiler' is HotSpot Client >> Compiler >> 051201 210252 Property 'os.version' is 2.6.12-9-386 >> 051201 210252 Property 'user.home' is /home/epile >> 051201 210252 Property 'user.timezone' is GMT >> 051201 210252 Property 'java.awt.printerjob' is sun.print.PSPrinterJob >> 051201 210252 Property 'file.encoding' is ANSI_X3.4-1968 >> 051201 210252 Property 'java.specification.version' is 1.5 >> 051201 210252 Server handler on 11000: starting >> 051201 210252 Property 'java.class.path' is /home/epile/nutch- >> 
mapred/conf:/usr/lib/tools.jar:/home/epile/nutch-mapred/build/ >> classes:/home/epile/nutch-mapred/build:/home/epile/nutch-mapred/ >> build/test/classes:/home/epile/nutch-mapred/nutch-*.jar:/home/epile/ >> nutch-mapred/lib/commons-lang-2.1.jar:/home/epile/nutch-mapred/lib/ >> commons-logging-api-1.0.4.jar:/home/epile/nutch-mapred/lib/ >> concurrent-1.3.4.jar:/home/epile/nutch-mapred/lib/jakarta- >> oro-2.0.7.jar:/home/epile/nutch-mapred/lib/jetty-5.1.4.jar:/home/ >> epile/nutch-mapred/lib/junit-3.8.1.jar:/home/epile/nutch-mapred/lib/ >> lucene-1.9-rc1-dev.jar:/home/epile/nutch-mapred/lib/lucene-misc-1.9- >> rc1-dev.jar:/home/epile/nutch-mapred/lib/servlet-api.jar:/home/ >> epile/nutch-mapred/lib/taglibs-i18n.jar:/home/epile/nutch-mapred/ >> lib/xerces-2_6_2-apis.jar:/home/epile/nutch-mapred/lib/ >> xerces-2_6_2.jar:/home/epile/nutch-mapred/lib/jetty-ext/ant.jar:/ >> home/epile/nutch-mapred/lib/jetty-ext/commons-el.jar:/home/epile/ >> nutch-mapred/lib/jetty-ext/jasper-compiler.jar:/home/epile/nutch- >> mapred/lib/jetty-ext/jasper-runtime.jar:/home/epile/nutch-mapred/ >> lib/jetty-ext/jsp-api.jar >> 051201 210252 Property 'user.name' is epile >> 051201 210252 Property 'java.vm.specification.version' is 1.0 >> 051201 210252 Property 'java.home' is /usr/lib/j2sdk1.5-sun/jre >> 051201 210252 Property 'sun.arch.data.model' is 32 >> 051201 210252 Property 'user.language' is en >> 051201 210252 Property 'java.specification.vendor' is Sun >> Microsystems Inc. 
>> 051201 210252 Property 'java.vm.info' is mixed mode, sharing >> 051201 210252 Property 'java.version' is 1.5.0_03 >> 051201 210252 Property 'java.ext.dirs' is /usr/lib/j2sdk1.5-sun/jre/ >> lib/ext >> 051201 210252 Property 'sun.boot.class.path' is /usr/lib/j2sdk1.5- >> sun/jre/lib/rt.jar:/usr/lib/j2sdk1.5-sun/jre/lib/i18n.jar:/usr/lib/ >> j2sdk1.5-sun/jre/lib/sunrsasign.jar:/usr/lib/j2sdk1.5-sun/jre/lib/ >> jsse.jar:/usr/lib/j2sdk1.5-sun/jre/lib/jce.jar:/usr/lib/j2sdk1.5- >> sun/jre/lib/charsets.jar:/usr/lib/j2sdk1.5-sun/jre/classes >> 051201 210252 Property 'java.vendor' is Sun Microsystems Inc. >> 051201 210252 Property 'file.separator' is / >> 051201 210252 Property 'java.vendor.url.bug' is http://java.sun.com/ >> cgi-bin/bugreport.cgi >> 051201 210252 Property 'sun.io.unicode.encoding' is UnicodeLittle >> 051201 210252 Property 'sun.cpu.endian' is little >> 051201 210252 Property 'sun.cpu.isalist' is >> 051201 210252 Version Jetty/5.1.4 >> 051201 210252 Checking Resource aliases >> 051201 210253 Server connection on port 11000 from 192.168.15.51: >> starting >> 051201 210254 Started >> [EMAIL PROTECTED] >> 051201 210254 Started WebApplicationContext[/,/] >> 051201 210254 Started SocketListener on 0.0.0.0:7845 >> 051201 210254 Started [EMAIL PROTECTED] >> 051201 210257 Server connection on port 11000 from 192.168.15.50: >> starting >> 051201 210258 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210258 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210258 parsing /home/epile/mapred/local/jobTracker/ >> job_bby846.xml >> 051201 210258 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210258 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210258 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210258 parsing /home/epile/mapred/local/jobTracker/ >> job_bby846.xml >> 051201 210258 parsing file:/home/epile/nutch-mapred/conf/nutch- 
site.xml >> 051201 210259 Adding task 'task_m_ihqm4i' to set for tracker >> 'tracker_50075' >> 051201 210305 Task 'task_m_ihqm4i' has finished successfully. >> 051201 210305 Adding task 'task_r_f1ykb1' to set for tracker >> 'tracker_50075' >> 051201 210308 Task 'task_r_f1ykb1' has finished successfully. >> 051201 210310 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210310 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210310 parsing /home/epile/mapred/local/jobTracker/ >> job_haomlk.xml >> 051201 210310 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210310 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210310 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210310 parsing /home/epile/mapred/local/jobTracker/ >> job_haomlk.xml >> 051201 210310 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210311 Adding task 'task_m_a2frqg' to set for tracker >> 'tracker_50075' >> 051201 210314 Task 'task_m_a2frqg' has finished successfully. >> 051201 210314 Adding task 'task_r_sw6zcc' to set for tracker >> 'tracker_50075' >> 051201 210320 Task 'task_r_sw6zcc' has finished successfully. 
>> 051201 210322 Server connection on port 11000 from 192.168.15.50: >> exiting >> 051201 210323 Server connection on port 11000 from 192.168.15.50: >> starting >> 051201 210324 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210324 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210324 parsing /home/epile/mapred/local/jobTracker/ >> job_vcjx1z.xml >> 051201 210324 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210324 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210324 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210324 parsing /home/epile/mapred/local/jobTracker/ >> job_vcjx1z.xml >> 051201 210324 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210326 Adding task 'task_m_1oh66k' to set for tracker >> 'tracker_50075' >> 051201 210329 Task 'task_m_1oh66k' has finished successfully. >> 051201 210329 Adding task 'task_r_mfdo41' to set for tracker >> 'tracker_50075' >> 051201 210332 Task 'task_r_mfdo41' has finished successfully. >> 051201 210334 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210334 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210334 parsing /home/epile/mapred/local/jobTracker/ >> job_oyzp06.xml >> 051201 210334 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210334 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210334 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210334 parsing /home/epile/mapred/local/jobTracker/ >> job_oyzp06.xml >> 051201 210334 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210335 Adding task 'task_m_8o4pj1' to set for tracker >> 'tracker_50075' >> 051201 210338 Task 'task_m_8o4pj1' has finished successfully. 
>> 051201 210338 Adding task 'task_r_iv805p' to set for tracker >> 'tracker_50075' >> 051201 210341 Task 'task_r_iv805p' has finished successfully. >> 051201 210342 Server connection on port 11000 from 192.168.15.50: >> exiting >> 051201 210345 Server connection on port 11000 from 192.168.15.50: >> starting >> 051201 210346 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210346 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210346 parsing /home/epile/mapred/local/jobTracker/ >> job_r878fx.xml >> 051201 210346 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210346 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210346 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210346 parsing /home/epile/mapred/local/jobTracker/ >> job_r878fx.xml >> 051201 210346 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210347 Adding task 'task_m_ndssqr' to set for tracker >> 'tracker_50075' >> 051201 210353 Task 'task_m_ndssqr' has finished successfully. >> 051201 210353 Adding task 'task_r_184i7z' to set for tracker >> 'tracker_50075' >> 051201 210359 Task 'task_r_184i7z' has finished successfully. 
>> 051201 210400 Server connection on port 11000 from 192.168.15.50: >> exiting >> 051201 210401 Server connection on port 11000 from 192.168.15.50: >> starting >> 051201 210402 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210402 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210402 parsing /home/epile/mapred/local/jobTracker/ >> job_e336wf.xml >> 051201 210402 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210402 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210402 parsing file:/home/epile/nutch-mapred/conf/mapred- >> default.xml >> 051201 210402 parsing /home/epile/mapred/local/jobTracker/ >> job_e336wf.xml >> 051201 210402 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210402 Server handler on 11000 call error: >> java.io.IOException: No input directories specified in: NutchConf: >> nutch-default.xml , mapred-default.xml , /home/epile/mapred/local/ >> jobTracker/job_e336wf.xml , nutch-site.xml >> java.io.IOException: No input directories specified in: NutchConf: >> nutch-default.xml , mapred-default.xml , /home/epile/mapred/local/ >> jobTracker/job_e336wf.xml , nutch-site.xml >> at org.apache.nutch.mapred.InputFormatBase.listFiles >> (InputFormatBase.java:85) >> at org.apache.nutch.mapred.SequenceFileInputFormat.listFiles >> (SequenceFileInputFormat.java:41) >> at org.apache.nutch.mapred.InputFormatBase.getSplits >> (InputFormatBase.java:95) >> at org.apache.nutch.mapred.JobTracker$JobInProgress.launch >> (JobTracker.java:617) >> at org.apache.nutch.mapred.JobTracker.createJob(JobTracker.java:537) >> at org.apache.nutch.mapred.JobTracker.submitJob(JobTracker.java:439) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at sun.reflect.NativeMethodAccessorImpl.invoke >> (NativeMethodAccessorImpl.java:39) >> at sun.reflect.DelegatingMethodAccessorImpl.invoke >> (DelegatingMethodAccessorImpl.java:25) >> at 
java.lang.reflect.Method.invoke(Method.java:585) >> at org.apache.nutch.ipc.RPC$1.call(RPC.java:186) >> at org.apache.nutch.ipc.Server$Handler.run(Server.java:198) >> 051201 210403 Server connection on port 11000 from 192.168.15.50: >> exiting >> 051201 210249 parsing file:/home/epile/nutch-mapred/conf/nutch- >> default.xml >> 051201 210250 parsing file:/home/epile/nutch-mapred/conf/nutch- site.xml >> 051201 210250 Server listener on port 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210250 Server handler on 10000: starting >> 051201 210251 Server connection on port 10000 from 192.168.15.50: >> starting >> 051201 210253 Server connection on port 10000 from 192.168.15.51: >> starting >> 051201 210254 Server connection on port 10000 from 192.168.15.51: >> starting >> 051201 210254 Got brand-new heartbeat from mapred02.blah.com:50010 >> 051201 210254 Block report from mapred02.blah.com:50010: 0 blocks. >> 051201 210255 Server connection on port 10000 from 192.168.15.50: >> starting >> 051201 210255 Completed file /user/epile/seeds/.urls.txt.crc, at >> holder NDFSClient_-1308246188. There is/are only 1 copies of block >> blk_-1752043025371703199, so replicating up to 3 >> 051201 210255 Completed file /user/epile/seeds/urls.txt, at holder >> NDFSClient_-1308246188. 
There is/are only 1 copies of block >> blk_7440619803556266327, so replicating up to 3 >> 051201 210256 Server connection on port 10000 from 192.168.15.50: >> exiting >> 051201 210257 Server connection on port 10000 from 192.168.15.50: >> starting >> 051201 210257 Completed file /home/epile/mapred/system/ >> submit_x4iojg/.job.xml.crc, at holder NDFSClient_1478832272. There >> is/are only 1 copies of block blk_-4529820447367794240, so >> replicating up to 3 >> 051201 210257 Completed file /home/epile/mapred/system/ >> submit_x4iojg/job.xml, at holder NDFSClient_1478832272. There is/ >> are only 1 copies of block blk_1156728927747436860, so replicating >> up to 3 >> 051201 210301 Server connection on port 10000 from 192.168.15.51: >> starting >> 051201 210302 Server connection on port 10000 from 192.168.15.51: >> exiting >> 051201 210307 Server connection on port 10000 from 192.168.15.51: >> starting >> 051201 210307 Completed file /home/epile/mapred/temp/inject- >> temp-108905173/.part-00000.crc, at holder NDFSClient_-658104922. >> There is/are only 1 copies of block blk_6188484864483763693, so >> replicating up to 3 >> 051201 210307 Completed file /home/epile/mapred/temp/inject- >> temp-108905173/part-00000, at holder NDFSClient_-658104922. There >> is/are only 1 copies of block blk_-3096236394792281687, so >> replicating up to 3 >> 051201 210308 Server connection on port 10000 from 192.168.15.51: >> exiting >> 051201 210309 Completed file /home/epile/mapred/system/ >> submit_qs9e69/.job.xml.crc, at holder NDFSClient_1478832272. There >> is/are only 1 copies of block blk_-1428427177411186743, so >> replicating up to 3 >> 051201 210309 Completed file /home/epile/mapred/system/ >> submit_qs9e69/job.xml, at holder NDFSClient_1478832272. 
There is/ >> are only 1 copies of block blk_-1436946628881789312, so replicating >> up to 3 >> 051201 210313 Server connection on port 10000 from 192.168.15.51: >> starting >> 051201 210313 Server connection on port 10000 from 192.168.15.51: >> exiting >> 051201 210316 Server connection on port 10000 from 192.168.15.51: >> starting >> 051201 210316 Completed file /user/epile/crawldb/1283885563/ >> part-00000/.data.crc, at holder NDFSClient_-1295537174. There is/ >> are only 1 copies of block blk_-4102979784890914229, so replicating >> up to 3 >> 051201 210316 Completed file /user/epile/crawldb/1283885563/ >> part-00000/data, at holder NDFSClient_-1295537174. There is/are >> only 1 copies of block blk_-902708794151400000, so replicating up to 3 >> 051201 210316 Completed file /user/epile/crawldb/1283885563/ >> part-00000/.index.crc, at holder NDFSClient_-1295537174. There is/ >> are only 1 copies of block blk_-5931238748697806484, so replicating >> up to 3 >> 051201 210316 Completed file /user/epile/crawldb/1283885563/ >> part-00000/index, at holder NDFSClient_-1295537174. There is/are >> only 1 copies of block blk_-8170047166085229022, so replicating up to 3 >> 051201 210317 Server connection on port 10000 from 192.168.15.51: >> exiting >> 051201 210322 Server connection on port 10000 from 192.168.15.50: >> exiting >> 051201 210323 Server connection on port 10000 from 192.168.15.50: >> starting >> 051201 210324 Completed file /home/epile/mapred/system/ >> submit_c8o35n/.job.xml.crc, at holder NDFSClient_116468727. There >> is/are only 1 copies of block blk_-3279058159932025125, so >> replicating up to 3 >> 051201 210324 Completed file /home/epile/mapred/system/ >> submit_c8o35n/job.xml, at holder NDFSClient_116468727. 
>> There is/are only 1 copies of block blk_4281417546376498182, so replicating up to 3
>> 051201 210328 Server connection on port 10000 from 192.168.15.51: starting
>> 051201 210328 Server connection on port 10000 from 192.168.15.51: exiting
>> 051201 210331 Server connection on port 10000 from 192.168.15.51: starting
>> 051201 210331 Completed file /home/epile/mapred/temp/generate-temp-1957426409/.part-00000.crc, at holder NDFSClient_1057314937. There is/are only 1 copies of block blk_3703840583708700181, so replicating up to 3
>> 051201 210331 Completed file /home/epile/mapred/temp/generate-temp-1957426409/part-00000, at holder NDFSClient_1057314937. There is/are only 1 copies of block blk_-697396903224795005, so replicating up to 3
>> 051201 210332 Server connection on port 10000 from 192.168.15.51: exiting
>> 051201 210334 Completed file /home/epile/mapred/system/submit_y7hvpq/.job.xml.crc, at holder NDFSClient_116468727. There is/are only 1 copies of block blk_-5486480837709133340, so replicating up to 3
>> 051201 210334 Completed file /home/epile/mapred/system/submit_y7hvpq/job.xml, at holder NDFSClient_116468727. There is/are only 1 copies of block blk_-1013524710870268885, so replicating up to 3
>> 051201 210337 Server connection on port 10000 from 192.168.15.51: starting
>> 051201 210337 Server connection on port 10000 from 192.168.15.51: exiting
>> 051201 210340 Server connection on port 10000 from 192.168.15.51: starting
>> 051201 210340 Completed file /user/epile/segments/20051201210323/crawl_generate/.part-00000.crc, at holder NDFSClient_1881696276. There is/are only 1 copies of block blk_-8147654018606192317, so replicating up to 3
>> 051201 210340 Completed file /user/epile/segments/20051201210323/crawl_generate/part-00000, at holder NDFSClient_1881696276. There is/are only 1 copies of block blk_3501261362541032446, so replicating up to 3
>> 051201 210341 Server connection on port 10000 from 192.168.15.51: exiting
>> 051201 210342 Server connection on port 10000 from 192.168.15.50: exiting
>> 051201 210343 Server connection on port 10000 from 192.168.15.50: starting
>> 051201 210344 Server connection on port 10000 from 192.168.15.50: exiting
>> 051201 210345 Server connection on port 10000 from 192.168.15.50: starting
>> 051201 210346 Completed file /home/epile/mapred/system/submit_z4ug5y/.job.xml.crc, at holder NDFSClient_83920825. There is/are only 1 copies of block blk_-6870895568417527795, so replicating up to 3
>> 051201 210346 Completed file /home/epile/mapred/system/submit_z4ug5y/job.xml, at holder NDFSClient_83920825. There is/are only 1 copies of block blk_926898854081987743, so replicating up to 3
>> 051201 210349 Server connection on port 10000 from 192.168.15.51: starting
>> 051201 210350 Server connection on port 10000 from 192.168.15.51: exiting
>> 051201 210355 Server connection on port 10000 from 192.168.15.51: starting
>> 051201 210356 Completed file /user/epile/segments/20051201210323/crawl_fetch/part-00000/.data.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_9182869764851924134, so replicating up to 3
>> 051201 210356 Completed file /user/epile/segments/20051201210323/crawl_fetch/part-00000/data, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_-2756164678598933213, so replicating up to 3
>> 051201 210356 Completed file /user/epile/segments/20051201210323/crawl_fetch/part-00000/.index.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_-5075677998560174819, so replicating up to 3
>> 051201 210356 Completed file /user/epile/segments/20051201210323/crawl_fetch/part-00000/index, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_-1711420249337549804, so replicating up to 3
>> 051201 210356 Completed file /user/epile/segments/20051201210323/content/part-00000/.data.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_-8676288182071306365, so replicating up to 3
>> 051201 210356 Completed file /user/epile/segments/20051201210323/content/part-00000/data, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_5712126219901943888, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/content/part-00000/.index.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_8727239729283794406, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/content/part-00000/index, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_7442946611411036186, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/parse_text/part-00000/.data.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_7308454053611234058, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/parse_text/part-00000/data, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_9154249508503313268, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/parse_text/part-00000/.index.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_5550390520109217677, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/parse_text/part-00000/index, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_-8335442137185412194, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/parse_data/part-00000/.data.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_7793344192339293515, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/parse_data/part-00000/data, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_6340855549657308893, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/parse_data/part-00000/.index.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_2705525466413868291, so replicating up to 3
>> 051201 210357 Completed file /user/epile/segments/20051201210323/parse_data/part-00000/index, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_-3587285255992396675, so replicating up to 3
>> 051201 210358 Completed file /user/epile/segments/20051201210323/crawl_parse/.part-00000.crc, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_-2663159440041981382, so replicating up to 3
>> 051201 210358 Completed file /user/epile/segments/20051201210323/crawl_parse/part-00000, at holder NDFSClient_647238187. There is/are only 1 copies of block blk_-4597337746504817385, so replicating up to 3
>> 051201 210358 Server connection on port 10000 from 192.168.15.51: exiting
>> 051201 210400 Server connection on port 10000 from 192.168.15.50: exiting
>> 051201 210401 Server connection on port 10000 from 192.168.15.50: starting
>> 051201 210401 Completed file /home/epile/mapred/system/submit_ux0lpz/.job.xml.crc, at holder NDFSClient_-919051633.
>> There is/are only 1 copies of block blk_-2079585474380663469, so replicating up to 3
>> 051201 210402 Completed file /home/epile/mapred/system/submit_ux0lpz/job.xml, at holder NDFSClient_-919051633. There is/are only 1 copies of block blk_-44160486757954604, so replicating up to 3
>> 051201 210403 Server connection on port 10000 from 192.168.15.50: exiting

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
