Hi all,
I just started experimenting with the mapred branch, but unfortunately
I'm not even able to get an entire crawl cycle to complete properly.
I'm using 2 machines:
mapred01: master that acts only as the JobTracker (doesn't crawl)
mapred02: slave that executes the tasks
Both machines have exactly the same install and config files.
Here is what I put in nutch-site.xml:
<property>
<name>fs.default.name</name>
<value>mapred01:10000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>mapred01:11000</value>
</property>
<property>
<name>ndfs.name.dir</name>
<value>/home/epile/ndfs/name</value>
</property>
<property>
<name>ndfs.data.dir</name>
<value>/home/epile/ndfs/data</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/epile/mapred/local</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/epile/mapred/system</value>
</property>
<property>
<name>mapred.temp.dir</name>
<value>/home/epile/mapred/temp</value>
</property>
Then, I do the following steps on the master:
1. echo mapred02 > .slaves
2. start-all.sh
3. mkdir seeds
4. echo http://www.cnn.com/ > seeds/urls.txt
   ndfs -put seeds seeds
5. inject crawldb seeds
6. generate crawldb segments
7. fetch segments/SEG_NAME (looked up using nutch ndfs -ls segments)
8. invertlinks linkdb segments/SEG_NAME
Up to step 7, everything completes properly.
However, step 8 always fails and I get this java exception:
Exception in thread "main" java.io.IOException: No input directories
specified in: NutchConf: nutch-default.xml , mapred-default.xml ,
/home/epile/mapred/local/jobTracker/job_gjrlvu.xml , nutch-site.xml
at org.apache.nutch.ipc.Client.call(Client.java:294)
at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
at $Proxy0.submitJob(Unknown Source)
at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:131)
at org.apache.nutch.crawl.LinkDb.main(LinkDb.java:192)
I saw a few messages on the dev mailing list about similar "no input
directories specified" errors, but I'm not really clear on what causes
this error.
Looking at /home/epile/mapred/local/jobTracker/job_gjrlvu.xml, I
didn't see any missing input.dir properties.
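In case it helps anyone reproduce that check, here is a minimal Python sketch of it. It lists every property in a job file whose name ends in input.dir, together with its value, so that an entry which is present but empty stands out. The function name, the sample XML, its root element, and the mapred.input.dir property name are assumptions for illustration; they are not copied from the actual job file.

```python
import xml.etree.ElementTree as ET


def input_dir_properties(job_xml_text):
    """Return {name: value} for every property whose name ends with 'input.dir'."""
    root = ET.fromstring(job_xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.endswith("input.dir"):
            found[name] = value
    return found


# Hypothetical sample: the root element and the property names/values are
# assumptions for illustration, not the contents of the real job_gjrlvu.xml.
SAMPLE = """<conf>
  <property><name>mapred.input.dir</name><value>segments/SEG_NAME/parse_data</value></property>
  <property><name>mapred.job.tracker</name><value>mapred01:11000</value></property>
</conf>"""

if __name__ == "__main__":
    props = input_dir_properties(SAMPLE)
    if not props:
        print("no input.dir property found")
    for name, value in props.items():
        # An empty value is reported separately from a missing property.
        print(name, "=", value if value else "(empty)")
```

To run it against a real file, read the file's contents and pass the text to input_dir_properties().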
My configuration is very simple: both machines use exactly the same
paths and the same users. There is no distinction between them besides
their hostnames and their respective tasks.
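A quick way to back up the claim that the configs match would be to compare the property sets from each machine's nutch-site.xml. The sketch below is a hypothetical helper: the function names, the sample XML, and the root element are assumptions, and the differing port in the second sample is made up purely to show what a mismatch looks like. It only compares name/value pairs and says nothing about the install itself.

```python
import xml.etree.ElementTree as ET


def load_properties(conf_xml_text):
    """Parse <property><name/><value/></property> entries into a dict."""
    root = ET.fromstring(conf_xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name:
            props[name] = (prop.findtext("value") or "").strip()
    return props


def diff_properties(a, b):
    """Return {name: (value_in_a, value_in_b)} for properties that differ.

    A value of None means the property is missing on that side.
    """
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Hypothetical samples: the root element and the mismatching port are
# assumptions for illustration only.
MASTER = """<conf>
  <property><name>fs.default.name</name><value>mapred01:10000</value></property>
  <property><name>mapred.job.tracker</name><value>mapred01:11000</value></property>
</conf>"""
SLAVE = """<conf>
  <property><name>fs.default.name</name><value>mapred01:10000</value></property>
  <property><name>mapred.job.tracker</name><value>mapred01:11001</value></property>
</conf>"""

if __name__ == "__main__":
    differences = diff_properties(load_properties(MASTER), load_properties(SLAVE))
    for name, (left, right) in differences.items():
        print(name, ":", left, "!=", right)
```

An empty result from diff_properties() means the two files define the same properties with the same values.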
Am I missing something, or am I doing something wrong?
I've attached the whole output, as well as the jobtracker and namenode
logs.
Any help would be greatly appreciated.
Thanks,
--Flo
mapred02: starting datanode, logging to /home/epile/log/nutch-epile-
datanode-mapred02.blah.com.log
mapred02: 051202 010100 10 parsing file:/home/epile/nutch-mapred/
conf/nutch-default.xml
rsync from mapred01:/home/epile/nutch-mapred
starting namenode, logging to /home/epile/log/nutch-epile-namenode-
mapred01.blah.com.log
051201 210249 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210250 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
rsync from mapred01:/home/epile/nutch-mapred
starting jobtracker, logging to /home/epile/log/nutch-epile-
jobtracker-mapred01.blah.com.log
051201 210251 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
mapred02: starting tasktracker, logging to /home/epile/log/nutch-
epile-tasktracker-mapred02.blah.com.log
mapred02: 051202 010104 parsing file:/home/epile/nutch-mapred/conf/
nutch-default.xml
051201 210254 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210255 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210255 No FS indicated, using default:mapred01:10000
051201 210255 Client connection to 192.168.15.50:10000: starting
051201 210256 Injector: starting
051201 210256 Injector: crawlDb: crawldb
051201 210256 Injector: urlDir: seeds
051201 210256 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210257 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210257 Injector: Converting injected urls to crawl db entries.
051201 210257 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210257 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210257 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210257 Client connection to 192.168.15.50:11000: starting
051201 210257 Client connection to 192.168.15.50:10000: starting
051201 210258 Running job: job_bby846
051201 210259 map 0%
051201 210302 map 100%
051201 210309 reduce 100%
051201 210309 Job complete: job_bby846
051201 210309 Injector: Merging injected urls into crawl db.
051201 210309 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210309 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210309 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210310 Running job: job_haomlk
051201 210311 map 0%
051201 210314 map 100%
051201 210318 reduce 100%
051201 210321 Job complete: job_haomlk
051201 210322 Injector: done
051201 210323 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210323 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210323 Generator: starting
051201 210323 Generator: segment: segments/20051201210323
051201 210323 Generator: Selecting most-linked urls due for fetch.
051201 210323 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210323 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210323 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210323 Client connection to 192.168.15.50:11000: starting
051201 210323 Client connection to 192.168.15.50:10000: starting
051201 210324 Running job: job_vcjx1z
051201 210325 map 0%
051201 210330 map 100%
051201 210333 reduce 100%
051201 210333 Job complete: job_vcjx1z
051201 210333 Generator: Partitioning selected urls by host, for
politeness.
051201 210333 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210333 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210333 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210334 Running job: job_oyzp06
051201 210335 map 0%
051201 210339 map 100%
051201 210342 reduce 100%
051201 210342 Job complete: job_oyzp06
051201 210342 Generator: done.
051201 210343 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210343 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210343 No FS indicated, using default:mapred01:10000
051201 210344 Client connection to 192.168.15.50:10000: starting
051201 210345 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210345 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210345 Fetcher: starting
051201 210345 Fetcher: segment: segments/20051201210323
051201 210345 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210345 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210345 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210345 Client connection to 192.168.15.50:11000: starting
051201 210345 Client connection to 192.168.15.50:10000: starting
051201 210346 Running job: job_r878fx
051201 210347 map 0%
051201 210354 map 100%
051201 210359 reduce 100%
051201 210359 Job complete: job_r878fx
051201 210359 Fetcher: done
051201 210400 LinkDb: starting
051201 210400 LinkDb: linkdb: linkdb
051201 210400 LinkDb: segments: segments/20051201210323
051201 210401 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210401 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210401 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210401 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210401 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210401 Client connection to 192.168.15.50:11000: starting
051201 210401 Client connection to 192.168.15.50:10000: starting
Exception in thread "main" java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml , /home/epile/mapred/local/jobTracker/job_e336wf.xml , nutch-site.xml
at org.apache.nutch.ipc.Client.call(Client.java:294)
at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
at $Proxy0.submitJob(Unknown Source)
at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:131)
at org.apache.nutch.crawl.LinkDb.main(LinkDb.java:192)
051201 210251 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210251 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210251 Client connection to 192.168.15.50:10000: starting
051201 210252 Server listener on port 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Server handler on 11000: starting
051201 210252 Property 'java.runtime.name' is Java(TM) 2 Runtime
Environment, Standard Edition
051201 210252 Property 'sun.boot.library.path' is /usr/lib/j2sdk1.5-
sun/jre/lib/i386
051201 210252 Property 'java.vm.version' is 1.5.0_03-b07
051201 210252 Property 'java.vm.vendor' is Sun Microsystems Inc.
051201 210252 Property 'java.vendor.url' is http://java.sun.com/
051201 210252 Property 'path.separator' is :
051201 210252 Property 'java.vm.name' is Java HotSpot(TM) Client VM
051201 210252 Property 'file.encoding.pkg' is sun.io
051201 210252 Property 'user.country' is US
051201 210252 Property 'sun.os.patch.level' is unknown
051201 210252 Property 'java.vm.specification.name' is Java Virtual
Machine Specification
051201 210252 Property 'user.dir' is /home/epile/nutch-mapred
051201 210252 Property 'java.runtime.version' is 1.5.0_03-b07
051201 210252 Property 'java.awt.graphicsenv' is
sun.awt.X11GraphicsEnvironment
051201 210252 Property 'java.endorsed.dirs' is /usr/lib/j2sdk1.5-
sun/jre/lib/endorsed
051201 210252 Property 'os.arch' is i386
051201 210252 Property 'java.io.tmpdir' is /tmp
051201 210252 Property 'line.separator' is
051201 210252 Property 'java.vm.specification.vendor' is Sun
Microsystems Inc.
051201 210252 Property 'os.name' is Linux
051201 210252 Property 'sun.jnu.encoding' is ANSI_X3.4-1968
051201 210252 Property 'java.library.path' is /usr/lib/j2sdk1.5-sun/
jre/lib/i386/client:/usr/lib/j2sdk1.5-sun/jre/lib/i386:/usr/lib/
j2sdk1.5-sun/jre/../lib/i386
051201 210252 Property 'java.specification.name' is Java Platform
API Specification
051201 210252 Property 'java.class.version' is 49.0
051201 210252 Property 'sun.management.compiler' is HotSpot Client
Compiler
051201 210252 Property 'os.version' is 2.6.12-9-386
051201 210252 Property 'user.home' is /home/epile
051201 210252 Property 'user.timezone' is GMT
051201 210252 Property 'java.awt.printerjob' is sun.print.PSPrinterJob
051201 210252 Property 'file.encoding' is ANSI_X3.4-1968
051201 210252 Property 'java.specification.version' is 1.5
051201 210252 Server handler on 11000: starting
051201 210252 Property 'java.class.path' is /home/epile/nutch-
mapred/conf:/usr/lib/tools.jar:/home/epile/nutch-mapred/build/
classes:/home/epile/nutch-mapred/build:/home/epile/nutch-mapred/
build/test/classes:/home/epile/nutch-mapred/nutch-*.jar:/home/epile/
nutch-mapred/lib/commons-lang-2.1.jar:/home/epile/nutch-mapred/lib/
commons-logging-api-1.0.4.jar:/home/epile/nutch-mapred/lib/
concurrent-1.3.4.jar:/home/epile/nutch-mapred/lib/jakarta-
oro-2.0.7.jar:/home/epile/nutch-mapred/lib/jetty-5.1.4.jar:/home/
epile/nutch-mapred/lib/junit-3.8.1.jar:/home/epile/nutch-mapred/lib/
lucene-1.9-rc1-dev.jar:/home/epile/nutch-mapred/lib/lucene-misc-1.9-
rc1-dev.jar:/home/epile/nutch-mapred/lib/servlet-api.jar:/home/
epile/nutch-mapred/lib/taglibs-i18n.jar:/home/epile/nutch-mapred/
lib/xerces-2_6_2-apis.jar:/home/epile/nutch-mapred/lib/
xerces-2_6_2.jar:/home/epile/nutch-mapred/lib/jetty-ext/ant.jar:/
home/epile/nutch-mapred/lib/jetty-ext/commons-el.jar:/home/epile/
nutch-mapred/lib/jetty-ext/jasper-compiler.jar:/home/epile/nutch-
mapred/lib/jetty-ext/jasper-runtime.jar:/home/epile/nutch-mapred/
lib/jetty-ext/jsp-api.jar
051201 210252 Property 'user.name' is epile
051201 210252 Property 'java.vm.specification.version' is 1.0
051201 210252 Property 'java.home' is /usr/lib/j2sdk1.5-sun/jre
051201 210252 Property 'sun.arch.data.model' is 32
051201 210252 Property 'user.language' is en
051201 210252 Property 'java.specification.vendor' is Sun
Microsystems Inc.
051201 210252 Property 'java.vm.info' is mixed mode, sharing
051201 210252 Property 'java.version' is 1.5.0_03
051201 210252 Property 'java.ext.dirs' is /usr/lib/j2sdk1.5-sun/jre/
lib/ext
051201 210252 Property 'sun.boot.class.path' is /usr/lib/j2sdk1.5-
sun/jre/lib/rt.jar:/usr/lib/j2sdk1.5-sun/jre/lib/i18n.jar:/usr/lib/
j2sdk1.5-sun/jre/lib/sunrsasign.jar:/usr/lib/j2sdk1.5-sun/jre/lib/
jsse.jar:/usr/lib/j2sdk1.5-sun/jre/lib/jce.jar:/usr/lib/j2sdk1.5-
sun/jre/lib/charsets.jar:/usr/lib/j2sdk1.5-sun/jre/classes
051201 210252 Property 'java.vendor' is Sun Microsystems Inc.
051201 210252 Property 'file.separator' is /
051201 210252 Property 'java.vendor.url.bug' is http://java.sun.com/
cgi-bin/bugreport.cgi
051201 210252 Property 'sun.io.unicode.encoding' is UnicodeLittle
051201 210252 Property 'sun.cpu.endian' is little
051201 210252 Property 'sun.cpu.isalist' is
051201 210252 Version Jetty/5.1.4
051201 210252 Checking Resource aliases
051201 210253 Server connection on port 11000 from 192.168.15.51:
starting
051201 210254 Started
[EMAIL PROTECTED]
051201 210254 Started WebApplicationContext[/,/]
051201 210254 Started SocketListener on 0.0.0.0:7845
051201 210254 Started [EMAIL PROTECTED]
051201 210257 Server connection on port 11000 from 192.168.15.50:
starting
051201 210258 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210258 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210258 parsing /home/epile/mapred/local/jobTracker/
job_bby846.xml
051201 210258 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210258 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210258 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210258 parsing /home/epile/mapred/local/jobTracker/
job_bby846.xml
051201 210258 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210259 Adding task 'task_m_ihqm4i' to set for tracker
'tracker_50075'
051201 210305 Task 'task_m_ihqm4i' has finished successfully.
051201 210305 Adding task 'task_r_f1ykb1' to set for tracker
'tracker_50075'
051201 210308 Task 'task_r_f1ykb1' has finished successfully.
051201 210310 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210310 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210310 parsing /home/epile/mapred/local/jobTracker/
job_haomlk.xml
051201 210310 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210310 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210310 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210310 parsing /home/epile/mapred/local/jobTracker/
job_haomlk.xml
051201 210310 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210311 Adding task 'task_m_a2frqg' to set for tracker
'tracker_50075'
051201 210314 Task 'task_m_a2frqg' has finished successfully.
051201 210314 Adding task 'task_r_sw6zcc' to set for tracker
'tracker_50075'
051201 210320 Task 'task_r_sw6zcc' has finished successfully.
051201 210322 Server connection on port 11000 from 192.168.15.50:
exiting
051201 210323 Server connection on port 11000 from 192.168.15.50:
starting
051201 210324 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210324 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210324 parsing /home/epile/mapred/local/jobTracker/
job_vcjx1z.xml
051201 210324 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210324 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210324 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210324 parsing /home/epile/mapred/local/jobTracker/
job_vcjx1z.xml
051201 210324 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210326 Adding task 'task_m_1oh66k' to set for tracker
'tracker_50075'
051201 210329 Task 'task_m_1oh66k' has finished successfully.
051201 210329 Adding task 'task_r_mfdo41' to set for tracker
'tracker_50075'
051201 210332 Task 'task_r_mfdo41' has finished successfully.
051201 210334 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210334 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210334 parsing /home/epile/mapred/local/jobTracker/
job_oyzp06.xml
051201 210334 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210334 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210334 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210334 parsing /home/epile/mapred/local/jobTracker/
job_oyzp06.xml
051201 210334 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210335 Adding task 'task_m_8o4pj1' to set for tracker
'tracker_50075'
051201 210338 Task 'task_m_8o4pj1' has finished successfully.
051201 210338 Adding task 'task_r_iv805p' to set for tracker
'tracker_50075'
051201 210341 Task 'task_r_iv805p' has finished successfully.
051201 210342 Server connection on port 11000 from 192.168.15.50:
exiting
051201 210345 Server connection on port 11000 from 192.168.15.50:
starting
051201 210346 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210346 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210346 parsing /home/epile/mapred/local/jobTracker/
job_r878fx.xml
051201 210346 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210346 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210346 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210346 parsing /home/epile/mapred/local/jobTracker/
job_r878fx.xml
051201 210346 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210347 Adding task 'task_m_ndssqr' to set for tracker
'tracker_50075'
051201 210353 Task 'task_m_ndssqr' has finished successfully.
051201 210353 Adding task 'task_r_184i7z' to set for tracker
'tracker_50075'
051201 210359 Task 'task_r_184i7z' has finished successfully.
051201 210400 Server connection on port 11000 from 192.168.15.50:
exiting
051201 210401 Server connection on port 11000 from 192.168.15.50:
starting
051201 210402 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210402 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210402 parsing /home/epile/mapred/local/jobTracker/
job_e336wf.xml
051201 210402 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210402 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210402 parsing file:/home/epile/nutch-mapred/conf/mapred-
default.xml
051201 210402 parsing /home/epile/mapred/local/jobTracker/
job_e336wf.xml
051201 210402 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210402 Server handler on 11000 call error: java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml , /home/epile/mapred/local/jobTracker/job_e336wf.xml , nutch-site.xml
java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml , /home/epile/mapred/local/jobTracker/job_e336wf.xml , nutch-site.xml
at org.apache.nutch.mapred.InputFormatBase.listFiles(InputFormatBase.java:85)
at org.apache.nutch.mapred.SequenceFileInputFormat.listFiles(SequenceFileInputFormat.java:41)
at org.apache.nutch.mapred.InputFormatBase.getSplits(InputFormatBase.java:95)
at org.apache.nutch.mapred.JobTracker$JobInProgress.launch(JobTracker.java:617)
at org.apache.nutch.mapred.JobTracker.createJob(JobTracker.java:537)
at org.apache.nutch.mapred.JobTracker.submitJob(JobTracker.java:439)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at org.apache.nutch.ipc.RPC$1.call(RPC.java:186)
at org.apache.nutch.ipc.Server$Handler.run(Server.java:198)
051201 210403 Server connection on port 11000 from 192.168.15.50:
exiting
051201 210249 parsing file:/home/epile/nutch-mapred/conf/nutch-
default.xml
051201 210250 parsing file:/home/epile/nutch-mapred/conf/nutch-
site.xml
051201 210250 Server listener on port 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210250 Server handler on 10000: starting
051201 210251 Server connection on port 10000 from 192.168.15.50:
starting
051201 210253 Server connection on port 10000 from 192.168.15.51:
starting
051201 210254 Server connection on port 10000 from 192.168.15.51:
starting
051201 210254 Got brand-new heartbeat from mapred02.blah.com:50010
051201 210254 Block report from mapred02.blah.com:50010: 0 blocks.
051201 210255 Server connection on port 10000 from 192.168.15.50:
starting
051201 210255 Completed file /user/epile/seeds/.urls.txt.crc, at
holder NDFSClient_-1308246188. There is/are only 1 copies of block
blk_-1752043025371703199, so replicating up to 3
051201 210255 Completed file /user/epile/seeds/urls.txt, at holder
NDFSClient_-1308246188. There is/are only 1 copies of block
blk_7440619803556266327, so replicating up to 3
051201 210256 Server connection on port 10000 from 192.168.15.50:
exiting
051201 210257 Server connection on port 10000 from 192.168.15.50:
starting
051201 210257 Completed file /home/epile/mapred/system/
submit_x4iojg/.job.xml.crc, at holder NDFSClient_1478832272. There
is/are only 1 copies of block blk_-4529820447367794240, so
replicating up to 3
051201 210257 Completed file /home/epile/mapred/system/
submit_x4iojg/job.xml, at holder NDFSClient_1478832272. There is/
are only 1 copies of block blk_1156728927747436860, so replicating
up to 3
051201 210301 Server connection on port 10000 from 192.168.15.51:
starting
051201 210302 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210307 Server connection on port 10000 from 192.168.15.51:
starting
051201 210307 Completed file /home/epile/mapred/temp/inject-
temp-108905173/.part-00000.crc, at holder NDFSClient_-658104922.
There is/are only 1 copies of block blk_6188484864483763693, so
replicating up to 3
051201 210307 Completed file /home/epile/mapred/temp/inject-
temp-108905173/part-00000, at holder NDFSClient_-658104922. There
is/are only 1 copies of block blk_-3096236394792281687, so
replicating up to 3
051201 210308 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210309 Completed file /home/epile/mapred/system/
submit_qs9e69/.job.xml.crc, at holder NDFSClient_1478832272. There
is/are only 1 copies of block blk_-1428427177411186743, so
replicating up to 3
051201 210309 Completed file /home/epile/mapred/system/
submit_qs9e69/job.xml, at holder NDFSClient_1478832272. There is/
are only 1 copies of block blk_-1436946628881789312, so replicating
up to 3
051201 210313 Server connection on port 10000 from 192.168.15.51:
starting
051201 210313 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210316 Server connection on port 10000 from 192.168.15.51:
starting
051201 210316 Completed file /user/epile/crawldb/1283885563/
part-00000/.data.crc, at holder NDFSClient_-1295537174. There is/
are only 1 copies of block blk_-4102979784890914229, so replicating
up to 3
051201 210316 Completed file /user/epile/crawldb/1283885563/
part-00000/data, at holder NDFSClient_-1295537174. There is/are
only 1 copies of block blk_-902708794151400000, so replicating up to 3
051201 210316 Completed file /user/epile/crawldb/1283885563/
part-00000/.index.crc, at holder NDFSClient_-1295537174. There is/
are only 1 copies of block blk_-5931238748697806484, so replicating
up to 3
051201 210316 Completed file /user/epile/crawldb/1283885563/
part-00000/index, at holder NDFSClient_-1295537174. There is/are
only 1 copies of block blk_-8170047166085229022, so replicating up
to 3
051201 210317 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210322 Server connection on port 10000 from 192.168.15.50:
exiting
051201 210323 Server connection on port 10000 from 192.168.15.50:
starting
051201 210324 Completed file /home/epile/mapred/system/
submit_c8o35n/.job.xml.crc, at holder NDFSClient_116468727. There
is/are only 1 copies of block blk_-3279058159932025125, so
replicating up to 3
051201 210324 Completed file /home/epile/mapred/system/
submit_c8o35n/job.xml, at holder NDFSClient_116468727. There is/
are only 1 copies of block blk_4281417546376498182, so replicating
up to 3
051201 210328 Server connection on port 10000 from 192.168.15.51:
starting
051201 210328 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210331 Server connection on port 10000 from 192.168.15.51:
starting
051201 210331 Completed file /home/epile/mapred/temp/generate-
temp-1957426409/.part-00000.crc, at holder NDFSClient_1057314937.
There is/are only 1 copies of block blk_3703840583708700181, so
replicating up to 3
051201 210331 Completed file /home/epile/mapred/temp/generate-
temp-1957426409/part-00000, at holder NDFSClient_1057314937. There
is/are only 1 copies of block blk_-697396903224795005, so
replicating up to 3
051201 210332 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210334 Completed file /home/epile/mapred/system/
submit_y7hvpq/.job.xml.crc, at holder NDFSClient_116468727. There
is/are only 1 copies of block blk_-5486480837709133340, so
replicating up to 3
051201 210334 Completed file /home/epile/mapred/system/
submit_y7hvpq/job.xml, at holder NDFSClient_116468727. There is/
are only 1 copies of block blk_-1013524710870268885, so replicating
up to 3
051201 210337 Server connection on port 10000 from 192.168.15.51:
starting
051201 210337 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210340 Server connection on port 10000 from 192.168.15.51:
starting
051201 210340 Completed file /user/epile/segments/20051201210323/
crawl_generate/.part-00000.crc, at holder NDFSClient_1881696276.
There is/are only 1 copies of block blk_-8147654018606192317, so
replicating up to 3
051201 210340 Completed file /user/epile/segments/20051201210323/
crawl_generate/part-00000, at holder NDFSClient_1881696276. There
is/are only 1 copies of block blk_3501261362541032446, so
replicating up to 3
051201 210341 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210342 Server connection on port 10000 from 192.168.15.50:
exiting
051201 210343 Server connection on port 10000 from 192.168.15.50:
starting
051201 210344 Server connection on port 10000 from 192.168.15.50:
exiting
051201 210345 Server connection on port 10000 from 192.168.15.50:
starting
051201 210346 Completed file /home/epile/mapred/system/
submit_z4ug5y/.job.xml.crc, at holder NDFSClient_83920825. There
is/are only 1 copies of block blk_-6870895568417527795, so
replicating up to 3
051201 210346 Completed file /home/epile/mapred/system/
submit_z4ug5y/job.xml, at holder NDFSClient_83920825. There is/are
only 1 copies of block blk_926898854081987743, so replicating up to 3
051201 210349 Server connection on port 10000 from 192.168.15.51:
starting
051201 210350 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210355 Server connection on port 10000 from 192.168.15.51:
starting
051201 210356 Completed file /user/epile/segments/20051201210323/
crawl_fetch/part-00000/.data.crc, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_9182869764851924134, so
replicating up to 3
051201 210356 Completed file /user/epile/segments/20051201210323/
crawl_fetch/part-00000/data, at holder NDFSClient_647238187. There
is/are only 1 copies of block blk_-2756164678598933213, so
replicating up to 3
051201 210356 Completed file /user/epile/segments/20051201210323/
crawl_fetch/part-00000/.index.crc, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_-5075677998560174819, so
replicating up to 3
051201 210356 Completed file /user/epile/segments/20051201210323/
crawl_fetch/part-00000/index, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_-1711420249337549804, so
replicating up to 3
051201 210356 Completed file /user/epile/segments/20051201210323/
content/part-00000/.data.crc, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_-8676288182071306365, so
replicating up to 3
051201 210356 Completed file /user/epile/segments/20051201210323/
content/part-00000/data, at holder NDFSClient_647238187. There is/
are only 1 copies of block blk_5712126219901943888, so replicating
up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
content/part-00000/.index.crc, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_8727239729283794406, so
replicating up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
content/part-00000/index, at holder NDFSClient_647238187. There is/
are only 1 copies of block blk_7442946611411036186, so replicating
up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
parse_text/part-00000/.data.crc, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_7308454053611234058, so
replicating up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
parse_text/part-00000/data, at holder NDFSClient_647238187. There
is/are only 1 copies of block blk_9154249508503313268, so
replicating up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
parse_text/part-00000/.index.crc, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_5550390520109217677, so
replicating up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
parse_text/part-00000/index, at holder NDFSClient_647238187. There
is/are only 1 copies of block blk_-8335442137185412194, so
replicating up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
parse_data/part-00000/.data.crc, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_7793344192339293515, so
replicating up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
parse_data/part-00000/data, at holder NDFSClient_647238187. There
is/are only 1 copies of block blk_6340855549657308893, so
replicating up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
parse_data/part-00000/.index.crc, at holder NDFSClient_647238187.
There is/are only 1 copies of block blk_2705525466413868291, so
replicating up to 3
051201 210357 Completed file /user/epile/segments/20051201210323/
parse_data/part-00000/index, at holder NDFSClient_647238187. There
is/are only 1 copies of block blk_-3587285255992396675, so
replicating up to 3
051201 210358 Completed file /user/epile/segments/20051201210323/
crawl_parse/.part-00000.crc, at holder NDFSClient_647238187. There
is/are only 1 copies of block blk_-2663159440041981382, so
replicating up to 3
051201 210358 Completed file /user/epile/segments/20051201210323/
crawl_parse/part-00000, at holder NDFSClient_647238187. There is/
are only 1 copies of block blk_-4597337746504817385, so replicating
up to 3
051201 210358 Server connection on port 10000 from 192.168.15.51:
exiting
051201 210400 Server connection on port 10000 from 192.168.15.50:
exiting
051201 210401 Server connection on port 10000 from 192.168.15.50:
starting
051201 210401 Completed file /home/epile/mapred/system/
submit_ux0lpz/.job.xml.crc, at holder NDFSClient_-919051633. There
is/are only 1 copies of block blk_-2079585474380663469, so
replicating up to 3
051201 210402 Completed file /home/epile/mapred/system/
submit_ux0lpz/job.xml, at holder NDFSClient_-919051633. There is/
are only 1 copies of block blk_-44160486757954604, so replicating
up to 3
051201 210403 Server connection on port 10000 from 192.168.15.50:
exiting