There are some peculiarities in your log:

2011-08-23 14:47:34,833 DEBUG conf.Configuration - java.io.IOException: config()
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:211)
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:198)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:213)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:93)
        at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:373)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:800)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
        at org.apache.nutch.crawl.LinkDb.invert(LinkDb.java:190)
        at org.apache.nutch.crawl.LinkDb.run(LinkDb.java:292)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.LinkDb.main(LinkDb.java:255)

2011-08-23 14:47:34,922 INFO  mapred.JobClient - Running job: job_local_0002
2011-08-23 14:47:34,923 DEBUG conf.Configuration - java.io.IOException: config(config)
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:226)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:184)
        at org.apache.hadoop.mapreduce.JobContext.<init>(JobContext.java:52)
        at org.apache.hadoop.mapred.JobContext.<init>(JobContext.java:32)
        at org.apache.hadoop.mapred.JobContext.<init>(JobContext.java:38)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:111)


Can you check permissions, disk space, etc.?
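For reference, a quick sketch of those checks. The crawl/ paths match the session quoted below; adjust them if your layout differs.

```shell
# Hypothetical sanity checks; paths assume the crawl/ layout from the
# quoted session (crawl/linkdb, crawl/segments).
CRAWL_DIR=crawl

# Free space where the crawl data lives, plus /tmp, which local-mode
# Hadoop jobs use for intermediate data (hadoop.tmp.dir) by default.
df -h . /tmp

# Ownership and permissions on the linkdb and segments.
ls -ld "$CRAWL_DIR"/linkdb "$CRAWL_DIR"/segments 2>/dev/null || true

# Any zero-length part files hiding in the linkdb?
find "$CRAWL_DIR"/linkdb -name 'part-*' -size 0 2>/dev/null || true
```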



On Tuesday 23 August 2011 16:05:16 Marek Bachmann wrote:
> Hey Ho,
> 
> for some reason the invertlinks command produces an empty linkdb.
> 
> I did:
> 
> root@hrz-vm180:/home/nutchServer/relaunch_nutch/runtime/local/bin#
> ./nutch invertlinks crawl/linkdb crawl/segments/* -noNormalize -noFilter
> LinkDb: starting at 2011-08-23 14:47:21
> LinkDb: linkdb: crawl/linkdb
> LinkDb: URL normalize: false
> LinkDb: URL filter: false
> LinkDb: adding segment: crawl/segments/20110817164804
> LinkDb: adding segment: crawl/segments/20110817164912
> LinkDb: adding segment: crawl/segments/20110817165053
> LinkDb: adding segment: crawl/segments/20110817165524
> LinkDb: adding segment: crawl/segments/20110817170729
> LinkDb: adding segment: crawl/segments/20110817171757
> LinkDb: adding segment: crawl/segments/20110817172919
> LinkDb: adding segment: crawl/segments/20110819135218
> LinkDb: adding segment: crawl/segments/20110819165658
> LinkDb: adding segment: crawl/segments/20110819170807
> LinkDb: adding segment: crawl/segments/20110819171841
> LinkDb: adding segment: crawl/segments/20110819173350
> LinkDb: adding segment: crawl/segments/20110822135934
> LinkDb: adding segment: crawl/segments/20110822141229
> LinkDb: adding segment: crawl/segments/20110822143419
> LinkDb: adding segment: crawl/segments/20110822143824
> LinkDb: adding segment: crawl/segments/20110822144031
> LinkDb: adding segment: crawl/segments/20110822144232
> LinkDb: adding segment: crawl/segments/20110822144435
> LinkDb: adding segment: crawl/segments/20110822144617
> LinkDb: adding segment: crawl/segments/20110822144750
> LinkDb: adding segment: crawl/segments/20110822144927
> LinkDb: adding segment: crawl/segments/20110822145249
> LinkDb: adding segment: crawl/segments/20110822150757
> LinkDb: adding segment: crawl/segments/20110822152354
> LinkDb: adding segment: crawl/segments/20110822152503
> LinkDb: adding segment: crawl/segments/20110822153900
> LinkDb: adding segment: crawl/segments/20110822155321
> LinkDb: adding segment: crawl/segments/20110822155732
> LinkDb: merging with existing linkdb: crawl/linkdb
> LinkDb: finished at 2011-08-23 14:47:35, elapsed: 00:00:14
> 
> After that:
> 
> root@hrz-vm180:/home/nutchServer/relaunch_nutch/runtime/local/bin#
> ./nutch readlinkdb crawl/linkdb/ -dump linkdump
> LinkDb dump: starting at 2011-08-23 14:48:26
> LinkDb dump: db: crawl/linkdb/
> LinkDb dump: finished at 2011-08-23 14:48:27, elapsed: 00:00:01
> 
> And then:
> 
> root@hrz-vm180:/home/nutchServer/relaunch_nutch/runtime/local/bin# cd
> linkdump/
> root@hrz-vm180:/home/nutchServer/relaunch_nutch/runtime/local/bin/linkdump#
> ll
> total 0
> -rwxrwxrwx 1 root root 0 Aug 23 14:48 part-00000
> root@hrz-vm180:/home/nutchServer/relaunch_nutch/runtime/local/bin/linkdump#
> 
> As you can see, the dump size is 0 bytes.
> 
> Unfortunately I have no idea what went wrong.
> 
> I have attached the hadoop.log for the invertlinks process. Perhaps that
> helps somebody?

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
