I am following the example tutorial at
http://wiki.apache.org/nutch/NewScoringIndexingExample. Nearly every
command executes well except the LinkRank command.
When I run `nutch org.apache.nutch.scoring.webgraph.LinkRank -webgraphdb
crawl/webgraphdb/`, it throws the following exception:
11/09/22 14:44:56 FATAL webgraph.LinkRank: LinkAnalysis: java.io.IOException: No links to process, is the webgraph empty?
    at org.apache.nutch.scoring.webgraph.LinkRank.runCounter(LinkRank.java:131)
    at org.apache.nutch.scoring.webgraph.LinkRank.analyze(LinkRank.java:610)
    at org.apache.nutch.scoring.webgraph.LinkRank.run(LinkRank.java:686)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.scoring.webgraph.LinkRank.main(LinkRank.java:656)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
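For reference, the webgraph itself was populated beforehand with the tutorial's WebGraph step, roughly as below (the segment directory name is just illustrative from my run, and the flags are as I understand them from the tutorial):

```shell
# Build the webgraph from a crawled segment; LinkRank is run afterwards.
# The segment name below is illustrative, not my exact path.
nutch org.apache.nutch.scoring.webgraph.WebGraph \
  -webgraphdb crawl/webgraphdb \
  -segment crawl/segments/20110922120000
```

This step completed without any error message, so I assumed the webgraph was populated.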
At first I had not added the following properties to
hadoop/conf/nutch-site.xml:
<!-- linkrank scoring properties -->
<property>
  <name>link.ignore.internal.host</name>
  <value>true</value>
  <description>Ignore outlinks to the same hostname.</description>
</property>
<property>
  <name>link.ignore.internal.domain</name>
  <value>true</value>
  <description>Ignore outlinks to the same domain.</description>
</property>
<property>
  <name>link.ignore.limit.page</name>
  <value>true</value>
  <description>Limit to only a single outlink to the same page.</description>
</property>
<property>
  <name>link.ignore.limit.domain</name>
  <value>true</value>
  <description>Limit to only a single outlink to the same domain.</description>
</property>
But even after adding those properties, the exception remains.
What could be causing this error?
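In case it helps with diagnosis: I believe the webgraph contents can be inspected with the NodeDumper tool from the same package, something like the following (the flags are from my reading of the class and may be inexact):

```shell
# Dump the top outlink counts from the webgraph to see whether it is
# actually empty (flags may be inexact for Nutch 1.3).
nutch org.apache.nutch.scoring.webgraph.NodeDumper \
  -webgraphdb crawl/webgraphdb \
  -outlinks -topn 10 \
  -output crawl/webgraphdb-dump
```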
Environment: Java "1.6.0_26", Debian with 2.6.39-2-686-pae kernel,
Nutch 1.3, Hadoop 0.20.2.
Thanks