[ https://issues.apache.org/jira/browse/NUTCH-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601852#comment-14601852 ]
Hudson commented on NUTCH-2041:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #3176 (See [https://builds.apache.org/job/Nutch-trunk/3176/])
NUTCH-2041 indexer fails if linkdb is missing (snagel: http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1687612)
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/src/java/org/apache/nutch/indexer/IndexerMapReduce.java


> indexer fails if linkdb is missing
> ----------------------------------
>
>                 Key: NUTCH-2041
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2041
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer, linkdb
>    Affects Versions: 1.10
>            Reporter: Sebastian Nagel
>            Assignee: Sebastian Nagel
>             Fix For: 1.11
>
>         Attachments: NUTCH-2014-v1.patch
>
>
> If the linkdb is missing, the indexer fails with
> {noformat}
> 2015-06-17 12:52:10,621 ERROR ...cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: .../linkdb/current
> {noformat}
> If both db.ignore.internal.links and db.ignore.external.links are true, there will be
> no LinkDb even if "invertlinks" is run (as a consequence of NUTCH-1913). The
> script "bin/crawl" does not know the values of these two properties and
> calls the indexer with "-linkdb .../linkdb", which then fails.
> Since "bin/crawl" is agnostic to properties defined in nutch-site.xml, we need a
> solution similar to NUTCH-1854: make the tool/job more tolerant and log a
> warning instead of raising an error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
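
The approach named in the issue description (tolerate a missing linkdb and log a warning instead of failing) might look roughly like the sketch below. This is not the committed patch to IndexerMapReduce.java: it uses plain java.nio.file in place of the Hadoop FileSystem API so that it runs on its own, and the class name, method name, and warning text are all hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical stand-in for the input setup of an indexing job.
class LinkDbInputSketch {
    private static final Logger LOG =
        Logger.getLogger(LinkDbInputSketch.class.getName());

    /**
     * Collects the input directories for indexing. The crawldb is always
     * added; the linkdb is optional and is skipped with a warning when
     * its "current" subdirectory does not exist.
     */
    public static List<Path> collectInputs(Path crawlDb, Path linkDb) {
        List<Path> inputs = new ArrayList<>();
        inputs.add(crawlDb.resolve("current"));
        if (linkDb != null) {
            Path current = linkDb.resolve("current");
            if (Files.isDirectory(current)) {
                inputs.add(current);
            } else {
                // Warn and continue instead of raising an error.
                LOG.warning("Ignoring linkdb, nothing found at: " + current);
            }
        }
        return inputs;
    }

    public static void main(String[] args) throws Exception {
        Path crawl = Files.createTempDirectory("crawl");
        Path crawlDb = crawl.resolve("crawldb");
        Files.createDirectories(crawlDb.resolve("current"));
        Path linkDb = crawl.resolve("linkdb"); // intentionally not created
        // Only the crawldb input is returned; a warning is logged.
        System.out.println(collectInputs(crawlDb, linkDb));
    }
}
```

With this shape, a caller such as "bin/crawl" can keep passing "-linkdb .../linkdb" unconditionally, and the decision to skip the linkdb is made inside the job where the filesystem can be inspected.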