[
https://issues.apache.org/jira/browse/NUTCH-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche updated NUTCH-1054:
---------------------------------
Attachment: NUTCH-1054-1.4.patch
Patch which prevents an exception when the specified linkDB doesn't exist.
This lets us keep the current syntax:
{quote}
Usage: SolrIndexer <solr url> <crawldb> <linkdb> (<segment> ... | -dir <segments>) [-noCommit]
{quote}
This way, users can specify a non-existing path if they don't want to use the
linkdb for indexing.
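
To illustrate the idea (this is a minimal sketch, not the contents of the attached patch): the job setup only adds the linkdb as an input when the path actually exists. The sketch uses java.nio.file so that it runs standalone, whereas the indexer itself works against Hadoop paths; the class name, method name and example paths are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LinkDbInputSketch {

    // Hypothetical helper: collects the inputs for an indexing job and
    // skips the linkdb when the given path is null or does not exist,
    // instead of failing with an exception.
    static List<Path> collectInputs(Path crawlDb, Path linkDb, List<Path> segments) {
        List<Path> inputs = new ArrayList<>();
        inputs.add(crawlDb);
        if (linkDb != null && Files.exists(linkDb)) {
            inputs.add(linkDb);
        }
        inputs.addAll(segments);
        return inputs;
    }

    public static void main(String[] args) {
        // Example paths only; the linkdb may or may not exist on disk.
        Path crawlDb = Paths.get("crawl/crawldb");
        Path linkDb = Paths.get("crawl/linkdb");
        List<Path> segments = Arrays.asList(Paths.get("crawl/segments/seg1"));
        System.out.println(collectInputs(crawlDb, linkDb, segments));
    }
}
```

With this behaviour a mistyped linkdb path is silently ignored as well, which is one argument for the explicit option.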
A cleaner approach would be to change the syntax, e.g.:
{quote}
Usage: SolrIndexer <solr url> <crawldb> [-linkdb <linkdb>] (<segment> ... | -dir <segments>) [-noCommit]
{quote}
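
As a rough sketch of how the proposed syntax could be parsed (an assumption for discussion, not existing SolrIndexer code): the first two arguments are positional, and everything after them is either one of the flags or a segment path. The class, the Options holder and its field names are hypothetical, and the Solr URL in main is just an example.

```java
import java.util.ArrayList;
import java.util.List;

class SolrIndexerArgsSketch {

    static final String USAGE = "Usage: SolrIndexer <solr url> <crawldb> "
        + "[-linkdb <linkdb>] (<segment> ... | -dir <segments>) [-noCommit]";

    // Parsed form of the proposed command line (hypothetical holder).
    static final class Options {
        String solrUrl;
        String crawlDb;
        String linkDb;       // stays null when -linkdb is not given
        String segmentsDir;  // stays null unless -dir is given
        final List<String> segments = new ArrayList<>();
        boolean noCommit;
    }

    static Options parse(String[] args) {
        if (args.length < 3) {
            throw new IllegalArgumentException(USAGE);
        }
        Options o = new Options();
        o.solrUrl = args[0];
        o.crawlDb = args[1];
        // Flags are accepted in any order after the two positional arguments.
        for (int i = 2; i < args.length; i++) {
            String a = args[i];
            if ("-linkdb".equals(a) || "-dir".equals(a)) {
                if (++i >= args.length) {
                    throw new IllegalArgumentException(a + " requires a path\n" + USAGE);
                }
                if ("-linkdb".equals(a)) {
                    o.linkDb = args[i];
                } else {
                    // The sketch only records the directory; it does not list it.
                    o.segmentsDir = args[i];
                }
            } else if ("-noCommit".equals(a)) {
                o.noCommit = true;
            } else {
                o.segments.add(a);
            }
        }
        if (o.segmentsDir == null && o.segments.isEmpty()) {
            throw new IllegalArgumentException(USAGE);
        }
        return o;
    }

    public static void main(String[] args) {
        Options o = parse(new String[] {
            "http://localhost:8983/solr", "crawl/crawldb", "-dir", "crawl/segments"});
        System.out.println("linkdb given: " + (o.linkDb != null));
    }
}
```

Here a missing -linkdb simply leaves the field null, so the indexing code can tell "no linkdb wanted" apart from "linkdb path is wrong".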
Any thoughts on this?
> Make linkDB optional during indexing
> ------------------------------------
>
> Key: NUTCH-1054
> URL: https://issues.apache.org/jira/browse/NUTCH-1054
> Project: Nutch
> Issue Type: Task
> Components: indexer
> Affects Versions: 1.4
> Reporter: Julien Nioche
> Fix For: 1.4, 2.0
>
> Attachments: NUTCH-1054-1.4.patch
>
>
> Having a linkDB is currently mandatory for indexing; however, not all users
> are interested in using the anchors. The linkDB should be optional while
> indexing.