[
https://issues.apache.org/jira/browse/NUTCH-2220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma resolved NUTCH-2220.
----------------------------------
Resolution: Fixed
Committed to trunk in revision 1731831. Thanks for your comments, Sebastian!
> Rename db.* options used only by the linkdb to linkdb.*
> -------------------------------------------------------
>
> Key: NUTCH-2220
> URL: https://issues.apache.org/jira/browse/NUTCH-2220
> Project: Nutch
> Issue Type: Task
> Components: linkdb
> Affects Versions: 1.11
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Fix For: 1.12
>
> Attachments: NUTCH-2220.patch
>
>
> We need an option db.ignore.internal.links that operates in FetcherThread,
> just like db.ignore.external.links. It already exists, but it is only used by
> the LinkDB and defaults to true, which is not a good default for
> FetcherThread.
> I propose to make a clear distinction between options that are used by the
> LinkDB and those that are not. Most options used by the LinkDB already have
> the right prefix, but db.ignore.*.links, db.max.inlinks and
> db.max.anchor.length do not yet.
> This patch renames those options to use the linkdb.* prefix, so that
> afterwards we can implement a db.ignore.internal.links that operates in
> FetcherThread, just like db.ignore.external.links.
> This will introduce a change in default parameters. Please comment.
> h2. How to upgrade from earlier releases
> * replace your old conf/nutch-default.xml with the conf/nutch-default.xml
> from the Nutch 1.12 release
> * if you use LinkDB (e.g. invertlinks) and have modified any of the
> parameters {{db.max.inlinks}}, {{db.max.anchor.length}} or
> {{db.ignore.internal.links}}, rename them to {{linkdb.max.inlinks}},
> {{linkdb.max.anchor.length}} and {{linkdb.ignore.internal.links}},
> respectively
> * {{db.ignore.internal.links}} and {{db.ignore.external.links}} now operate
> on the CrawlDB only
> * {{linkdb.ignore.internal.links}} and {{linkdb.ignore.external.links}} now
> operate on the LinkDB only
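
The renames in the upgrade notes above could be scripted roughly as follows. This is only a sketch and not part of the patch: the function name rename_linkdb_properties and the sample values are made up for illustration, and it assumes the local overrides live in a Hadoop-style <configuration> file such as conf/nutch-site.xml. It covers only the three parameters the notes list for renaming and leaves every other property untouched.

```python
import xml.etree.ElementTree as ET

# Old name -> new name, taken from the upgrade notes above.
RENAMES = {
    "db.max.inlinks": "linkdb.max.inlinks",
    "db.max.anchor.length": "linkdb.max.anchor.length",
    "db.ignore.internal.links": "linkdb.ignore.internal.links",
}


def rename_linkdb_properties(xml_text):
    """Rename the listed properties in a <configuration> document.

    Returns the rewritten XML text and the list of old names that were renamed.
    (Hypothetical helper, not something shipped with Nutch.)
    """
    root = ET.fromstring(xml_text)
    renamed = []
    for prop in root.iter("property"):
        name = prop.find("name")
        if name is None or name.text is None:
            continue
        key = name.text.strip()
        if key in RENAMES:
            name.text = RENAMES[key]
            renamed.append(key)
    return ET.tostring(root, encoding="unicode"), renamed


# Illustrative input only; the values are not Nutch defaults.
SAMPLE = """<configuration>
  <property>
    <name>db.max.inlinks</name>
    <value>5000</value>
  </property>
  <property>
    <name>db.ignore.external.links</name>
    <value>true</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    new_xml, renamed = rename_linkdb_properties(SAMPLE)
    print("renamed:", renamed)
    print(new_xml)
```

ElementTree discards XML comments and the file header when parsing, so the output is best reviewed and merged by hand rather than written straight over an existing site file. Whether a customised db.ignore.internal.links should be renamed or kept depends on whether the setting was meant for the LinkDB, as the notes above describe.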