[ https://issues.apache.org/jira/browse/NUTCH-1971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Nagel updated NUTCH-1971:
-----------------------------------
Fix Version/s: 1.17
> The crawldb.url.filters property is not present in any configuration file
> -------------------------------------------------------------------------
>
> Key: NUTCH-1971
> URL: https://issues.apache.org/jira/browse/NUTCH-1971
> Project: Nutch
> Issue Type: Improvement
> Components: crawldb
> Affects Versions: 1.9
> Reporter: Luis Lopez
> Priority: Major
> Labels: configuration, crawldb, nutch-default.xml
> Fix For: 1.17
>
>
> In CrawlDbFilter.java, a constant names the boolean property that determines
> whether URL filters are applied:
> public static final String URL_FILTERING = "crawldb.url.filters";
> However, nutch-default.xml does not define that property. Currently the only
> way to set this value is the -filter parameter on the command line.
> The same applies to:
> public static final String URL_NORMALIZING = "crawldb.url.normalizers";
> public static final String URL_NORMALIZING_SCOPE = "crawldb.url.normalizers.scope";
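For illustration, the missing entries might look roughly like the fragment below. This is only a sketch: the property names come from the constants quoted above, while the values and description texts are assumptions (including the guess that both booleans default to false and that the scope defaults to "crawldb"), not the wording of any committed patch.

```xml
<!-- Hypothetical nutch-default.xml entries; values and descriptions are assumptions. -->
<property>
  <name>crawldb.url.filters</name>
  <value>false</value>
  <description>Assumed meaning: if true, URL filters are applied to CrawlDb entries.</description>
</property>

<property>
  <name>crawldb.url.normalizers</name>
  <value>false</value>
  <description>Assumed meaning: if true, URL normalizers are applied to CrawlDb entries.</description>
</property>

<property>
  <name>crawldb.url.normalizers.scope</name>
  <value>crawldb</value>
  <description>Assumed meaning: the normalizer scope used when normalization is enabled.</description>
</property>
```

With entries like these documented, a deployment could override them in nutch-site.xml rather than relying only on command-line parameters.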