On 02.11.2010 15:43, Erlend Garåsen wrote:
> Since it defaults to regex-urlfilter.txt, I removed "urlfilter-regex"
> from "plugin.includes", so now it is just:
> <value>protocol-httpclient|parse-(text|html|tika)|index-(basic|more)|query-(basic|site|url|lang)</value>

I think there is a misunderstanding.
You need urlfilter-regex in plugin.includes in nutch-site.xml.
Don't remove it if you want the filter to be applied.

"It defaults to regex-urlfilter.txt" means that if this plugin is
included, the patterns are read from that file.
But you have not included the plugin, so the patterns in that file are
never applied.
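
As a rough sketch, assuming the rest of your value stays exactly as you
posted it, the property in nutch-site.xml would look something like this
(the position of urlfilter-regex within the value is my choice, since the
value is just a regular expression matched against plugin ids):

```xml
<!-- nutch-site.xml: sketch only, based on the value quoted above -->
<property>
  <name>plugin.includes</name>
  <!-- urlfilter-regex added back so that regex-urlfilter.txt is read -->
  <value>protocol-httpclient|urlfilter-regex|parse-(text|html|tika)|index-(basic|more)|query-(basic|site|url|lang)</value>
</property>
```

With the plugin included again, the patterns should be picked up from
regex-urlfilter.txt without further configuration.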
