Well, it depends on what you mean by exclude. If you don't want those
plugins to be called by the running jobs, then you need to remove
the plugin from the plugin.includes configuration variable in
nutch-site.xml.
From this:
<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query...
to this:
<property>
<name>plugin.includes</name>
<value>protocol-http|parse-(text|html|js)|index-basic|query...
If you mean you want to change the file name, then change the
urlfilter.regex.file variable in nutch-site.xml:
<property>
<name>urlfilter.regex.file</name>
<value>regex-urlfilter.txt</value>
<description>Name of file on CLASSPATH containing regular expressions
used by urlfilter-regex (RegexURLFilter) plugin.</description>
</property>
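As a rough sketch of such an override, the entry in nutch-site.xml might
look like the following. The file name my-regex-urlfilter.txt is only an
assumed example and is not part of Nutch; whatever name you use, the file
has to be on the classpath (the conf directory is the usual place). The
regex URL normalizer plugin should have an analogous property, and
nutch-default.xml lists the exact name for your version.

```xml
<!-- Sketch only: my-regex-urlfilter.txt is a hypothetical file name.
     Place the file on the classpath so the urlfilter-regex plugin can load it. -->
<property>
  <name>urlfilter.regex.file</name>
  <value>my-regex-urlfilter.txt</value>
</property>
```

Because nutch-site.xml overrides nutch-default.xml, this changes where the
plugin looks without touching the source code.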
If you mean you want to remove it from the build, then change
plugin.includes and delete the files.
Dennis Kubes
Tobias Wolf wrote:
Hi there,
How can I exclude the files regex-urlfilter.txt and regex-urlnormalizer.txt? Is
there a way to pass a parameter or set a property that says where these
files are stored so that the plugin can find them? A solution without touching
the source code would be fine :)
Greetings
Tobias