Well, it depends on what you mean by exclude. If you don't want those plugins to be called by the running jobs, then you need to remove the plugin from the plugin.includes configuration variable in nutch-site.xml.

From this:

<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query...

to this:

<property>
  <name>plugin.includes</name>
  <value>protocol-http|parse-(text|html|js)|index-basic|query...

If you mean you want to change the file name, then change the urlfilter.regex.file variable in nutch-site.xml:

<property>
  <name>urlfilter.regex.file</name>
  <value>regex-urlfilter.txt</value>
  <description>Name of file on CLASSPATH containing regular expressions
  used by urlfilter-regex (RegexURLFilter) plugin.</description>
</property>
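
As a rough sketch of what such an override might look like in nutch-site.xml: the file names my-regex-urlfilter.txt and my-regex-normalize.xml below are made up for illustration, and I am assuming your Nutch version also defines the urlnormalizer.regex.file property for the urlnormalizer-regex plugin, so check your nutch-default.xml before relying on it. Both files would still need to be on the classpath (for example in the conf directory).

```xml
<!-- Sketch only: goes inside the <configuration> element of nutch-site.xml. -->
<!-- File names are hypothetical placeholders, not Nutch defaults. -->
<property>
  <name>urlfilter.regex.file</name>
  <value>my-regex-urlfilter.txt</value>
</property>

<!-- Assumed counterpart for the regex URL normalizer; verify the property
     name against the nutch-default.xml shipped with your version. -->
<property>
  <name>urlnormalizer.regex.file</name>
  <value>my-regex-normalize.xml</value>
</property>
```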

If you mean you want to remove them from the build, then change plugin.includes and delete the files.

Dennis Kubes


Tobias Wolf wrote:
Hi there,

How can I exclude the files regex-urlfilter.txt and regex-urlnormalizer.txt? Is
there a possibility to pass a parameter or set a property for where these
files are stored so that the plugin can find them? A solution without touching
the source code would be fine :)


Greetings

Tobias

