It depends on how you are building and on your classpath. Let's call your plugin myhtmlfilter. If you are running on a single server and you added the plugin to the deploy section of src/plugin/build.xml, a myhtmlfilter folder containing the plugin should show up under the build/plugins folder after the build. You would then just have to copy that myhtmlfilter folder over to the plugins directory of your deployment.
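
One way to check a plugin folder before or after copying it is to confirm that it holds a plugin.xml and that one of its jars really contains the class named in the error. The following is only a rough sketch: the class PluginDirCheck, its method names, and the default path build/plugins/myhtmlfilter are made up for illustration and are not part of Nutch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Nutch: inspects one plugin folder.
public class PluginDirCheck {

    /** Converts a fully qualified class name to its entry path inside a jar. */
    static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if any *.jar directly under pluginDir contains the class. */
    static boolean containsClass(Path pluginDir, String className) throws IOException {
        String entry = toEntryName(className);
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(pluginDir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    if (jarFile.getEntry(entry) != null) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults; pass your own folder and class name as arguments.
        Path pluginDir = Paths.get(args.length > 0 ? args[0] : "build/plugins/myhtmlfilter");
        String className = args.length > 1
                ? args[1]
                : "org.apache.nutch.parse.htmlfilter.HtmlfilterIndexer";

        if (!Files.isDirectory(pluginDir)) {
            System.out.println("No such plugin folder: " + pluginDir);
            return;
        }
        System.out.println("plugin.xml present: "
                + Files.isRegularFile(pluginDir.resolve("plugin.xml")));
        System.out.println("class found in a jar: " + containsClass(pluginDir, className));
    }
}
```

Running it against both the build output folder and the deployed plugins folder should show whether the class is missing from the jar itself or only from the deployed copy.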

If you are running on a cluster, even in pseudo-distributed mode, you would need to copy over the nutch-*.job file. It has the plugins inside it and is what gets distributed out to the cluster. If you are referencing the plugin from a webapp or the nutch war file, you would need to copy it to WEB-INF/classes/plugins.
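
Since the job file is a zip archive, you can list which plugins actually made it into it. The sketch below assumes the plugins sit under a top-level plugins/ prefix inside the archive, which you should confirm against your own build; the class JobFilePlugins and its method are hypothetical names, not Nutch code.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Nutch: lists plugin folders inside a job archive.
public class JobFilePlugins {

    // Assumed layout: plugin folders live under this prefix in the archive.
    static final String PREFIX = "plugins/";

    /** Returns the sorted names of the folders found directly under plugins/. */
    static Set<String> pluginIds(String jobFile) throws IOException {
        Set<String> ids = new TreeSet<>();
        try (ZipFile zip = new ZipFile(jobFile)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(PREFIX)) {
                    String rest = name.substring(PREFIX.length());
                    int slash = rest.indexOf('/');
                    if (slash > 0) {
                        ids.add(rest.substring(0, slash));
                    }
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JobFilePlugins <path to job file>");
            return;
        }
        for (String id : pluginIds(args[0])) {
            System.out.println(id);
        }
    }
}
```

If your plugin's folder name is missing from the output, the job file was probably built before the plugin was added to the deploy section, and it needs to be rebuilt before being copied out again.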

Dennis

david.stu...@progressivealliance.co.uk wrote:
  Hi,

I am trying to write a plugin for nutch and am having real trouble getting it registered in the system. I have created it in src/plugin and added it both to the build.xml in plugin and to nutch-site.xml. It seems to build OK, but when I try to run a basic crawl urls -dir crawl -depth 3 -topN 2, I see the plugin registered in the hadoop.log:

2009-11-14 14:57:45,739 INFO plugin.PluginRepository - Html Filter Parse Plug-in (parse-htmlfilter)

But then I get the error message below. I have followed all of the tutorials, but they are mostly for nutch 0.9 and have errors in them, which I have worked through.

Thanks for your help

regards,
Dave

java.lang.RuntimeException: org.apache.nutch.plugin.PluginRuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.parse.htmlfilter.HtmlfilterIndexer
        at org.apache.nutch.indexer.IndexingFilters.<init>(IndexingFilters.java:100)
        at org.apache.nutch.indexer.IndexerMapReduce.configure(IndexerMapReduce.java:61)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
        at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:338)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
Caused by: org.apache.nutch.plugin.PluginRuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.parse.htmlfilter.HtmlfilterIndexer
        at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:166)
        at org.apache.nutch.indexer.IndexingFilters.<init>(IndexingFilters.java:70)
        ... 8 more
Caused by: java.lang.ClassNotFoundException: org.apache.nutch.parse.htmlfilter.HtmlfilterIndexer
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
        at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:156)
