Unless you are actually indexing, nothing will happen. You have written an indexing filter, so an indexing job has to run before the filter is ever called. The fact that the plugin is loaded does not mean that anything is being indexed.
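As an aside on the plugin.xml advice quoted further down, here is a rough sketch of how the extension element might look with those ids applied. It is only an illustration pieced together from the fragment you pasted: the name and point attributes are kept from your file, and I am assuming the class really is testplugin.SimpleFilter. Whether the rest of your plugin.xml needs changes is something I cannot see from here.

```xml
<!-- Sketch only: extension id follows the package.ClassName convention suggested in this thread -->
<extension id="testplugin.SimpleFilter"
           name="Some Simple Test Plugin"
           point="org.apache.nutch.indexer.IndexingFilter">
  <!-- implementation id is the bare class name; class is the fully qualified name -->
  <implementation id="SimpleFilter"
                  class="testplugin.SimpleFilter"/>
</extension>
```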
Lewis

On Sun, Aug 12, 2012 at 3:22 PM, Alaak <[email protected]> wrote:
> Thanks for your answer.
>
> I managed to make it run now. The problem was in the parse-html plugin. It
> was missing the dependencies to nekohtml and tagsoup. I added both as
> external jars to my environment.
>
> Currently I get the message that my plugin is loaded successfully in
> hadoop.log
>
> 2012-08-12 16:06:43,712 INFO plugin.PluginRepository - URL Meta
> Indexing Filter (simpletestplugin)
>
> However it is never called by the crawler. Neither my 'Test' message is
> printed nor does the execution stop if I set a break point within the filter
> method of my plugin class.
>
> I didn't see any error message. I also double checked the plugin.xml,
> build.xml, src/plugin/build.xml and nutch-site.xml and compared all of them
> to some existing plugin code. Everything seems to be correct, so I am
> basically quite clueless on how to proceed.
>
> Do you have any tips?
>
>
> On 12.08.2012 14:01, Lewis John Mcgibbney wrote:
>>
>> Please carefully read the xml configuration in the file you have pasted
>>
>>
>> On Sun, Aug 12, 2012 at 12:11 PM, Alaak <[email protected]> wrote:
>>
>>> <extension id="de.effingo.crawler" name="Some Simple Test Plugin"
>>> point="org.apache.nutch.indexer.IndexingFilter">
>>> <implementation id="page-filter"
>>> class="testplugin.SimpleFilter"/>
>>> </extension>
>>> </plugin>
>>
>> The extension id attribute should equal the package name followed by
>> your class name. Looking at your Java code this should be
>>
>> testplugin.SimpleFilter
>>
>> additionally the implementation id attribute should be SimpleFilter
>>
>> Do you have the build.xml correctly configured? Have you added the
>> plugin to the plugin.includes property in nutch-site.xml?
>
>

--
Lewis

