nutch-extensionpoints is the plugin that defines all the standard Nutch extension points, i.e. all the other plugins depend on it. So you must either include it in the list of activated plugins (plugin.includes), or set the plugin.auto-activation property to true, so that when a plugin is activated, all its dependencies are loaded automatically.
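
A minimal sketch of the second option, assuming the property is overridden in conf/nutch-site.xml in the same <nutch-conf> format as the file quoted below, and that your Nutch version reads plugin.auto-activation as a plain true/false value. The plugin.includes value here is simply the original one from the quoted message, kept for illustration:

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="nutch-conf.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<nutch-conf>
  <property>
    <name>plugin.includes</name>
    <!-- Original value from the quoted message; nutch-extensionpoints is not listed here. -->
    <value>protocol-file|urlfilter-regex|parse-(xml|text|html|js|pdf)|index-basic|query-(basic|site|url)</value>
  </property>
  <property>
    <!-- Assumption: with this set to true, the dependencies of each activated plugin
         (including nutch-extensionpoints) are loaded without being listed above. -->
    <name>plugin.auto-activation</name>
    <value>true</value>
  </property>
</nutch-conf>
```

Listing nutch-extensionpoints explicitly in plugin.includes, as in the quoted message, remains the more direct fix and does not rely on auto-activation.
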
Jérôme

On 12/18/05, Stephen Fitch <[EMAIL PROTECTED]> wrote:
>
> I can get the 'crawl' to run without a 'SEVERE' error by altering my
> conf/nutch-site.xml to read:
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="nutch-conf.xsl"?>
> <!-- Put site-specific property overrides in this file. -->
> <nutch-conf>
>   <property>
>     <name>plugin.includes</name>
>     <value>nutch-extensionpoints|protocol-file|protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)</value>
>   </property>
>   <property>
>     <name>file.content.limit</name> <value>-1</value>
>   </property>
> </nutch-conf>
>
> The file originally had:
>
>     <value>protocol-file|urlfilter-regex|parse-(xml|text|html|js|pdf)|index-basic|query-(basic|site|url)</value>
>
> So the key appears to be the string 'nutch-extensionpoints|' prefixing the
> <value>
>
> Hmmmm... now to understand why this makes a difference and to see if I can
> get tomcat to use my brand new file-system crawl...
>
> Light at the end of the tunnel... :-) and I hope it's not a train...
>
> -Stephen
>
> ----
>
> On 12/18/05, Stephen Fitch <[EMAIL PROTECTED]> wrote:
> >
> > I tried J2SE v1.4.2_10 and 5.0 u6... same issue..
> >
> > I threw away my nutch directory and un-tarballed a new one... same
> > issue...
> >
> > I should add this is issue is on a Windows box with CYGWIN/bash and
> > the following env variables
> >
> > [EMAIL PROTECTED] /cygdrive/h/p/nutch/nutch-0.7.1
> > $ env | grep NUTCH
> > NUTCH_HOME=/cygdrive/h/p/nutch/nutch-0.7.1
> > NUTCH_CONF_DIR=/cygdrive/h/p/nutch/nutch-0.7.1/conf
> >
> > [EMAIL PROTECTED] /cygdrive/h/p/nutch/nutch-0.7.1
> > $ env | grep CLASSPATH
> > CLASSPATH=/cygdrive/h/p/nutch/nutch-0.7.1/lib
> >
> > [EMAIL PROTECTED] /cygdrive/h/p/nutch/nutch-0.7.1
> > $ env | grep JAVA
> > QTJAVA="D:\Program Files\Java\jre1.5.0\lib\ext\QTJava.zip"
> > JAVA_HOME=/cygdrive/e/Program Files/Java/jdk1.4.2_10
> >
> > I see...
> >
> > 051218 125130 parsing: H:\p\nutch\nutch-0.7.1\plugins\urlfilter-regex\plugin.xml
> > 051218 125130 impl: point=org.apache.nutch.net.URLFilter class=org.apache.nutch.net.RegexURLFilter
> > 051218 125130 SEVERE org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.indexer.IndexingFilter does not exist.
> > java.lang.ExceptionInInitializerError
> >     at org.apache.nutch.db.WebDBInjector.addPage(WebDBInjector.java:437)
> >     at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:378)
> >     at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
> >     at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
> > Caused by: java.lang.RuntimeException: org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.indexer.IndexingFilter does not exist.
> >     at org.apache.nutch.plugin.PluginRepository.getInstance(PluginRepository.java:147)
> >     at org.apache.nutch.net.URLFilters.<clinit>(URLFilters.java:40)
> >     ... 4 more
> > Caused by: org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.indexer.IndexingFilter does not exist.
> >     at org.apache.nutch.plugin.PluginRepository.installExtensions(PluginRepository.java:78)
> >     at org.apache.nutch.plugin.PluginRepository.<init>(PluginRepository.java:61)
> >     at org.apache.nutch.plugin.PluginRepository.getInstance(PluginRepository.java:144)
> >     ... 5 more
> >

--
http://motrech.free.fr/
http://www.frutch.org/
