I've copied the whole nutch-default.xml into nutch-site.xml and also included the js and analysis plugins. The extension point plugin is included as well. Crawling still doesn't work:
060310 142318 Plugins: looking in: /home/../plugins
060310 142318 parsing: /home/../plugins/query-site/plugin.xml
060310 142318 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
060310 142318 not including: /home/../plugins/parse-pdf
060310 142318 parsing: /home/../plugins/nutch-extensionpoints/plugin.xml
060310 142318 not including: /home/../plugins/language-identifier
060310 142318 not including: /home/../plugins/query-more
060310 142318 not including: /home/../plugins/parse-js

The site XML includes:

..
<property>
  <name>plugin.includes</name>
  <value>nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|analysis-(de)</value>
  <description>Regular expression naming plugin directory names to
  include. Any plugin not matching this expression is excluded.
  In any case you need at least include the nutch-extensionpoints plugin. By
  default Nutch includes crawling just HTML and plain text via HTTP,
  and basic indexing and search plugins.
  </description>
</property>
..

> --- Original Message ---
> From: Stefan Groschupf <[EMAIL PROTECTED]>
> To: [email protected]
> Subject: Re: extension point: org.apache.nutch.parse.Parser does not exist.
> Date: Fri, 10 Mar 2006 13:39:00 +0100
>
> Hi,
> the extension point plugin needs to be included in the includes as well.
> Please note that nutch-site does not extend parameters but overwrites them,
> and it is not a good idea to have just the parser plugins installed;
> at the least you also need one protocol plugin, a query and an index filter.
>
> Stefan
>
> On 10.03.2006 at 12:51, Peter Swoboda wrote:
>
> > I tried to include two more plugins.
> > I changed nutch-site.xml to:
> >
> > <?xml version="1.0"?>
> > <?xml-stylesheet type="text/xsl" href="nutch-conf.xsl"?>
> >
> > <!-- Put site-specific property overrides in this file. -->
> >
> > <nutch-conf>
> >
> > <property>
> >   <name>plugin.includes</name>
> >   <value>parse-(js)|analysis-(de)</value>
> >   <description>Regular expression naming plugin directory names to
> >   include. Any plugin not matching this expression is excluded.
> >   In any case you need at least include the nutch-extensionpoints plugin. By
> >   default Nutch includes crawling just HTML and plain text via HTTP,
> >   and basic indexing and search plugins.
> >   </description>
> > </property>
> >
> > </nutch-conf>
> >
> > Starting the crawl gives the following error message:
> >
> > 060310 122551 SEVERE org.apache.nutch.plugin.PluginRuntimeException:
> > extension point: org.apache.nutch.parse.Parser does not exist.
> > Exception in thread "main" java.lang.ExceptionInInitializerError
> >     at org.apache.nutch.db.WebDBInjector.addPage(WebDBInjector.java:437)
> >     at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:378)
> >     at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
> >     at nutch.Test.main(Test.java:128)
> > Caused by: java.lang.RuntimeException: org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.parse.Parser does not exist.
> >     at org.apache.nutch.plugin.PluginRepository.getInstance(PluginRepository.java:147)
> >     at org.apache.nutch.net.URLFilters.<clinit>(URLFilters.java:40)
> >     ... 4 more
> > Caused by: org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.parse.Parser does not exist.
> >     at org.apache.nutch.plugin.PluginRepository.installExtensions(PluginRepository.java:78)
> >     at org.apache.nutch.plugin.PluginRepository.<init>(PluginRepository.java:61)
> >     at org.apache.nutch.plugin.PluginRepository.getInstance(PluginRepository.java:144)
> >     ... 5 more
> >
> > nutch-default.xml is still unchanged.
> > What to do?
> >
> > greetings
> > Peter
>
> ---------------------------------------------------------------
> company: http://www.media-style.com
> forum: http://www.text-mining.org
> blog: http://www.find23.net

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
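
One way to narrow this down is to test the plugin.includes expression against the plugin directory names from the log, outside of Nutch. The following is only a minimal sketch: the class name PluginIncludesCheck and the method isIncluded are made up for illustration, and it assumes that a directory name is included only when the whole name matches the expression, which may not be exactly how Nutch applies it.

```java
import java.util.regex.Pattern;

public class PluginIncludesCheck {
    // plugin.includes value copied from the nutch-site.xml shown in the message.
    static final String INCLUDES =
        "nutch-extensionpoints|protocol-http|urlfilter-regex|parse-(text|html|js)"
        + "|index-basic|query-(basic|site|url)|analysis-(de)";

    // Assumption: a plugin directory is included only if its whole name
    // matches the expression (full match, not a substring search).
    static boolean isIncluded(String includes, String pluginDir) {
        return Pattern.compile(includes).matcher(pluginDir).matches();
    }

    public static void main(String[] args) {
        // Directory names taken from the log output in the message.
        String[] dirs = {
            "query-site", "parse-pdf", "nutch-extensionpoints",
            "language-identifier", "query-more", "parse-js"
        };
        for (String dir : dirs) {
            String verdict = isIncluded(INCLUDES, dir) ? "including:     " : "not including: ";
            System.out.println(verdict + dir);
        }
    }
}
```

Under that assumption the expression does match parse-js, while the log reports "not including" for it. That may indicate that the running process is not picking up this value of plugin.includes, for example because a different nutch-site.xml is on the classpath.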
