I'm seeing the same issue...

I'll share an answer if/when I find it...

Stephen

On 12/17/05, Alfred Ostermeier <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> I have just installed Nutch 0.7.1 and am running it on Windows XP under
> Cygwin. Crawling an http URL worked well, but crawling a file URL failed.
> I configured Nutch exactly as described in
>
> http://wiki.apache.org/nutch/FAQ#head-c721b23b43b15885f5ea7d8da62c1c40a37878e6
>
> That is, I activated the "protocol-file" plugin. Below is the content of
> the log file with the errors.
>
> The only Google hit for "IndexingFilter does not exist"
> ( http://www.mail-archive.com/[email protected]/msg00878.html )
> suspected the CLASSPATH, among other things. If the CLASSPATH does need to
> be set, which folders or jar files should it include? Currently mine is set
> to the current directory. Unfortunately, I couldn't find the jar file that
> contains the class IndexingFilter.
>
> Regards,
> Alfred
>
>
> ----------------------------------------------------------------------------
>
> run java in c:\j2sdk1.4.2_04\jre
> 051217 235048 parsing file:/C:/nutch-0.7.1/conf/nutch-default.xml
> 051217 235049 parsing file:/C:/nutch-0.7.1/conf/crawl-tool.xml
> 051217 235049 parsing file:/C:/nutch-0.7.1/conf/nutch-site.xml
> 051217 235049 No FS indicated, using default:local
> 051217 235049 crawl started in: crawl.test
> 051217 235049 rootUrlFile = urls
> 051217 235049 threads = 10
> 051217 235049 depth = 3
> 051217 235049 Created webdb at LocalFS,C:\nutch-0.7.1\crawl.test\db
> 051217 235049 Starting URL processing
> 051217 235049 Plugins: looking in: C:\nutch-0.7.1\plugins
> 051217 235049 not including: C:\nutch-0.7.1\plugins\clustering-carrot2
> 051217 235049 not including: C:\nutch-0.7.1\plugins\creativecommons
> 051217 235049 parsing: C:\nutch-0.7.1\plugins\index-basic\plugin.xml
> 051217 235049 impl: point=org.apache.nutch.indexer.IndexingFilter class=org.apache.nutch.indexer.basic.BasicIndexingFilter
> 051217 235049 not including: C:\nutch-0.7.1\plugins\index-more
> 051217 235049 not including: C:\nutch-0.7.1\plugins\language-identifier
> 051217 235049 not including: C:\nutch-0.7.1\plugins\nutch-extensionpoints
> 051217 235049 not including: C:\nutch-0.7.1\plugins\ontology
> 051217 235049 not including: C:\nutch-0.7.1\plugins\parse-ext
> 051217 235049 parsing: C:\nutch-0.7.1\plugins\parse-html\plugin.xml
> 051217 235049 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.html.HtmlParser
> 051217 235049 not including: C:\nutch-0.7.1\plugins\parse-js
> 051217 235049 not including: C:\nutch-0.7.1\plugins\parse-msword
> 051217 235049 not including: C:\nutch-0.7.1\plugins\parse-pdf
> 051217 235049 not including: C:\nutch-0.7.1\plugins\parse-rss
> 051217 235049 parsing: C:\nutch-0.7.1\plugins\parse-text\plugin.xml
> 051217 235049 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.text.TextParser
> 051217 235049 parsing: C:\nutch-0.7.1\plugins\protocol-file\plugin.xml
> 051217 235049 impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.file.File
> 051217 235049 not including: C:\nutch-0.7.1\plugins\protocol-ftp
> 051217 235049 parsing: C:\nutch-0.7.1\plugins\protocol-http\plugin.xml
> 051217 235049 impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.http.Http
> 051217 235049 not including: C:\nutch-0.7.1\plugins\protocol-httpclient
> 051217 235049 parsing: C:\nutch-0.7.1\plugins\query-basic\plugin.xml
> 051217 235049 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.basic.BasicQueryFilter
> 051217 235049 not including: C:\nutch-0.7.1\plugins\query-more
> 051217 235049 parsing: C:\nutch-0.7.1\plugins\query-site\plugin.xml
> 051217 235049 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
> 051217 235049 parsing: C:\nutch-0.7.1\plugins\query-url\plugin.xml
> 051217 235049 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.url.URLQueryFilter
> 051217 235049 not including: C:\nutch-0.7.1\plugins\urlfilter-prefix
> 051217 235049 not including: C:\nutch-0.7.1\plugins\urlfilter-regex
> 051217 235049 SEVERE org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.indexer.IndexingFilter does not exist.
> java.lang.ExceptionInInitializerError
>         at org.apache.nutch.db.WebDBInjector.addPage(WebDBInjector.java:437)
>         at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:378)
>         at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
>         at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
> Caused by: java.lang.RuntimeException: org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.indexer.IndexingFilter does not exist.
>         at org.apache.nutch.plugin.PluginRepository.getInstance(PluginRepository.java:147)
>         at org.apache.nutch.net.URLFilters.<clinit>(URLFilters.java:40)
>         ... 4 more
> Caused by: org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.indexer.IndexingFilter does not exist.
>         at org.apache.nutch.plugin.PluginRepository.installExtensions(PluginRepository.java:78)
>         at org.apache.nutch.plugin.PluginRepository.<init>(PluginRepository.java:61)
>         at org.apache.nutch.plugin.PluginRepository.getInstance(PluginRepository.java:144)
>         ... 5 more
> Exception in thread "main"
>
>
>
