Re: [Nutch-cvs] svn commit: r280179 - in /lucene/nutch/trunk/src/plugin: clustering-carrot2/ creativecommons/ index-basic/ index-more/ languageidentifier/ ontology/ parse-ext/ parse-html/ parse-js/ parse-mp3/ parse-mspowerpoint/ parse-msword/ parse-pdf/ parse-rss/ par...
[EMAIL PROTECTED] wrote:
> Author: jerome
> Date: Sun Sep 11 13:34:12 2005
> New Revision: 280179
>
> URL: http://svn.apache.org/viewcvs?rev=280179&view=rev
> Log:
> Add a dependency to nutch-extensionpoints plugin

Looks like something broke after this commit. When I run a nutch crawl using the out-of-the-box configuration I get the following (with logging turned to ALL):

050913 125223 not including: creativecommons
050913 125223 not including: parse-pdf
050913 125223 not including: parse-ext
050913 125223 not including: ontology
050913 125223 not including: protocol-ftp
050913 125223 not including: protocol-http
050913 125223 not including: parse-zip
050913 125223 not including: nutch-extensionpoints
050913 125223 not including: index-more
050913 125223 not including: clustering-carrot2
050913 125223 not including: query-more
050913 125223 not including: language-identifier
050913 125223 not including: urlfilter-prefix
050913 125223 not including: parse-mspowerpoint
050913 125223 not including: parse-msword
050913 125223 not including: protocol-file
050913 125223 not including: lib-jakarta-poi
050913 125223 not including: parse-rss
050913 125223 Missing dependency nutch-extensionpoints for plugin query-url
050913 125223 Missing dependency nutch-extensionpoints for plugin query-site
050913 125223 Missing dependency nutch-extensionpoints for plugin protocol-httpclient
050913 125223 Missing dependency nutch-extensionpoints for plugin parse-html
050913 125223 Missing dependency nutch-extensionpoints for plugin index-basic
050913 125223 Missing dependency nutch-extensionpoints for plugin parse-text
050913 125223 Missing dependency nutch-extensionpoints for plugin parse-js
050913 125223 Missing dependency nutch-extensionpoints for plugin query-basic
050913 125223 Missing dependency nutch-extensionpoints for plugin urlfilter-regex
050913 125223 Plugin Auto-activation mode: [false]
050913 125223 Registered Plugins:
050913 125223   NONE
050913 125223 Registered Extension-Points:
050913 125223   NONE
Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.apache.nutch.db.WebDBInjector.addPage(WebDBInjector.java:437)
        at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:378)
        at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
        at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
Caused by: java.lang.RuntimeException: org.apache.nutch.net.URLFilter not found.
        at org.apache.nutch.net.URLFilters.<clinit>(URLFilters.java:44)
        ... 4 more

-- 
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
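
The log above shows nutch-extensionpoints itself being filtered out ("not including") while every remaining plugin now requires it, so nothing gets registered. The following is a minimal sketch of that interaction, assuming a simple include-pattern filter followed by a direct-dependency check. It is not Nutch's actual plugin loading code; the class name `PluginDependencyCheck`, the method `resolve`, and the sample include pattern are all hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT the Nutch plugin manager.
class PluginDependencyCheck {

    /**
     * Filters plugins by an include pattern, then checks direct dependencies.
     * The keys of 'requires' are plugin ids, the values are the ids each plugin imports.
     * Returns log lines shaped like the ones in the report above.
     */
    static List<String> resolve(Map<String, List<String>> requires, Pattern includes) {
        List<String> log = new ArrayList<>();
        Map<String, List<String>> included = new LinkedHashMap<>();

        // Step 1: drop every plugin whose id does not match the include pattern.
        for (Map.Entry<String, List<String>> e : requires.entrySet()) {
            if (includes.matcher(e.getKey()).matches()) {
                included.put(e.getKey(), e.getValue());
            } else {
                log.add("not including: " + e.getKey());
            }
        }

        // Step 2: keep a plugin only if all of its direct dependencies survived step 1.
        List<String> registered = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : included.entrySet()) {
            String missing = null;
            for (String dep : e.getValue()) {
                if (!included.containsKey(dep)) {
                    missing = dep;
                    break;
                }
            }
            if (missing != null) {
                log.add("Missing dependency " + missing + " for plugin " + e.getKey());
            } else {
                registered.add(e.getKey());
            }
        }

        log.add("Registered Plugins: "
                + (registered.isEmpty() ? "NONE" : String.join(", ", registered)));
        return log;
    }

    public static void main(String[] args) {
        Map<String, List<String>> requires = new LinkedHashMap<>();
        requires.put("nutch-extensionpoints", Collections.<String>emptyList());
        requires.put("parse-html", Arrays.asList("nutch-extensionpoints"));
        requires.put("urlfilter-regex", Arrays.asList("nutch-extensionpoints"));

        // Assumed include pattern that predates the new dependency:
        // it does not mention nutch-extensionpoints.
        Pattern oldIncludes = Pattern.compile("parse-(text|html)|urlfilter-regex");
        for (String line : resolve(requires, oldIncludes)) {
            System.out.println(line);
        }
    }
}
```

If the include pattern is extended to match nutch-extensionpoints as well, the same call registers all three sample plugins, which suggests the default configuration may need the new plugin added to its include list.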
svn commit: r280549 - /lucene/nutch/trunk/src/plugin/build.xml
Author: jerome
Date: Tue Sep 13 05:52:13 2005
New Revision: 280549

URL: http://svn.apache.org/viewcvs?rev=280549&view=rev
Log:
Sorted alphabetically for easy maintenance

Modified:
    lucene/nutch/trunk/src/plugin/build.xml

Modified: lucene/nutch/trunk/src/plugin/build.xml
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/plugin/build.xml?rev=280549&r1=280548&r2=280549&view=diff
==============================================================================
--- lucene/nutch/trunk/src/plugin/build.xml (original)
+++ lucene/nutch/trunk/src/plugin/build.xml Tue Sep 13 05:52:13 2005
@@ -6,89 +6,89 @@
   <!-- Build & deploy all the plugin jars.                    -->
   <!-- ====================================================== -->
   <target name="deploy">
+    <ant dir="clustering-carrot2" target="deploy"/>
+    <ant dir="creativecommons" target="deploy"/>
+    <ant dir="index-basic" target="deploy"/>
+    <ant dir="index-more" target="deploy"/>
+    <ant dir="languageidentifier" target="deploy"/>
     <ant dir="lib-jakarta-poi" target="deploy"/>
     <ant dir="nutch-extensionpoints" target="deploy"/>
+    <ant dir="ontology" target="deploy"/>
     <ant dir="protocol-file" target="deploy"/>
     <ant dir="protocol-ftp" target="deploy"/>
     <ant dir="protocol-http" target="deploy"/>
     <ant dir="protocol-httpclient" target="deploy"/>
+    <ant dir="parse-ext" target="deploy"/>
     <ant dir="parse-html" target="deploy"/>
     <ant dir="parse-js" target="deploy"/>
-    <ant dir="parse-text" target="deploy"/>
+    <!-- <ant dir="parse-mp3" target="deploy"/> -->
+    <ant dir="parse-mspowerpoint" target="deploy"/>
+    <ant dir="parse-msword" target="deploy"/>
     <ant dir="parse-pdf" target="deploy"/>
     <ant dir="parse-rss" target="deploy"/>
-    <ant dir="parse-msword" target="deploy"/>
-    <ant dir="parse-mspowerpoint" target="deploy"/>
-<!-- <ant dir="parse-mp3" target="deploy"/> -->
-<!-- <ant dir="parse-rtf" target="deploy"/> -->
-    <ant dir="parse-ext" target="deploy"/>
+    <!-- <ant dir="parse-rtf" target="deploy"/> -->
+    <ant dir="parse-text" target="deploy"/>
     <ant dir="parse-zip" target="deploy"/>
-    <ant dir="index-basic" target="deploy"/>
-    <ant dir="index-more" target="deploy"/>
     <ant dir="query-basic" target="deploy"/>
     <ant dir="query-more" target="deploy"/>
     <ant dir="query-site" target="deploy"/>
     <ant dir="query-url" target="deploy"/>
-    <ant dir="urlfilter-regex" target="deploy"/>
     <ant dir="urlfilter-prefix" target="deploy"/>
-    <ant dir="creativecommons" target="deploy"/>
-    <ant dir="languageidentifier" target="deploy"/>
-    <ant dir="clustering-carrot2" target="deploy"/>
-    <ant dir="ontology" target="deploy"/>
+    <ant dir="urlfilter-regex" target="deploy"/>
   </target>

   <!-- ====================================================== -->
   <!-- Test all of the plugins.                               -->
   <!-- ====================================================== -->
   <target name="test">
+    <ant dir="creativecommons" target="test"/>
+    <ant dir="languageidentifier" target="test"/>
+    <ant dir="ontology" target="test"/>
     <ant dir="protocol-http" target="test"/>
+    <ant dir="parse-ext" target="test"/>
     <ant dir="parse-html" target="test"/>
+    <!-- <ant dir="parse-mp3" target="test"/> -->
+    <ant dir="parse-mspowerpoint" target="test"/>
+    <ant dir="parse-msword" target="test"/>
     <ant dir="parse-pdf" target="test"/>
     <ant dir="parse-rss" target="test"/>
-    <ant dir="parse-msword" target="test"/>
-    <ant dir="parse-mspowerpoint" target="test"/>
-    <!-- <ant dir="parse-mp3" target="test"/> -->
     <!-- <ant dir="parse-rtf" target="test"/> -->
-    <ant dir="parse-ext" target="test"/>
     <ant dir="parse-zip" target="test"/>
-    <ant dir="creativecommons" target="test"/>
-    <ant dir="languageidentifier" target="test"/>
-    <ant dir="ontology" target="test"/>
   </target>

   <!-- ====================================================== -->
   <!-- Clean all of the plugins.                              -->
   <!-- ====================================================== -->
   <target name="clean">
+    <ant dir="clustering-carrot2" target="clean"/>
+    <ant dir="creativecommons" target="clean"/>
+    <ant dir="index-basic" target="clean"/>
+    <ant dir="index-more" target="clean"/>
+    <ant dir="languageidentifier" target="clean"/>
     <ant dir="lib-jakarta-poi" target="clean"/>
     <ant dir="nutch-extensionpoints" target="clean"/>
+    <ant dir="ontology" target="clean"/>
     <ant dir="protocol-file" target="clean"/>
     <ant dir="protocol-ftp" target="clean"/>
     <ant dir="protocol-http" target="clean"/>
     <ant dir="protocol-httpclient" target="clean"/>
+    <ant dir="parse-ext" target="clean"/>
     <ant dir="parse-html" target="clean"/>
     <ant dir="parse-js" target="clean"/>
-    <ant dir="parse-text" target="clean"/>
+    <ant dir="parse-mp3" target="clean"/>
+    <ant dir="parse-mspowerpoint" target="clean"/>
+    <ant dir="parse-msword" target="clean"/>
     <ant dir="parse-pdf" target="clean"/>
     <ant dir="parse-rss" target="clean"/>
-    <ant dir="parse-msword" target="clean"/>
-    <ant dir="parse-mspowerpoint" target="clean"/>
-    <ant dir="parse-mp3" target="clean"/>
     <ant dir="parse-rtf" target="clean"/>
-    <ant dir="parse-ext" target="clean"/>
+    <ant dir="parse-text" target="clean"/>
     <ant dir="parse-zip" target="clean"/>
-    <ant dir="index-basic" target="clean"/>
-
svn commit: r280551 - in /lucene/nutch/trunk/src/plugin: build.xml lib-lucene-analyzers/ lib-lucene-analyzers/build.xml lib-lucene-analyzers/lib/ lib-lucene-analyzers/lib/lucene-analyzers-1.9-rc1-dev.jar lib-lucene-analyzers/plugin.xml
Author: jerome
Date: Tue Sep 13 06:06:32 2005
New Revision: 280551

URL: http://svn.apache.org/viewcvs?rev=280551&view=rev
Log:
Add a lib plugin for lucene analyzers

Added:
    lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/
    lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/build.xml   (with props)
    lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/lib/
    lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/lib/lucene-analyzers-1.9-rc1-dev.jar   (with props)
    lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/plugin.xml   (with props)
Modified:
    lucene/nutch/trunk/src/plugin/build.xml

Modified: lucene/nutch/trunk/src/plugin/build.xml
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/plugin/build.xml?rev=280551&r1=280550&r2=280551&view=diff
==============================================================================
--- lucene/nutch/trunk/src/plugin/build.xml (original)
+++ lucene/nutch/trunk/src/plugin/build.xml Tue Sep 13 06:06:32 2005
@@ -12,6 +12,7 @@
     <ant dir="index-more" target="deploy"/>
     <ant dir="languageidentifier" target="deploy"/>
     <ant dir="lib-jakarta-poi" target="deploy"/>
+    <ant dir="lib-lucene-analyzers" target="deploy"/>
     <ant dir="nutch-extensionpoints" target="deploy"/>
     <ant dir="ontology" target="deploy"/>
     <ant dir="protocol-file" target="deploy"/>
@@ -66,6 +67,7 @@
     <ant dir="index-more" target="clean"/>
     <ant dir="languageidentifier" target="clean"/>
     <ant dir="lib-jakarta-poi" target="clean"/>
+    <ant dir="lib-lucene-analyzers" target="clean"/>
     <ant dir="nutch-extensionpoints" target="clean"/>
     <ant dir="ontology" target="clean"/>
     <ant dir="protocol-file" target="clean"/>

Added: lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/build.xml
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/build.xml?rev=280551&view=auto
==============================================================================
--- lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/build.xml (added)
+++ lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/build.xml Tue Sep 13 06:06:32 2005
@@ -0,0 +1,17 @@
+<?xml version="1.0"?>
+
+<project name="lib-lucene-analyzers" default="jar">
+
+  <import file="../build-plugin.xml"/>
+
+  <!--
+   ! Override the compile and jar targets,
+   ! since there is nothing to compile here.
+   ! -->
+  <target name="compile" depends="init">
+    <echo message="Compiling plugin: ${name}"/>
+  </target>
+
+  <target name="jar" depends="compile"/>
+
+</project>

Propchange: lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/build.xml
------------------------------------------------------------------------------
    svn:eol-style = native

Added: lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/lib/lucene-analyzers-1.9-rc1-dev.jar
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/lib/lucene-analyzers-1.9-rc1-dev.jar?rev=280551&view=auto
==============================================================================
Binary file - no diff available.

Propchange: lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/lib/lucene-analyzers-1.9-rc1-dev.jar
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/plugin.xml
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/plugin.xml?rev=280551&view=auto
==============================================================================
--- lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/plugin.xml (added)
+++ lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/plugin.xml Tue Sep 13 06:06:32 2005
@@ -0,0 +1,21 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ ! Lucene Analyzers
+ ! (http://lucene.apache.org/java/docs/lucene-sandbox/)
+ !
+ ! Dowload : http://www.apache.org/dyn/closer.cgi/jakarta/lucene/binaries/
+ ! License : http://www.apache.org/licenses/LICENSE-2.0.txt
+ !-->
+<plugin
+   id="lib-lucene-analyzers"
+   name="Lucene Analysers"
+   version="1.9-rc1-dev"
+   provider-name="org.apache.lucene">
+
+   <runtime>
+     <library name="lucene-analyzers-1.9-rc1-dev.jar">
+        <export name="*"/>
+     </library>
+   </runtime>
+
+</plugin>

Propchange: lucene/nutch/trunk/src/plugin/lib-lucene-analyzers/plugin.xml
------------------------------------------------------------------------------
    svn:eol-style = native
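
The plugin.xml added in this commit only declares an id and a runtime library to export; it carries no extension of its own. As a rough sketch of how such a descriptor can be read, the snippet below pulls the plugin id and library names out of the XML with the JDK's DOM parser. It is not how Nutch itself loads descriptors; the class `PluginDescriptorReader` and its methods are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of Nutch.
class PluginDescriptorReader {

    /** Parses descriptor text into a DOM document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse plugin descriptor", e);
        }
    }

    /** Returns the id attribute of the root plugin element. */
    static String pluginId(Document doc) {
        return doc.getDocumentElement().getAttribute("id");
    }

    /** Returns the name attribute of every library element, in document order. */
    static List<String> libraryNames(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList libs = doc.getElementsByTagName("library");
        for (int i = 0; i < libs.getLength(); i++) {
            names.add(((Element) libs.item(i)).getAttribute("name"));
        }
        return names;
    }

    public static void main(String[] args) {
        // Descriptor content taken from the commit above, comments omitted.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<plugin id=\"lib-lucene-analyzers\" name=\"Lucene Analysers\""
                + " version=\"1.9-rc1-dev\" provider-name=\"org.apache.lucene\">"
                + "<runtime><library name=\"lucene-analyzers-1.9-rc1-dev.jar\">"
                + "<export name=\"*\"/></library></runtime></plugin>";
        Document doc = parse(xml);
        System.out.println(pluginId(doc) + " " + libraryNames(doc));
    }
}
```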
svn commit: r280556 - in /lucene/nutch/trunk/src/plugin: ./ analysis-de/ analysis-de/src/ analysis-de/src/java/ analysis-de/src/java/org/ analysis-de/src/java/org/apache/ analysis-de/src/java/org/apache/nutch/ analysis-de/src/java/org/apache/nutch/anal...
Author: jerome
Date: Tue Sep 13 07:03:36 2005
New Revision: 280556

URL: http://svn.apache.org/viewcvs?rev=280556&view=rev
Log:
French and German analyzers added

Added:
    lucene/nutch/trunk/src/plugin/analysis-de/
    lucene/nutch/trunk/src/plugin/analysis-de/build.xml   (with props)
    lucene/nutch/trunk/src/plugin/analysis-de/plugin.xml   (with props)
    lucene/nutch/trunk/src/plugin/analysis-de/src/
    lucene/nutch/trunk/src/plugin/analysis-de/src/java/
    lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/
    lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/
    lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/nutch/
    lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/nutch/analysis/
    lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/nutch/analysis/de/
    lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/nutch/analysis/de/GermanAnalyzer.java   (with props)
    lucene/nutch/trunk/src/plugin/analysis-fr/
    lucene/nutch/trunk/src/plugin/analysis-fr/build.xml   (with props)
    lucene/nutch/trunk/src/plugin/analysis-fr/plugin.xml   (with props)
    lucene/nutch/trunk/src/plugin/analysis-fr/src/
    lucene/nutch/trunk/src/plugin/analysis-fr/src/java/
    lucene/nutch/trunk/src/plugin/analysis-fr/src/java/org/
    lucene/nutch/trunk/src/plugin/analysis-fr/src/java/org/apache/
    lucene/nutch/trunk/src/plugin/analysis-fr/src/java/org/apache/nutch/
    lucene/nutch/trunk/src/plugin/analysis-fr/src/java/org/apache/nutch/analysis/
    lucene/nutch/trunk/src/plugin/analysis-fr/src/java/org/apache/nutch/analysis/fr/
    lucene/nutch/trunk/src/plugin/analysis-fr/src/java/org/apache/nutch/analysis/fr/FrenchAnalyzer.java   (with props)
Modified:
    lucene/nutch/trunk/src/plugin/build.xml

Added: lucene/nutch/trunk/src/plugin/analysis-de/build.xml
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/plugin/analysis-de/build.xml?rev=280556&view=auto
==============================================================================
--- lucene/nutch/trunk/src/plugin/analysis-de/build.xml (added)
+++ lucene/nutch/trunk/src/plugin/analysis-de/build.xml Tue Sep 13 07:03:36 2005
@@ -0,0 +1,13 @@
+<?xml version="1.0"?>
+
+<project name="analysis-de" default="jar">
+
+  <import file="../build-plugin.xml"/>
+
+  <path id="plugin.deps">
+    <fileset dir="../lib-lucene-analyzers/lib">
+      <include name="*.jar" />
+    </fileset>
+  </path>
+
+</project>

Propchange: lucene/nutch/trunk/src/plugin/analysis-de/build.xml
------------------------------------------------------------------------------
    svn:eol-style = native

Added: lucene/nutch/trunk/src/plugin/analysis-de/plugin.xml
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/plugin/analysis-de/plugin.xml?rev=280556&view=auto
==============================================================================
--- lucene/nutch/trunk/src/plugin/analysis-de/plugin.xml (added)
+++ lucene/nutch/trunk/src/plugin/analysis-de/plugin.xml Tue Sep 13 07:03:36 2005
@@ -0,0 +1,29 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<plugin
+   id="analysis-de"
+   name="German Analysis Plug-in"
+   version="1.0.0"
+   provider-name="org.apache.nutch">
+
+   <runtime>
+      <library name="analysis-de.jar">
+         <export name="*"/>
+      </library>
+   </runtime>
+
+   <requires>
+      <import plugin="nutch-extensionpoints"/>
+      <import plugin="lib-lucene-analyzers"/>
+   </requires>
+
+   <extension id="org.apache.nutch.analysis.de"
+              name="GermanAnalyzer"
+              point="org.apache.nutch.analysis.NutchAnalyzer">
+
+      <implementation id="org.apache.nutch.analysis.de.GermanAnalyzer"
+                      class="org.apache.nutch.analysis.de.GermanAnalyzer"
+                      lang="de"/>
+
+   </extension>
+
+</plugin>

Propchange: lucene/nutch/trunk/src/plugin/analysis-de/plugin.xml
------------------------------------------------------------------------------
    svn:eol-style = native

Added: lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/nutch/analysis/de/GermanAnalyzer.java
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/nutch/analysis/de/GermanAnalyzer.java?rev=280556&view=auto
==============================================================================
--- lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/nutch/analysis/de/GermanAnalyzer.java (added)
+++ lucene/nutch/trunk/src/plugin/analysis-de/src/java/org/apache/nutch/analysis/de/GermanAnalyzer.java Tue Sep 13 07:03:36 2005
@@ -0,0 +1,48 @@
+/**
+ * Copyright 2005 The Apache Software Foundation
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY