Alexander, Christoph and all,

When I ran the crawl command, it gave this error:

050919 092356 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
050919 092356 parsing: /files/home/vmf/nutch-0.7/plugins/query-url/plugin.xml
050919 092356 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.url.URLQueryFilter
050919 092356 not including: /files/home/vmf/nutch-0.7/plugins/urlfilter-regex
050919 092356 not including: /files/home/vmf/nutch-0.7/plugins/urlfilter-prefix
Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.apache.nutch.db.WebDBInjector.addPage(WebDBInjector.java:437)
        at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:378)
        at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
        at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
Caused by: java.lang.RuntimeException: org.apache.nutch.net.URLFilter not found.
        at org.apache.nutch.net.URLFilters.<clinit>(URLFilters.java:44)
        ... 4 more

I fixed it by putting this in nutch-site.xml:

<property>
  <name>plugin.includes</name>
  <value>protocol-file|protocol-http|parse-(text|html|msword|pdf)|index-basic|query-(basic|site|url)|urlfilter-regex</value>
</property>

My urls.txt file contains:

file:/export/home/vmf

but it is indexing everything under home. How can I index another account, but one on the intranet? I tried putting the IP in crawl-urlfilter.txt, but without success. Can someone give me a suggestion, please?

Thanks,
Valmir

On 9/16/05, Valmir Macário <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> I'm using Solaris and trying to index my local system. I followed all the
> steps in the FAQ, but I still haven't had success. Is the FAQ missing a
> step, or is something in it wrong? I would appreciate it if someone could
> help me. My objective is to index a local system in an intranet. Thanks
>
