Hi Valmir, Adriano

I too had some problems with crawling the local filesystem.
I wrote a small document about what I've done in order to get 
things working for me.

http://www.folge2.de/tp/search/1/crawling-the-local-filesystem-with-nutch

bye
c

On Monday, 19.09.2005 at 21:19 +0300, Valmir Macário wrote:
> Alexander, Christoph and All 
> 
> When I ran the crawl command, it gave this error:
> 
> 050919 092356 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
> 050919 092356 parsing: /files/home/vmf/nutch-0.7/plugins/query-url/plugin.xml
> 050919 092356 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.url.URLQueryFilter
> 050919 092356 not including: /files/home/vmf/nutch-0.7/plugins/urlfilter-regex
> 050919 092356 not including: /files/home/vmf/nutch-0.7/plugins/urlfilter-prefix
> Exception in thread "main" java.lang.ExceptionInInitializerError
>     at org.apache.nutch.db.WebDBInjector.addPage(WebDBInjector.java:437)
>     at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:378)
>     at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
>     at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
> Caused by: java.lang.RuntimeException: org.apache.nutch.net.URLFilter not found.
>     at org.apache.nutch.net.URLFilters.<clinit>(URLFilters.java:44)
>     ... 4 more
> 
> 
> I fixed it by putting this in nutch-site.xml:
> 
> <property>
> <name>plugin.includes</name>
> <value>protocol-file|protocol-http|parse-(text|html|msword|pdf)|index-basic|query-(basic|site|url)|urlfilter-regex</value>
> </property>
> 
> 
> My urls.txt file contains: file:/export/home/vmf
> 
> But it is indexing everything past the home directory.
> 
> How can I index another account on the intranet?
> 
> I tried putting the IP in crawl-urlfilter.txt, but without success.
> 
> Can someone give me a suggestion, please?
> 
> Thanks, Valmir
> 
> 
> On 9/16/05, Valmir Macário <[EMAIL PROTECTED]> wrote:
> > 
> > Hi all,
> > 
> > I'm using Solaris and trying to index my local system. I followed all the
> > steps in the FAQ, but I still have had no success. Is the FAQ missing a
> > step, or is something in it wrong? I would appreciate it if someone could
> > help me; my objective is to index the local system on an intranet. Thanks
> >
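
On the crawl wandering outside the start directory: below is a minimal
sketch in Python (not Nutch code) of how "+regex" / "-regex" rules in the
style of crawl-urlfilter.txt could confine a crawl to one subtree. It
assumes the rules are read top to bottom, that the first matching pattern
decides, and that a URL matching no rule is ignored. The rule set itself is
hypothetical; only the path file:/export/home/vmf comes from the urls.txt
line quoted above. The helper names parse_rules and accepts are made up for
this illustration.

```python
import re

# Hypothetical rules in the "+regex" / "-regex" style of crawl-urlfilter.txt.
# Assumption: first matching rule wins, and no match means the URL is ignored.
RULES = """
# skip schemes we do not want to fetch
-^(ftp|mailto):
# accept the start directory itself and everything below it
+^file:/export/home/vmf(/|$)
# reject everything else, including parent directories
-.
"""


def parse_rules(text):
    """Turn rule lines into (include, compiled_pattern) pairs, skipping comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return True if the first rule whose pattern is found in the URL is a '+' rule."""
    for include, pattern in rules:
        if pattern.search(url):
            return include
    return False  # no rule matched: treat the URL as ignored


if __name__ == "__main__":
    rules = parse_rules(RULES)
    for url in ("file:/export/home/vmf",
                "file:/export/home/vmf/docs/a.txt",
                "file:/export/home/",
                "file:/export/home/vmf2/x"):
        print(url, accepts(rules, url))
```

The "(/|$)" at the end of the accept pattern keeps a sibling such as
/export/home/vmf2 from slipping through on a plain prefix match. Whether
the same patterns behave identically in your Nutch version is something to
check against the comments in your own crawl-urlfilter.txt.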


