Hi Valmir, Adriano,

I too had some problems crawling the local filesystem. I wrote a small document about what I did to get things working for me.
http://www.folge2.de/tp/search/1/crawling-the-local-filesystem-with-nutch

bye
c

On Monday, 19.09.2005, at 21:19 +0300, Valmir Macário wrote:
> Alexander, Christoph and All
>
> When I ran the crawl command, it gave this error:
>
> 050919 092356 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
> 050919 092356 parsing: /files/home/vmf/nutch-0.7/plugins/query-url/plugin.xml
> 050919 092356 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.url.URLQueryFilter
> 050919 092356 not including: /files/home/vmf/nutch-0.7/plugins/urlfilter-regex
> 050919 092356 not including: /files/home/vmf/nutch-0.7/plugins/urlfilter-prefix
> Exception in thread "main" java.lang.ExceptionInInitializerError
>     at org.apache.nutch.db.WebDBInjector.addPage(WebDBInjector.java:437)
>     at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:378)
>     at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
>     at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
> Caused by: java.lang.RuntimeException: org.apache.nutch.net.URLFilter not found.
>     at org.apache.nutch.net.URLFilters.<clinit>(URLFilters.java:44)
>     ... 4 more
>
> I fixed it by putting this in nutch-site.xml:
>
> <property>
>   <name>plugin.includes</name>
>   <value>protocol-file|protocol-http|parse-(text|html|msword|pdf)|index-basic|query-(basic|site|url)|urlfilter-regex</value>
> </property>
>
> My urls.txt file is: file:/export/home/vmf
>
> but it indexes everything beyond the home directory.
>
> How can I index another account on the intranet?
>
> I tried putting the IP in crawl-urlfilter.txt, but without success.
>
> Can someone give me a suggestion, please?
>
> Thanks, Valmir
>
> On 9/16/05, Valmir Macário <[EMAIL PROTECTED]> wrote:
> >
> > Hi all,
> >
> > I'm using Solaris and trying to index my local system. I followed all
> > the steps in the FAQ, but I still have not succeeded.
> > Is the FAQ missing a step, or is something in it wrong? I would
> > appreciate it if someone could help me; my objective is to index the
> > local system in an intranet. Thanks

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
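
A side note on the plugin.includes fix quoted above: the log shows urlfilter-regex as "not including" right before the "URLFilter not found" error, and the fix adds urlfilter-regex to the property value. The sketch below is only an illustration, not Nutch code. It assumes that plugin.includes is a regular expression that has to match a plugin directory name in full, and the list of directory names is a hypothetical sample built from names that appear in the message.

```python
import re

# Value of plugin.includes as quoted in the message above.
PLUGIN_INCLUDES = (
    "protocol-file|protocol-http|parse-(text|html|msword|pdf)"
    "|index-basic|query-(basic|site|url)|urlfilter-regex"
)


def included_plugins(pattern, plugin_dirs):
    """Return the directory names that the pattern matches in full.

    Assumption: full-string matching stands in for however Nutch
    applies plugin.includes; this is not taken from the Nutch source.
    """
    compiled = re.compile(pattern)
    return [name for name in plugin_dirs if compiled.fullmatch(name)]


# Hypothetical sample; a real plugins directory contains more entries.
sample_dirs = [
    "protocol-file",
    "parse-pdf",
    "query-url",
    "urlfilter-regex",
    "urlfilter-prefix",
]

selected = included_plugins(PLUGIN_INCLUDES, sample_dirs)
print(selected)
```

Under that assumption, urlfilter-regex is selected and urlfilter-prefix is still left out, which would be consistent with one URL filter being enough to get past the error.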
