Hi,

I had a similar problem, and it turned out I had made a mistake in crawl-urlfilter.txt. Looking at your output:

...
> 041118 122750 Starting URL processing
> 041118 122750 Using URL filter: net.nutch.net.RegexURLFilter
> 041118 122751 found resource crawl-urlfilter.txt at
> file:/root/install/nutch-nightly/conf/crawl-urlfilter.txt
> .041118 122751 Added 0 pages
...

None of the sites you tried to crawl made it through your filter ("Added 0 pages"), so I would check the patterns in your crawl-urlfilter.txt first.
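
In case it helps, here is a small Python sketch of how I understand the filter file to behave. It is not Nutch's code, only an imitation under my assumptions: each rule is a "+" (accept) or "-" (reject) followed by a regex, the first rule that matches decides, and a URL that matches no rule is dropped. The MY.DOMAIN.NAME placeholder and example.org are stand-ins for illustration, so compare against your own conf/crawl-urlfilter.txt.

```python
import re

def load_rules(text):
    """Parse filter lines: '+regex' accepts, '-regex' rejects, '#' is a comment."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"bad rule: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """The first rule whose pattern is found in the URL decides; no match rejects."""
    for allow, regex in rules:
        if regex.search(url):
            return allow
    return False

# Hypothetical filter file where the domain placeholder was never edited.
UNEDITED = r"""
# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
# skip everything else
-.
"""

# The same file with the placeholder replaced by the (assumed) site to crawl.
EDITED = UNEDITED.replace("MY.DOMAIN.NAME", r"example\.org")

url = "http://www.example.org/index.html"
print(accepts(load_rules(UNEDITED), url))  # False: only the catch-all "-." matches
print(accepts(load_rules(EDITED), url))    # True: the "+" rule matches first
```

If every seed URL comes back rejected like the first case, nothing is added to the database, which would match the "Added 0 pages" line in your log.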

Regards

        Michael






_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general