Hi,
I have checked and rechecked, and nutch-site.xml does have the regex-urlfilter.txt part. However, when I start a fetch,
regex-urlfilter.txt does not seem to be loaded. When the fetch starts I see:
040519 193606 loading file:/d1/nutch/conf/nutch-default.xml
040519 193606 loading file:/d1/nutch/conf/nutch-site.xml
......
Informational Robots.txt entries we'll obey (in order):
...
040519 193606 found resource banned-hosts.txt at file:/d1/nutch/conf/banned-hosts.txt
040519 193606 found resource mime.types at file:/d1/nutch/conf/mime.types
But there is no mention of regex-urlfilter.txt.
Is this file loaded or not? I am using the latest CVS. Following the Fetcher log, I see that banned URLs (those ending with ?, for example) are being crawled, so I suspect that the URL filter is not working.
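To rule out a mistake in the patterns themselves, here is a small stand-alone sketch in Python. It is not Nutch code and does not read my conf directory. It assumes the rule format as I understand it: each line is a + or - followed by a regular expression, lines starting with # are comments, the first matching rule decides, and a URL that matches no rule is rejected. The names load_rules, accepts and SAMPLE_RULES, and the sample rules and URLs, are made up for illustration.

```python
import re

def load_rules(text):
    """Parse '+regex' / '-regex' lines; skip blank lines and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        # Store (allow?, compiled pattern) in file order.
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """First matching rule decides; a URL matching no rule is rejected (assumption)."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False

# Hypothetical rules for illustration, not copied from my regex-urlfilter.txt.
SAMPLE_RULES = r"""
# skip URLs that end with '?'
-\?$
# accept everything else
+.
"""

if __name__ == "__main__":
    rules = load_rules(SAMPLE_RULES)
    print(accepts(rules, "http://example.com/page?"))  # False
    print(accepts(rules, "http://example.com/page"))   # True
```

If the patterns behave as expected in a check like this, the problem is more likely that the file is never read than that the rules are wrong.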
Thanks
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
