Hi,
try:
> In my conf/crawl-urlfilter.txt I have tried:
> +^http://([a-z0-9]*\.)*nutch.org/
+^http://([a-z0-9]*\.)*nutch.org
> +^http://*.nutch.org/
This would never work.
A star does not mean "any character". It is a quantifier: it repeats the character in front of it zero or more times.
A dot matches any single character.
\. matches a literal dot.
Please google for "regex" or "perl regular expressions".
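
To make the star/dot difference concrete, here is a small sketch in Python. Nutch itself uses Java regular expressions, but for patterns this simple the two behave the same way as far as I know. The helper name `matches` is made up for the illustration, and the leading "+" of the filter line is left out on the assumption that it only marks an include rule and is not part of the regex.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the regex is found in the text."""
    return re.search(pattern, text) is not None

url = "http://www.nutch.org/"

# "/*" means "zero or more slashes"; it is not a wildcard for "www".
glob_style = matches(r"^http://*.nutch.org/", url)

# ".*" means "any character, repeated zero or more times".
dot_star = matches(r"^http://.*\.nutch\.org/", url)

# An unescaped "." matches any character, an escaped "\." only a dot.
unescaped_dot = matches(r"^nutch.org$", "nutchXorg")
escaped_dot = matches(r"^nutch\.org$", "nutchXorg")

print(glob_style, dot_star, unescaped_dot, escaped_dot)
# prints: False True True False
```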
If you ask Nutch to match against a pattern with a slash at the end, your URL must have the slash too.
> my urls file contains: http://www.nutch.org
Try: http://www.nutch.org/
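
The trailing-slash point can be checked the same way. Again this is only a sketch in Python rather than Nutch's own Java code: `accepted` is a hypothetical helper, and it assumes the filter searches for the pattern anchored at the start of the URL.

```python
import re

# The two patterns from the filter file, without the leading "+".
with_slash = r"^http://([a-z0-9]*\.)*nutch.org/"
without_slash = r"^http://([a-z0-9]*\.)*nutch.org"

def accepted(pattern: str, url: str) -> bool:
    """Hypothetical helper: True if the pattern is found in the URL."""
    return re.search(pattern, url) is not None

for url in ("http://www.nutch.org", "http://www.nutch.org/"):
    print(url, accepted(with_slash, url), accepted(without_slash, url))
# prints:
# http://www.nutch.org False True
# http://www.nutch.org/ True True
```

So either drop the slash from the pattern or add it to the URL in the urls file.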
Bye
Matthias
