I had the same problem, and my list of hosts was in the thousands, so regex was a little inefficient for that. I subclassed the regex_urlfilter, built a list of hosts keyed on domain names, and implemented the lookup with a hashmap. It runs pretty well, as lookups are cheap.
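To illustrate the idea, here is a minimal standalone sketch, not my actual plugin code. The class name HostListFilter and its constructor are hypothetical, and it does not extend any Nutch class; it only follows the usual filter convention of returning the URL when it is accepted and null when it is rejected. It uses a HashSet (which is backed by a hashmap) and assumes that a host should pass if it, or any of its parent domains, is on the list.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical standalone sketch; not part of Nutch.
public class HostListFilter {
    // Allowed domain names, stored lower-case for case-insensitive matching.
    private final Set<String> allowedDomains = new HashSet<>();

    public HostListFilter(Collection<String> domains) {
        for (String d : domains) {
            allowedDomains.add(d.trim().toLowerCase(Locale.ROOT));
        }
    }

    // Returns the URL unchanged if its host is allowed, otherwise null.
    public String filter(String url) {
        String host;
        try {
            host = new URI(url).getHost();
        } catch (URISyntaxException e) {
            return null; // unparseable URLs are rejected
        }
        if (host == null) {
            return null; // no host component (e.g. a relative URL)
        }
        host = host.toLowerCase(Locale.ROOT);
        // Try the full host, then each parent domain:
        // www.example.com, then example.com, then com.
        while (true) {
            if (allowedDomains.contains(host)) {
                return url;
            }
            int dot = host.indexOf('.');
            if (dot < 0) {
                return null;
            }
            host = host.substring(dot + 1);
        }
    }
}
```

Each URL costs a handful of hash lookups (one per label in the host name) no matter how many domains are on the list, whereas a regex list has to be tried pattern by pattern.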
P.
