Hi,

I need to crawl simtel.com, but the site uses an unusual URL format:

http://www.simtel.com/product.php%5Bid%5D78895%5BSiteID%5Dsimtel.net

which decodes to

http://www.simtel.com/product.php[id]78895[SiteID]simtel.net

Nutch doesn't like these links. Any suggestions would be appreciated.

G.
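For reference, here is a minimal sketch of how the escaped form maps to the bracketed form. It is plain Python, not part of Nutch, and the helper name decode_product_url is hypothetical. It only assumes that %5B and %5D are the percent-encodings of "[" and "]" and that each field has the shape [key]value.

```python
import re
from urllib.parse import unquote

# The escaped URL exactly as it appears on the site.
ESCAPED = "http://www.simtel.com/product.php%5Bid%5D78895%5BSiteID%5Dsimtel.net"


def decode_product_url(url):
    """Percent-decode the URL and split out its [key]value pairs.

    Hypothetical helper for illustration only.
    """
    decoded = unquote(url)
    # Each field looks like "[key]value"; a value runs up to the next "[" or the end.
    fields = dict(re.findall(r"\[([^\]]+)\]([^\[]*)", decoded))
    return decoded, fields


decoded, fields = decode_product_url(ESCAPED)
print(decoded)
print(fields)
```

This only illustrates the URL format; it does not say which Nutch filter or normalizer rejects the links.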
