Hi,

Did you check your urlfilter files? The default ones exclude dynamic URLs, so you might want to comment out the following line in your crawl-urlfilter.txt:
# skip URLs containing certain characters as probable queries, etc.
[EMAIL PROTECTED]

Regards,
-vishal.

-----Original Message-----
From: Fadzi Ushewokunze [mailto:[EMAIL PROTECTED]
Sent: Sunday, September 03, 2006 3:17 PM
To: [email protected]
Subject: searching dynamic pages

Hi,

Is it possible to configure Nutch to crawl a URL like
http://www.butterflycluster.com/index.php?searchword=java&option=com_search&Itemid

I don't want to crawl the _whole_ website. I want my crawl to start from the results returned by this query. I have injected this URL, but it doesn't seem to be fetched at all. If I inject the URL http://www.butterflycluster.com it is crawled, but I don't want this.

In essence, I want to crawl the search results of this website, and I have a lot more that I want to crawl like this.

Any suggestions will be greatly appreciated.

Thanks
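
To illustrate the advice about the urlfilter file, here is a rough Python sketch of how a regex URL filter of this kind could reject the injected search URL. It is not Nutch code. It assumes that rules are tried top to bottom, that the first pattern found anywhere in the URL decides, and that a URL matching no rule is rejected. It also assumes the rule under the "skip URLs" comment is `-[?*!@=]`, which is what stock crawl-urlfilter.txt files of that era usually contained; the actual line is redacted in this thread. The `+^http://...butterflycluster\.com/` rule, the shortened seed URL, and the helper names `parse_rules` and `accepts` are made up for the example.

```python
import re


def parse_rules(text):
    """Parse filter lines: '+regex' accepts, '-regex' rejects, '#' is a comment."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented-out rules are ignored
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with + or -: {line!r}")
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """The first rule whose pattern is found in the URL decides (assumed behaviour)."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # assumption: a URL that matches no rule is rejected


# Assumed default rule plus a hypothetical per-site accept rule.
DEFAULT_RULES = r"""
# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
+^http://([a-z0-9]*\.)*butterflycluster\.com/
"""

# The same file with the query-character rule commented out.
EDITED_RULES = DEFAULT_RULES.replace("-[?*!@=]", "# -[?*!@=]")

# Shortened form of the search URL from the question (illustrative only).
SEED_URL = "http://www.butterflycluster.com/index.php?searchword=java&option=com_search"

if __name__ == "__main__":
    print("default rules:", accepts(parse_rules(DEFAULT_RULES), SEED_URL))
    print("edited rules: ", accepts(parse_rules(EDITED_RULES), SEED_URL))
```

Under these assumptions, the seed URL is rejected by the default rules because it contains `?` and `=`, and it is accepted once that rule is commented out. Note that removing the rule lets through every URL with query characters that the remaining rules accept, not just the one search page.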
