Hi,

Did you check your urlfilter files? The default ones exclude dynamic
URLs, so you might want to comment out the following line in your
crawl-urlfilter.txt:

# skip URLs containing certain characters as probable queries, etc.
[EMAIL PROTECTED]
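
To illustrate why such a rule blocks the injected URL, here is a minimal Python sketch of how a regex URL filter of this kind evaluates its rules: each rule is a `+` (accept) or `-` (reject) sign followed by a regular expression, the first rule that matches decides, and a URL that matches nothing is rejected. It is only a model, not Nutch's code. The rule text `-[?*!@=]` is an assumption about what the obscured line above contains, and the `+` rule, `parse_rules` and `accepts` are hypothetical names and values made up for this example.

```python
import re

# Hypothetical rule set modelled on a typical crawl-urlfilter.txt.
# ASSUMPTION: "-[?*!@=]" stands in for the obscured rule in the message;
# the "+" rule is invented for this example.
DEFAULT_RULES = [
    "# skip URLs containing certain characters as probable queries, etc.",
    "-[?*!@=]",
    "+^http://([a-z0-9]*\\.)*butterflycluster.com/",
]


def parse_rules(lines):
    """Turn filter lines into (allow, compiled_regex) pairs, skipping comments."""
    rules = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and commented-out rules are ignored
        sign, pattern = line[0], line[1:]
        if sign not in ("+", "-"):
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(url, rules):
    """First matching rule wins; a URL that matches no rule is rejected."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False


url = ("http://www.butterflycluster.com/index.php"
       "?searchword=java&option=com_search&Itemid")

# With the query-character rule active, the URL is dropped before fetching.
blocked = accepts(url, parse_rules(DEFAULT_RULES))

# Commenting the rule out lets the later "+" rule accept the same URL.
relaxed = [("# " + r) if r.startswith("-[") else r for r in DEFAULT_RULES]
allowed = accepts(url, parse_rules(relaxed))

print(blocked, allowed)
```

In this model the first call rejects the URL because it contains `?` and `=`, while the second accepts it once the reject rule is commented out, which is the change suggested above.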

Regards,

-vishal.

-----Original Message-----
From: Fadzi Ushewokunze [mailto:[EMAIL PROTECTED] 
Sent: Sunday, September 03, 2006 3:17 PM
To: [email protected]
Subject: searching dynamic pages

Hi,

Is it possible to configure Nutch to crawl a URL like
http://www.butterflycluster.com/index.php?searchword=java&option=com_search&Itemid

I don't want to crawl the _whole_ website. I want my crawl to start from
the results returned by this query.

I have injected this URL but it doesn't seem to be fetched at all. If I
inject the URL http://www.butterflycluster.com it is crawled, but I don't
want this.

In essence I want to crawl the search results of this website. And I
have a lot more I want to crawl like this.

Any suggestions will be greatly appreciated.

Thanks
