Hi, I'm trying to use Nutch to crawl some web sites, but unfortunately those sites restrict crawling via their robots.txt. Can Crawl-Anywhere crawl a site regardless of its robots.txt? If so, please send me materials/links for learning about Crawl-Anywhere; otherwise, please suggest other crawlers that can crawl a site without honoring its robots.txt. It's urgent, so please reply as soon as possible.
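For context, fetching a single page in plain Java involves no robots.txt check at all; what's missing is a full crawler that does this at scale (link extraction, scheduling, politeness, and so on). A minimal sketch, assuming standard java.net and a placeholder URL and user-agent string:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class PlainFetch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL -- replace with the page you want to test.
        URL url = new URL("http://example.com/");

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        // Identify the client; nothing below consults robots.txt.
        conn.setRequestProperty("User-Agent", "my-test-crawler");

        System.out.println("HTTP status: " + conn.getResponseCode());

        // Print the response body line by line.
        BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
        conn.disconnect();
    }
}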
Thanks in advance.