I am using Nutch 1.1 for crawling. I am able to crawl many sites without any issue, but when I crawl www.magicbricks.com it stops at depth=1. I am using "bin/nutch crawl urls/magicbricks/url.txt -dir crawl/magicbricks -threads 10 -depth 3 -topN 10". However, if I put links like "http://www.magicbricks.com/bricks/cityIndex.html" or "http://www.magicbricks.com/bricks/propertySearch.html" in urls/magicbricks/url.txt, they crawl without any issue.
In robots.txt I have allowed my crawler, named Propertybot, full access to crawl; this can be verified at http://magicbricks.com/robots.txt. Please suggest what the reasons might be and why this is happening.

Thanks in advance,
Hemant Verma

--
View this message in context: http://lucene.472066.n3.nabble.com/Can-t-Crawl-Through-Home-Page-but-crawling-through-inner-page-tp2601843p2601843.html
Sent from the Nutch - User mailing list archive at Nabble.com.
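For reference, this is roughly how my seed list and crawl invocation are set up (paths follow the layout in the command above; the exact homepage URL in the seed file is my assumption of what belongs there):

```shell
# Create the seed directory and seed file (layout assumed from the crawl
# command quoted above; adjust paths to your own Nutch working directory).
mkdir -p urls/magicbricks
cat > urls/magicbricks/url.txt <<'EOF'
http://www.magicbricks.com/
EOF

# Sanity-check the seed file before injecting/crawling.
cat urls/magicbricks/url.txt

# Then run the crawl as in the question (requires a Nutch 1.1 install;
# commented out here since it needs the Nutch binaries on this machine):
# bin/nutch crawl urls/magicbricks/url.txt -dir crawl/magicbricks -threads 10 -depth 3 -topN 10
```

With the inner-page URLs substituted into url.txt, the same invocation crawls fine; only the homepage seed stops at depth=1.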

