Which site? Which links are missing? Nutch's configuration is not the only thing that can keep some links from being crawled. The site itself can also be the cause, through the robots exclusion protocol (robots.txt) and robots meta tags such as nofollow and noindex. See http://www.robotstxt.org/meta.html
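To rule out the site side before touching any Nutch settings, something like the following rough Python sketch could be used. It only uses the standard library; the helper names (`allowed_by_robots_txt`, `meta_directives`) and the agent name "Nutch" are made up for illustration, so substitute the agent name your crawler actually sends. It takes the robots.txt text and the page HTML as strings, so you would fetch those yourself first.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list, e.g. "noindex, nofollow".
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def meta_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def allowed_by_robots_txt(robots_txt, agent, url):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical sample data; replace with the real robots.txt and page source.
    sample_robots = "User-agent: *\nDisallow: /private/\n"
    sample_page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

    print(allowed_by_robots_txt(sample_robots, "Nutch", "http://example.com/private/a.html"))
    print(meta_directives(sample_page))
```

If a missing link is disallowed by robots.txt, or the page that links to it carries nofollow, then the crawler is behaving as the site asks and no tuning parameter is involved.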
Without more information, I doubt that anyone, however willing, will be able to help you.

Pravin Karne wrote:
> Hi,
> I am using nutch to crawl particular site. But I found that Nutch is not
> crawling all links from every pages.
> Is there any tuning parameter for nutch to crawl all links?
>
> Thanks in advance
> -Pravin