Great to hear! Happy New Year!

Cheers,
Chris

On Jan 1, 2012, at 1:15 PM, tahere ganjiyar wrote:

> thanks for your answer, i set maxOutlink to -1 ,now i can crawl every
> link in my sites. it work for me, thanks.
>
> On 12/20/11, Mattmann, Chris A (388J) <[email protected]> wrote:
>> Hi,
>>
>> Try changing the properties related to max outlinks in the
>> nutch-default.xml. That should help.
>>
>> Cheers,
>> Chris
>>
>> On Dec 19, 2011, at 2:49 PM, tahere ganjiyar wrote:
>>
>>> hi, i crawl one site that it has 100 link in depth 1, and 100 links in
>>> depth 2, but nutch only crawl 23 links from depth 1 and 30 from depth 2.
>>> how can i force nutch to crawl all links in depth 1 and 2. i use nutch 1.3
>>>
>>> topN=10000
>>> depth =2
>>> and in my nutch-site.xml:
>>> <property>
>>>   <name>http.content.limit</name>
>>>   <value>-1</value>
>>>   <description>
>>>   </description>
>>> </property>
>>> <property>
>>>   <name>http.agent.name</name>
>>>   <value>My Nutch Spider</value>
>>>   <description>
>>>   </description>
>>> </property>
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
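
For anyone finding this thread later: the "maxOutlink" setting referred to above is most likely the `db.max.outlinks.per.page` property. That is an assumption, since the thread never gives the exact property name. A minimal sketch of the override follows, placed in nutch-site.xml instead of editing nutch-default.xml directly, so that the shipped defaults stay untouched. The description text is illustrative and not copied from the Nutch distribution.

```xml
<!-- Sketch only: goes inside the <configuration> element of conf/nutch-site.xml. -->
<!-- Assumes the thread's "maxOutlink" means db.max.outlinks.per.page. -->
<property>
  <name>db.max.outlinks.per.page</name>
  <!-- A negative value removes the per-page cap on processed outlinks. -->
  <value>-1</value>
  <description>
    Process every outlink found on a page instead of only the first N.
  </description>
</property>
```

Pages fetched before the change were parsed under the old limit, so they probably need to be re-crawled before the additional links show up.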

