I have a page that has an enormous number of links on it:

http://www.devdaily.com/unix/man/longlist.shtml

I would like Nutch to fetch and index all of the linked pages, but it stops after 80 or so. I have made sure that the http.content.limit setting exceeds the size of the page, and I surveyed the other settings but don't see one that seems applicable.
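For reference, this is roughly how the http.content.limit override looks in conf/nutch-site.xml (a sketch using the standard Hadoop-style property format; the value here is just an example comfortably larger than the page):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Maximum bytes downloaded per page. The stock default is fairly
       small, so a long link-list page gets truncated before all of its
       outlinks are parsed. The value below is an example; -1 should
       disable the limit entirely. -->
  <property>
    <name>http.content.limit</name>
    <value>1048576</value>
  </property>
</configuration>
```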

This appears to be the last hurdle before I can replace a proprietary search engine with Nutch.

Thanks in advance,
Steven


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general