I have a page that has an enormous number of links on it:

http://www.devdaily.com/unix/man/longlist.shtml

I would like Nutch to fetch and index all of the linked pages, but it stops after 80 or so. I have made sure that the http.content.limit setting exceeds the size of the page, and I surveyed the other settings but don't see one that seems applicable.
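For reference, this is roughly how the http.content.limit override looks in conf/nutch-site.xml (a sketch using the standard Hadoop-style property format; the value here is just an example comfortably larger than the page):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Maximum bytes downloaded per page. The stock default is fairly
       small, so a long link-list page gets truncated before all of its
       outlinks are parsed. The value below is an example; -1 should
       disable the limit entirely. -->
  <property>
    <name>http.content.limit</name>
    <value>1048576</value>
  </property>
</configuration>
```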

This appears to be the last hurdle before I can replace a proprietary search engine with Nutch.

Thanks in advance,
Steven


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general