Having the URL's IP in the crawl datum is a big issue from my point of view, since larger crawls are simply not possible because of the honeypot problems described.
I will collect some more information soon.
The solution of looking up IPs during segment generation is simply too slow as soon as you generate larger segments.

Stefan


On 26.05.2006 at 22:14, Doug Cutting wrote:

Andrzej Bialecki wrote:
0.8 is pretty stable now, I think we should start considering a release soon, within the next month's time frame.

+1

Are there substantial features still missing from 0.8 that were supported in 0.7?

Are there any showstopping bugs, things that worked in 0.7 that are broken in 0.8?

Doug




_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
