Having the URL's IP in the crawl datum is a big issue from my point of
view, since larger crawls are just not possible because of the
honey pot problems described.
I will collect some more information soon.
The solution of looking up IPs during segment generation is just too slow
as soon as you generate larger segments.
Stefan
On 26.05.2006 at 22:14, Doug Cutting wrote:
Andrzej Bialecki wrote:
0.8 is pretty stable now, I think we should start considering a
release soon, within the next month's time frame.
+1
Are there substantial features still missing from 0.8 that were
supported in 0.7?
Are there any showstopping bugs, things that worked in 0.7 that are
broken in 0.8?
Doug
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general