With MapReduce, the only limits will be hardware limits.
Crawling ~500 million pages with Nutch 0.7 is a pain, since the db update
may take more than a week.
Stefan
On 25.10.2005 at 02:29, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wrote:
Hi,
Does anybody know what the maximum number of pages that have ever been
fetched and indexed with Nutch is? I know Yahoo Research did fetch 100M
pages about 3 years ago, but they stopped after that. Is there any real
large-scale (like Google and Yahoo) WebDB out there that has been fetched
by Nutch?
Thanks, Nima