Hey Jack,

*One concern:*

I am not sure where I can get 0.1 billion page URLs. I am currently using
the DMOZ Open Directory dump (which has around 3M URLs) to inject into the
crawldb.
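
For reference, here is roughly what I am doing to turn the DMOZ RDF dump
into a seed list and inject it (a sketch following the standard Nutch
tutorial steps; the dump filename and directory paths are from my setup,
so treat them as assumptions):

  # Parse the DMOZ RDF dump into a flat list of URLs
  # (content.rdf.u8 is the uncompressed DMOZ dump; path is an assumption)
  mkdir dmoz
  bin/nutch org.apache.nutch.tools.DmozParser content.rdf.u8 > dmoz/urls

  # Inject the URL list into the crawldb
  bin/nutch inject crawl/crawldb dmoz

Even parsing the full dump this way only yields roughly 3M URLs, nowhere
near 0.1 billion.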

Please help.

Regards,
Gaurang

2009/10/4 Jack Yu <jackyu...@gmail.com>

> 0.1 billion pages for 1.5TB
>
>
> On 10/5/09, Gaurang Patel <gaurangtpa...@gmail.com> wrote:
> > All-
> >
> > I am a novice at using Nutch. Can anyone tell me the estimated size
> > (I suppose in TBs) that will be required to store the crawled results
> > for the whole web? I want to get an estimate of the storage
> > requirements for my project, which uses the Nutch web crawler.
> >
> > Regards,
> > Gaurang Patel
> >
>
