Hey Jack,

Thanks. If I read your estimate right, 1.5 TB for 0.1 billion pages works
out to roughly 15 KB of stored data per page, which I can scale up from.

*One concern:* I am not sure where I can get 0.1 billion page URLs. I am
using the DMOZ Open Directory (which has around 3M URLs) to inject URLs
into the crawldb. Please help.
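For reference, this is roughly how I am pulling the seed list out of the
DMOZ dump with the stock Nutch tools (the paths and directory names here
are from my local setup, so treat them as examples):

  # Parse the DMOZ RDF dump into a flat list of URLs
  # (adding e.g. -subset 5000 would keep only about one URL in 5,000)
  $ bin/nutch org.apache.nutch.tools.DmozParser content.rdf.u8 > dmoz/urls

  # Inject the resulting URL list into the crawldb
  $ bin/nutch inject crawl/crawldb dmoz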
Regards,
Gaurang

2009/10/4 Jack Yu <jackyu...@gmail.com>

> 0.1 billion pages for 1.5TB
>
> On 10/5/09, Gaurang Patel <gaurangtpa...@gmail.com> wrote:
> > All-
> >
> > I am a novice at using Nutch. Can anyone tell me the estimated size
> > (I suppose in TB) required to store the crawled results for the whole
> > web? I want an estimate of the storage requirements for my project,
> > which uses the Nutch web crawler.
> >
> > Regards,
> > Gaurang Patel