> I wonder if, crawling to that depth with that many links, you may have
> no choice but to set up a Hadoop cluster rather than trying to run it
> on a single machine.
Thanks Kevin, I wondered what Hadoop was for!

> On Wed, Sep 17, 2008 at 6:30 AM, Edward Quick <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I'm running an intranet crawl and have got to the 6th depth, which
> > apparently has 2.2 million links to fetch. I started off with 100 GB,
> > but that was barely enough for the fetch step, not to mention updatedb,
> > so I'm just trying to find a reliable method of determining how much
> > space is required to do the crawl.
> >
> > Any ideas?
> >
> > Ed.
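For a rough sense of scale, a back-of-envelope estimate along these lines
may help. The ~10 KB average page size and the 3x overhead multiplier
(parse data plus the second crawldb copy that updatedb keeps on disk while
it rebuilds the db) are assumptions for illustration, not figures from this
thread:

    # Back-of-envelope disk estimate for a Nutch fetch + updatedb cycle.
    # AVG_PAGE_KB and OVERHEAD are assumed values, not measurements.

    AVG_PAGE_KB = 10    # assumed average size of one fetched page
    OVERHEAD = 3.0      # assumed multiplier covering parse_data/parse_text
                        # and the temporary crawldb copy during updatedb

    def estimate_gb(num_links):
        """Rough disk estimate in GB for fetching num_links URLs."""
        raw_gb = num_links * AVG_PAGE_KB / (1024 * 1024)
        return raw_gb * OVERHEAD

    # The 2.2 million links mentioned in the thread:
    print(round(estimate_gb(2_200_000)))   # ~63 GB under these assumptions

A fixed constant will never be very reliable, though; calibrating
AVG_PAGE_KB against an earlier segment from the same crawl (e.g. du -sh on
a completed segment directory, divided by its URL count) should give a much
better per-site figure.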
