Byron Miller wrote:
Actually, at mozdex we have consolidated a bit and we are rebuilding under
the latest release. For 50 million URLs a 200 GB disk is all you need.

If you don't run the DB analysis... ;-) Analysis can eat up a terabyte for breakfast.

That leaves you enough room for your segments, db, and the space needed to
process (about double your db size).
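
To make that rule of thumb concrete, here is a quick back-of-the-envelope
sketch; the sizes are purely my own guesses, not Byron's numbers, only the
"processing needs about double the db size" rule comes from his mail:

# Back-of-the-envelope disk budget for a Nutch installation.
# Illustrative numbers only.
db_gb = 40.0                # assumed webdb size for ~50 million URLs
segments_gb = 60.0          # assumed total size of fetched segments
processing_gb = 2 * db_gb   # temporary space while the db is updated

total_gb = db_gb + segments_gb + processing_gb
print("disk needed: about %.0f GB" % total_gb)   # ~180 GB, under 200 GB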

I'm curious: how do you address the segment life-cycle problem? I'm still missing a good tool in Nutch to handle this, i.e. to phase out ageing segments.
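
For what it's worth, the kind of helper I have in mind could be as simple as
the sketch below. This is an external script, not anything that exists in
Nutch today; it assumes segment directories are named with their creation
timestamp (yyyyMMddHHmmss) under a common segments/ directory and removes
those older than a cutoff. Any index built over those segments would of
course need to be refreshed afterwards.

import os
import shutil
import sys
import time

SEGMENT_NAME_FORMAT = "%Y%m%d%H%M%S"   # timestamp-named segment dirs

def prune_segments(segments_dir, max_age_days):
    """Delete segment directories older than max_age_days."""
    cutoff = time.time() - max_age_days * 86400
    for name in sorted(os.listdir(segments_dir)):
        path = os.path.join(segments_dir, name)
        if not os.path.isdir(path):
            continue
        try:
            created = time.mktime(time.strptime(name, SEGMENT_NAME_FORMAT))
        except ValueError:
            continue   # not a timestamp-named segment, leave it alone
        if created < cutoff:
            print("phasing out segment %s" % path)
            shutil.rmtree(path)

if __name__ == "__main__":
    # usage: prune_segments.py <segments_dir> <max_age_days>
    prune_segments(sys.argv[1], int(sys.argv[2]))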


The biggest boost you can give your query servers is tons of memory. SATA
150 or SCSI drives at 10k RPM are also a bonus.

We have finished migrating entirely to Athlon 64s, and I'll be posting our
build on the site and wiki.

That would be a big help!

--
Best regards,
Andrzej Bialecki
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
