Murray,
you may find this article interesting, especially the part about
disks.
http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=143
Stefan
On 17.10.2005 at 22:09, Murray Hunter wrote:
Doug,
Our frontend server has 6 SCSI drive bays. Would you suggest 6 separate
volumes, one RAID 10 volume, or perhaps three RAID 0 arrays? We have
networked storage that we could use for backups, so we are not too
concerned about data loss.
Thanks,
Murray
-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Monday, October 17, 2005 12:38 PM
To: [email protected]
Subject: Re: Nutch Search Speed Concern
Murray Hunter wrote:
We tested search on a 20-million-page index on a dual-core 64-bit
machine with 8 GB of RAM, with the Nutch data stored on another
server and accessed over Linux NFS, and its performance was terrible.
It looks like the bottleneck was NFS, so I was wondering how you had
your storage set up. Are you using NDFS, or is it split up over
multiple servers?
For good search performance, indexes and segments should always reside on
local volumes, not in NDFS and not in NFS. Ideally these can be spread
across the available local volumes, to permit more parallel disk I/O. As a
rule of thumb, searching starts to get slow with more than around 20M pages
per node. Systems larger than that should benefit from distributed search.
Doug
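
To make Doug's two rules of thumb concrete, here is a rough Python sketch of the arithmetic. It is not Nutch code and does not use any Nutch API. The 20M figure comes from the message above; the function names, mount points, and segment names are hypothetical and only serve as illustration. Round-robin placement is one simple way to spread segments, assumed here for simplicity.

```python
# Rule of thumb from the message above: search gets slow beyond ~20M pages per node.
PAGES_PER_NODE = 20_000_000


def nodes_needed(total_pages, pages_per_node=PAGES_PER_NODE):
    """Smallest node count that keeps every node at or under the page limit."""
    if total_pages <= 0:
        return 0
    # Integer ceiling division avoids floating-point rounding.
    return (total_pages + pages_per_node - 1) // pages_per_node


def spread_over_volumes(segments, volumes):
    """Assign segments to local volumes round-robin (hypothetical helper)."""
    if not volumes:
        raise ValueError("need at least one local volume")
    layout = {volume: [] for volume in volumes}
    for i, segment in enumerate(segments):
        layout[volumes[i % len(volumes)]].append(segment)
    return layout


if __name__ == "__main__":
    # Hypothetical mount points and segment names, for illustration only.
    volumes = ["/disk1", "/disk2", "/disk3"]
    segments = ["seg-a", "seg-b", "seg-c", "seg-d", "seg-e"]
    print(nodes_needed(50_000_000))
    print(spread_over_volumes(segments, volumes))
```

With six drive bays exposed as separate volumes, the same helper would place every sixth segment on the same disk, so concurrent queries touching different segments can read from different spindles.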