Doug,
Actually, splitting the segments across the drives as individual drives
is much faster. You're waiting on random data access, so your limitation
is the hard drives' random access time. 15k SCSI drives can have about
4-5 million pages on them before they are slower than 1 sec per search. Test
out my current search engine setup: I'm running 6 hard drives, five 10k 9.1 GB
SCSI and one 15k 18 GB SCSI. The server is a quad Xeon 550 MHz with 1 GB of
RAM, running Windows 2000. No RAID controller, just Ultra Wide SCSI 2
(80 MB/s). Each drive just has a segments folder containing separate
segments without a combined index. The machine has 3 million pages indexed
for searching. It's running distributed search with 6 separate servers,
and Tomcat is running on the same machine. It is using all of its
1 GB of RAM.
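
To illustrate the layout above (one searcher per drive, no combined index), here is a rough sketch of fanning a query out to each segment in parallel and merging the top hits. This is not Nutch's actual code: `search_segment`, `distributed_search`, and the term-count scoring are hypothetical stand-ins, and each "segment" is just an in-memory dict.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def search_segment(segment, query, k):
    """Hypothetical per-drive searcher: return the top k (score, doc_id) hits.

    `segment` is a dict of doc_id -> text standing in for one drive's
    segments folder. The score is a plain term count, not real ranking.
    """
    q = query.lower()
    hits = ((text.lower().count(q), doc_id) for doc_id, text in segment.items())
    return heapq.nlargest(k, (hit for hit in hits if hit[0] > 0))


def distributed_search(segments, query, k=10):
    """Query every segment at the same time, then merge the partial results."""
    workers = max(1, len(segments))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One task per segment, so the slow random reads on each drive overlap.
        partials = pool.map(lambda seg: search_segment(seg, query, k), segments)
        return heapq.nlargest(k, (hit for part in partials for hit in part))
```

The point of the design is that each drive only seeks within its own segments, so the per-query wait is roughly that of the slowest single drive rather than the sum of all of them.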
-J
PS: http://search.fromped.com
PPS: I have 8x 320 GB SATA drives hooked into my big server running RAID 0.
It can do fewer searches per second on RAID 0 than the old Xeon with the 10k
and 15k drives. Even though the SCSI drives only do 55 MB/s and the 8x 320 GB
array does over 400 MB/s, it's still slower for random access! Can
anyone say non-volatile storage? If only they made 300 GB flash drives for
under $100K!
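
A back-of-the-envelope sketch of why the transfer rate barely matters here. The 55 MB/s and 400 MB/s figures are the ones above; the seek times (3.8 ms for the 15k SCSI drive, 8.5 ms for the SATA drive), the 7200 rpm SATA spindle speed, and the 4 KB read size are assumptions picked for illustration, not measurements.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution."""
    return 0.5 * 60_000 / rpm


def transfer_ms(read_kb, mb_per_s):
    """Time to move `read_kb` kilobytes at a sustained `mb_per_s` rate."""
    return read_kb / 1024 / mb_per_s * 1000


def random_read_ms(seek_ms, rpm, read_kb, mb_per_s):
    """One small random read: seek + rotational latency + transfer."""
    return seek_ms + rotational_latency_ms(rpm) + transfer_ms(read_kb, mb_per_s)


def reads_per_second(seek_ms, rpm, read_kb, mb_per_s):
    """Random reads one spindle can serve per second, one at a time."""
    return 1000 / random_read_ms(seek_ms, rpm, read_kb, mb_per_s)


# Assumed seek times and read size; throughput figures come from the text.
scsi_15k = random_read_ms(seek_ms=3.8, rpm=15_000, read_kb=4, mb_per_s=55)
sata_raid0 = random_read_ms(seek_ms=8.5, rpm=7_200, read_kb=4, mb_per_s=400)

print(f"15k SCSI:    {scsi_15k:.2f} ms per random read")
print(f"SATA RAID 0: {sata_raid0:.2f} ms per random read")
```

Under these assumptions the transfer part of a 4 KB read is well under a tenth of a millisecond on either setup, so seek and rotational delay decide the result and the faster-spinning drive wins despite the lower MB/s.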
----- Original Message -----
From: "Doug Cutting" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, October 17, 2005 4:50 PM
Subject: Re: Nutch Search Speed Concern
> Murray Hunter wrote:
> > Our frontend server has 6 SCSI drive bays, would you suggest 6 separate
> > volumes or one raid 10 volume, or perhaps three raid 0 arrays? We have
> > networked storage that we could use for backups so we are not too concerned
> > about data loss.
>
> One raid 0 or 10 volume should work well for searching, and simplifies
> allocation.
>
> Doug
>
>