You mentioned that as a rule of thumb each node should
only have about 20M pages. What's the main bottleneck
encountered around 20M pages? Disk I/O, or CPU speed?
Thanks.
TL
--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> Murray Hunter wrote:
> > We tested search for a 20 million page index on a dual-core
> > 64-bit machine with 8 GB of RAM, with the Nutch data stored on
> > another server over Linux NFS, and its performance was terrible.
> > It looks like the bottleneck was NFS, so I was wondering how you
> > had your storage set up. Are you using NDFS, or is it split up
> > over multiple servers?
>
> For good search performance, indexes and segments should always
> reside on local volumes, not in NDFS and not in NFS. Ideally these
> can be spread across the available local volumes, to permit more
> parallel disk i/o. As a rule of thumb, searching starts to get slow
> with more than around 20M pages per node. Systems larger than that
> should benefit from distributed search.
>
> Doug
>
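
For reference, the distributed search setup Doug describes goes
roughly like this on the Nutch of that era. The port, paths, and
host names below are placeholders, and the exact command and file
names are from memory, so double-check them against the
DistributedSearch page on the Nutch wiki:

  # On each search node, with its slice of the index and segments
  # on a local volume:
  bin/nutch server 9999 /local/disk1/crawl

  # On the web front end, set the searcher.dir property (in
  # nutch-site.xml) to a directory containing a search-servers.txt
  # file listing one "host port" pair per node, e.g.:
  #   node1 9999
  #   node2 9999

Each node then serves only its own slice (kept under the ~20M page
mark), and the front end queries the nodes in parallel and merges
the hits.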