This is unlikely to work well or fast.  It will depend on the physical size 
of the index (not the number of docs), the number of queries/second, and the 
desired query latency.  If you can wait 10 seconds for query results and only 
a few queries are hitting the server at any one time, then you may be OK.  
Keeping things up to date with non-relevancy sorting will be quite tough.  
FieldCache will consume some RAM.  Warming it up will take some number of 
seconds.  Re-opening an IndexSearcher after index changes will also cost you 
a bit of time.
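
To put a rough number on the FieldCache cost, here is a back-of-the-envelope 
sketch.  It assumes that sorting on an int field caches one int per document, 
and that sorting on a String field caches one int per document plus one String 
per unique term.  The 40-byte per-String overhead and the 20-character average 
term length are guesses for illustration, not measurements, and the class and 
method names are hypothetical.

```java
public class FieldCacheEstimate {
    // Assumed JVM overhead per cached String, in bytes (a rough guess).
    static final long STRING_OVERHEAD_BYTES = 40L;

    // Bytes for a primitive sort array: one slot per document.
    static long intSortBytes(long maxDoc) {
        return maxDoc * Integer.BYTES;
    }

    // Bytes for a String sort: one int ordinal per document, plus one
    // String (overhead + 2 bytes per char) per unique term.
    static long stringSortBytes(long maxDoc, long uniqueTerms, int avgChars) {
        long ordinals = maxDoc * Integer.BYTES;
        long terms = uniqueTerms * (STRING_OVERHEAD_BYTES + 2L * avgChars);
        return ordinals + terms;
    }

    public static void main(String[] args) {
        long maxDoc = 30_000_000L; // document count from the thread
        double mib = 1024.0 * 1024.0;
        System.out.printf("int sort field:    %.0f MiB%n",
                intSortBytes(maxDoc) / mib);
        // Worst case for a String sort: every document has a distinct value.
        System.out.printf("String sort field: %.0f MiB%n",
                stringSortBytes(maxDoc, maxDoc, 20) / mib);
    }
}
```

Under those assumptions an int sort field stays near a hundred megabytes, 
while a String sort field with mostly unique values can by itself exceed a 
1.5 GB heap.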

Consider a 64-bit server with more RAM that allows larger Java heaps, and try 
to fit your index into RAM.
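
For a sense of scale, the sketch below multiplies out the raw document sizes 
quoted in the original message (1k to 10k per doc, 30 million docs).  The 
actual index size depends on which fields are stored and indexed, so treat 
this only as a rough bound on the source data, not as a prediction of the 
index size.  The class and method names are hypothetical.

```java
public class CorpusSizeEstimate {
    // Raw size of the source documents in bytes.
    static long rawBytes(long docs, long bytesPerDoc) {
        return docs * bytesPerDoc;
    }

    // Convert a byte count to GiB for display.
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long docs = 30_000_000L;                 // from the thread
        long low = rawBytes(docs, 1L * 1024);    // 1k per doc
        long high = rawBytes(docs, 10L * 1024);  // 10k per doc
        System.out.printf("raw text: %.1f to %.1f GiB%n",
                toGiB(low), toGiB(high));
    }
}
```

Either end of that range is far beyond a 1.5 GB heap, which is why a 64-bit 
JVM and plenty of RAM matter here.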

Otis

----- Original Message ----
From: Mark Miller <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Saturday, August 12, 2006 7:45:15 PM
Subject: Re: 30 milllion+ docs on a single server

The single server is important because I think it will take a lot of 
work to scale it to multiple servers. The index must allow for close to 
real-time updates and additions. It must also remain searchable at all 
times (other than during the brief period of single updates and 
additions). If it is easy to scale this to multiple servers please tell 
me how.

- Mark
> Why is a single server so important?  I can scale horizontally much 
> more cheaply than I can scale vertically.
>
>
>
> On 8/11/06, Mark Miller <[EMAIL PROTECTED]> wrote:
>>
>> I've made a nice little archive application with Lucene. I made it to
>> handle our largest need: 2.5 million docs or so on a single server. Now
>> the powers that be say: let's use it for a 30+ million document archive
>> on a single server! (each doc size maybe 10k max...as small as 1 or
>> 2k) Please tell me why we are in trouble...please tell me why we are
>> not. I have tested up to 2 million docs without much trouble but 30
>> million...the average search will include a sort on a field as
>> well...can I search 30+ million docs with a sort? Man am I worried about
>> that. Maybe the server will have 8 procs and 12 billion gigs of RAM.
>> Maybe. Even still, Tomcat seems to be able to launch with a max of 1.5
>> or 1.6 gig of RAM in Windows. What do you think? 30 million+ sounds like
>> too much of a load to me for a single server. Not that they care what I
>> think...I only wrote the thing (man I hate my job, offer me a new one :)
>> )...please...comments?
>>
>> Cheers,
>>
>> Miserable Mark
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
>


