That's big, and while I have not created indices that large with Lucene, I would expect disk I/O to be the biggest issue. That is why Nutch has distributed search built in, and even its demo has 'only' 100M documents. Perhaps you can mimic Nutch's distributed indexing and searching approach; see the sketch below.
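To make that concrete, here is a minimal sketch against the Lucene 1.x API of searching several index shards through a single MultiSearcher. The shard paths and the query field name are hypothetical; this is one process searching local shards, not a real distributed setup.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.MultiSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.Searchable;

    public class ShardedSearch {
        public static void main(String[] args) throws Exception {
            // Hypothetical shard locations; an index of 315M+ documents
            // would be split into many more pieces, ideally one per
            // disk or machine so the I/O load is spread out.
            Searchable[] shards = {
                new IndexSearcher("/indexes/shard0"),
                new IndexSearcher("/indexes/shard1"),
                new IndexSearcher("/indexes/shard2")
            };

            // MultiSearcher runs the query against every shard and
            // merges the hits into a single ranked result set.
            MultiSearcher searcher = new MultiSearcher(shards);

            Query query = QueryParser.parse("distributed search",
                                            "contents",
                                            new StandardAnalyzer());
            Hits hits = searcher.search(query);
            System.out.println("Total hits across shards: " + hits.length());

            searcher.close();
        }
    }

To go further in the Nutch direction, each IndexSearcher could be replaced with a Searchable served from another machine (Lucene ships an RMI-based RemoteSearchable for this), or with ParallelMultiSearcher to at least query local shards concurrently.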
Otis

--- Will Allen <[EMAIL PROTECTED]> wrote:
> Hi,
> I am considering a project that would index 315+ million documents.
> I am comfortable that the indexing will work well in creating an
> index ~800GB in size, but am concerned about the query performance.
> (Is this a bad assumption?)
>
> What are the bottlenecks of performance as an index scales? Memory?
> Cost is not a concern, so what would be the shortcomings of a
> theoretical machine with 16GB of RAM, 4-16 CPUs and 1-2 terabytes
> of space? Would it be better to cluster machines to break apart
> the query?
>
> Thank you for your serious responses,
> Will Allen
