All,

We have integrated Lucene into a larger content management system. We
need to be able to quantify the performance and scalability of Lucene so
that we can compare it to some commercial systems that we also have
access to (Convera, in particular).

I searched the archives and didn't find any references to existing
benchmark tests or results for Lucene. Are there any? We are looking at
peak scales on the order of 100,000 or 1,000,000 separate indexed files
being accessed by 10,000 concurrent users (a system with 100,000
registered users of whom 10% are active at any given time). Users expect
5-10 second response times for a simple query (e.g., all docs containing
the word "spam"). I don't personally know if Convera, for example, can
meet these scales, but there is some expectation within DataChannel that
it can (it is currently used with the DataChannel Server portal
product). We need to know if Lucene can be made to work at these scales
or, if not, what its upper scale limits are and/or what would need to
be done to it to provide the scalability characteristics we're looking
for.

Of course, we have other lower-scale use cases where we have no doubt
that Lucene will perform very well.

If you're curious: we've integrated Lucene into a generic CORBA-based
full-text framework that allows our Python-based versioning content
management system to use any full-text indexer integrated through the
framework. We chose Lucene as the first integration because it is open
source, came well recommended, and appeared to be (and was in fact) the
quickest way for us to get indexing functionality implemented. 

Thanks,

Eliot

-- 
. . . . . . . . . . . . . . . . . . . . . . . .

W. Eliot Kimber | Lead Brain

1016 La Posada Dr. | Suite 240 | Austin TX  78752
    T 512.656.4139 |  F 512.419.1860 | [EMAIL PROTECTED]

w w w . d a t a c h a n n e l . c o m

_______________________________________________
Lucene-users mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/lucene-users