Hi,
You need to create the directory first, like:

    Directory directory = new RAMDirectory();

Then you need to create the IndexEngine using that directory, like:

    // define analyzer
    StopWordsTool stopWords = new StopWordsToolImpl();
    Analyzer analyzer = new RedwoodAnalyzer(stopWords);
    IndexEngine index = new IndexEngine(directory, analyzer);
Then you can use the index to add documents, and this will be the RAM
index as required (a sketch follows below).
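For reference, here is a minimal sketch of the same idea using the stock
Lucene classes rather than the custom IndexEngine/RedwoodAnalyzer
wrappers above; the org.apache.lucene package names and the
SimpleAnalyzer choice are assumptions that may need adjusting for your
Lucene version:

    // Sketch only: builds an index that lives entirely in memory.
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.SimpleAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.RAMDirectory;

    public class RamIndexExample {
        public static void main(String[] args) throws Exception {
            Directory directory = new RAMDirectory();  // in-RAM index, nothing on disk
            Analyzer analyzer = new SimpleAnalyzer();  // any analyzer will do here
            // the "true" argument tells IndexWriter to create a fresh index
            IndexWriter writer = new IndexWriter(directory, analyzer, true);

            Document doc = new Document();
            // Field.Text = tokenized, indexed, and stored
            doc.add(Field.Text("contents", "some document text"));
            writer.addDocument(doc);
            writer.close();
        }
    }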
When doing the query, you need to pass the same Directory instance so
that Lucene uses the RAM index instead of a file-based index.
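Under the same assumptions, the query side might look roughly like this
(the static QueryParser.parse form and the Hits class shown here belong
to the old API):

    // Sketch only: searches the same in-memory Directory built above.
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.Hits;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.store.Directory;

    public class RamSearchExample {
        public static void search(Directory directory, Analyzer analyzer) throws Exception {
            // open the searcher on the SAME Directory instance used for indexing
            IndexSearcher searcher = new IndexSearcher(directory);
            Query query = QueryParser.parse("some text", "contents", analyzer);
            Hits hits = searcher.search(query);
            for (int i = 0; i < hits.length(); i++) {
                System.out.println(hits.doc(i).get("contents"));  // print each match
            }
            searcher.close();
        }
    }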
Anshuman
----- Original Message -----
Sent: Tuesday, June 12, 2001 4:43 AM
Subject: Re: [Lucene-users] Performance/Scalability Benchmarks for Lucene
While we are talking about scaling, does anyone know how to force
Lucene to use RAM for the entire index?
Frank
> Tal Dayan wrote:
> >
> > Hi Eliot,
> >
> > Are all the 10,000 doing searches all day? Can you estimate
> > the required peak number of searches per second?
>
> I would think something like 1000/second would be about right--that is,
> at any moment, 10% of the 10,000 active users would be requesting a
> search. This reflects a use case in which searching is one of the
> primary services the system provides and is one of the primary means
> of finding things in the system.
>
> > What about hardware, is a multi-server solution practical? What
> > kind of hardware do you have in mind?
>
> I think we are anticipating the usual beefy hardware you need to drive
> a system of this scale in any case--big Sun machines, high-speed
> storage, etc. That is, I think we can presume the fastest possible
> hardware (which would otherwise be a requirement for delivering the
> overall system performance, not just the indexing).
>
> > What is the expected total size of your data? How often does it
> > change or need to be reindexed?
>
> In the large-scale use case, most documents would be in the 50-100K
> range (that is, typical business documents), but there would be
> hundreds of thousands or millions of documents to be indexed. I'm not
> sure what our expected rate for adding new docs is, but I would think
> that 100/hour would be about right. Each new document would require
> re-indexing. In the non-versioned case, existing (and
> previously-indexed) docs would be replaced by new copies, which would
> have to be re-indexed. This would probably account for maybe 10 docs
> an hour.
>
> In the versioned content management use case, existing versions are
> never deleted and their indexes persist indefinitely, so indexing
> would always be additive, with no need to invalidate existing indexes
> because documents had been deleted.
>
> This is about as specific as I can be--I'm mostly wondering whether
> people have used Lucene at these sorts of scales (or something
> close--our scale targets are pretty high, reflecting the needs of the
> largest enterprises) or whether there are some existing scalability
> tests that we can run on our test bed to get some baseline numbers.
>
> Thanks,
>
> Eliot
> --
> . . . . . . . . . . . . . . . . . . . . . . .
> W. Eliot Kimber | Lead Brain
> 1016 La Posada Dr. | Suite 240 | Austin TX 78752
> T 512.656.4139 | F 512.419.1860 | [EMAIL PROTECTED]
> w w w . d a t a c h a n n e l . c o m