Hi,

I am trying to index a huge collection of documents - several hundred
gigabytes.
Needless to say, I'm trying to squeeze every ounce of performance from my
machine, to get this indexing done in a sensible amount of time.

Making use of the fact that my machine has two CPUs was easy: I simply run
two indexing threads, creating two separate indices which can later be
merged, or searched together with a MultiSearcher.
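To illustrate what I mean, here is a minimal sketch of the two-thread split
(the indexInto() method is a stand-in for the real per-thread work of adding
documents to its own IndexWriter; the class and method names are just
illustrative, not actual Lucene API):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of indexing one collection with two threads, one index per thread.
// indexInto() is a placeholder: in real code each worker would open its own
// IndexWriter on a separate directory, and the two resulting indices would
// later be merged or searched with a MultiSearcher.
public class ParallelIndexSketch {
    // Placeholder for calling IndexWriter.addDocument() on each document.
    static int indexInto(List<String> docs) {
        return docs.size(); // pretend every document was indexed
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 10; i++) docs.add("doc-" + i);

        // Split the collection in half, one slice per CPU.
        List<String> first = docs.subList(0, docs.size() / 2);
        List<String> second = docs.subList(docs.size() / 2, docs.size());

        int[] counts = new int[2];
        Thread t1 = new Thread(() -> counts[0] = indexInto(first));
        Thread t2 = new Thread(() -> counts[1] = indexInto(second));
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Every document ends up in exactly one of the two indices.
        System.out.println(counts[0] + counts[1]);
    }
}
```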

What I couldn't figure out, however, was how to make use of the abundant
memory (2 GB) that this machine has.

I tried playing with IndexWriter.setMaxBufferedDocs(), and noticed that
there is no speed gain after I set it to 1000, at which point the Lucene
process takes up just 70 MB of memory, or 140 MB for the two threads.

Is there a way for Lucene to make use of the much larger memory I have, to
speed up the indexing process? Does having a huge memory somehow improve
the speed of huge merges, for example?

Thanks,
Nadav.

--
Nadav Har'El
[EMAIL PROTECTED]
+972-4-829-6326


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
