"Michael D. Curtin" <[EMAIL PROTECTED]> wrote on 12/06/2006 03:49:53 PM:
> Nadav Har'El wrote:
>
> > What I couldn't figure out how to use, however, was the abundant
> > memory (2 GB) that this machine has.
> >
> > I tried playing with IndexWriter.setMaxBufferedDocs(), and noticed
> > that there is no speed gain after I set it to 1000, at which point
> > the running Lucene takes up just 70 MB of memory, or 140 MB for the
> > two threads.
>
> It may not be a Lucene limit per se, but a JVM limit instead.  What
> are you using for the JVM's heap (via -Xms and -Xmx switches)?  For
> example, I often run with java -Xmx1000m to let the heap grow to a
> gigabyte, if necessary.

Sure, I used -Xmx500m, for example, but Java still only grew to 70 MB.
When I continued to increase setMaxBufferedDocs(), Java's memory use
grew, but performance did not continue to improve (it even dropped
slightly; I don't know why). So apparently the problem isn't Java's
memory limit; it's that Lucene simply doesn't want to use more memory.
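
(For the archives, here is roughly what each indexing thread does.
This is a minimal sketch against the Lucene 2.x API; the index path,
the choice of StandardAnalyzer, and the "body" field are placeholders,
not my actual code.)

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    // Launched with e.g.: java -Xmx500m Indexer
    // The heap limit is generous, yet resident memory stays near 70 MB,
    // because setMaxBufferedDocs() caps how many documents Lucene
    // buffers in RAM before flushing a segment to disk.
    public class Indexer {
        public static void main(String[] args) throws Exception {
            IndexWriter writer =
                new IndexWriter("/path/to/index", new StandardAnalyzer(), true);
            writer.setMaxBufferedDocs(1000); // flush every 1000 buffered docs

            // One Document per input record; "body" is a placeholder field.
            Document doc = new Document();
            doc.add(new Field("body", "document text here",
                              Field.Store.NO, Field.Index.TOKENIZED));
            writer.addDocument(doc);

            writer.close();
        }
    }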

> Might I also suggest that you not try to index all of this data in a
> single invocation of a Java program.  That is, index a portion, say
> 10GB at a time, and then use addIndexes() later to bring them
> together.  Set your granularity based on the amount of time you can
> stand to do work over, when the seemingly inevitable problems crop
> up.  It sure would stink to be working on the 1000th GB of input data
> and have a power supply go out, and then have to start all the way
> over from the beginning!  Other checkpointing schemes are possible,
> if you have the time and inclination to be more clever, too ...

Good idea.
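
(For the archives again, a rough sketch of what the merge step might
look like, assuming the Lucene 2.x API; the chunk index directories
passed on the command line and the final index path are placeholders:)

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    // Each chunk (say, 10GB of input) was indexed earlier into its own
    // directory, so a crash costs at most one chunk of work.
    public class MergeChunks {
        public static void main(String[] args) throws Exception {
            Directory[] chunks = new Directory[args.length];
            for (int i = 0; i < args.length; i++) {
                // Open each existing partial index (false = don't create).
                chunks[i] = FSDirectory.getDirectory(args[i], false);
            }
            IndexWriter writer = new IndexWriter("/path/to/final-index",
                                                 new StandardAnalyzer(), true);
            writer.addIndexes(chunks); // merge the partial indexes into one
            writer.close();
        }
    }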

Thanks,
Nadav.

