Your index has relatively few terms: ~13 million.
Lucene stores TermInfo instances in two places. The first place is a
persistent array, called the terms index, of every 128th term. It's
created when the IndexReader is first opened. So in your case this is
~100.000 ("100 thousand") instances. The JPEG in your original post
on the Compass forum is tracing to this terms index.
Then, per thread per segment there is an LRU cache of recently used
TermInfo instances. That cache is 1024 in size. Depending on how
many threads, segments and how many unique queries you are testing
with, this will add some number of TermInfo instances.
So... seeing a lot of TermInfo instances is fully normal.
Are you actually hitting OOM, or just noticing a lot of TermInfo
instances in YourKit?
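As a rough back-of-the-envelope sketch of the arithmetic above (not code from Lucene itself): the class and method names are made up for illustration, and the thread and segment counts are placeholder assumptions, since the thread does not state them. Only the 128 interval, the 1024 cache size, and the ~13 million terms come from the explanation above.

```java
// Hypothetical helper: estimates how many TermInfo instances to expect.
class TermInfoEstimate {
    // Every 128th term is kept in the in-memory terms index.
    static final int INDEX_INTERVAL = 128;
    // Size of the LRU cache kept per thread per segment.
    static final int CACHE_SIZE = 1024;

    // Approximate number of terms index entries (rounded up).
    static long termsIndexEntries(long totalTerms) {
        return (totalTerms + INDEX_INTERVAL - 1) / INDEX_INTERVAL;
    }

    // Upper bound on cached entries if every per-thread,
    // per-segment cache is completely full.
    static long maxCachedEntries(int threads, int segments) {
        return (long) threads * segments * CACHE_SIZE;
    }

    public static void main(String[] args) {
        long totalTerms = 13000000L; // ~13 million terms, from CheckIndex
        int threads = 50;            // assumption: e.g. a Tomcat worker pool
        int segments = 10;           // assumption: segment count is unknown

        long indexed = termsIndexEntries(totalTerms);
        long cached = maxCachedEntries(threads, segments);

        System.out.println("terms index entries: " + indexed);
        System.out.println("max cached entries:  " + cached);
        System.out.println("upper bound total:   " + (indexed + cached));
    }
}
```

With these placeholder numbers the terms index contributes about 100 thousand instances and the caches up to about 512 thousand more, so the cache term can dominate when many threads and segments are involved.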
How many SegmentReader instances do you see held open in YourKit?
Mike
chanchitodata wrote:
Hi Michael,
I'm pretty sure that the IndexReaders are being closed. As I said, I use
Compass, and Compass handles all the IndexReader stuff for me. I have
discussed this issue with Shay Banon for a while in the Compass forum, and
he was the one who led me to this forum after several different tests we
did.
I have attached a text file with the output of CheckIndex on the biggest
index I have. I have also attached an image with the Luke overview of the
same index.
Best regards,
/Rodrigo
http://www.nabble.com/file/p21932951/Checkindex.txt Checkindex.txt
http://www.nabble.com/file/p21932951/LukeOverview.JPG LukeOverview.JPG
Michael McCandless-2 wrote:
Are you certain that old IndexReaders are being closed?
If you are not using the CFS file format, how large are your *.tii files?
If you are using the CFS file format, can you run CheckIndex on your index
and post the output? This way we can see how many terms are in the
index (which is what gets loaded as TermInfo instances).
Mike
chanchitodata wrote:
Hi,
I have a weird problem. I use Lucene 2.4 in a web application (Tomcat
5.5.x), running under JDK 1.5. After a while (from one day to a couple,
depending on traffic) all memory gets eaten up by a lot of TermInfo
instances. I have profiled the application and I can see that the TermInfo
instances do not get reclaimed by the GC.
I also use Compass and have been posting on a thread in the Compass
forum (http://forum.compass-project.org/thread.jspa?threadID=215943&start=0&tstart=0),
thinking that it was Compass that held on to something in memory, but we have
come to the conclusion that it must be Lucene that keeps holding on to the
memory.
I have read the thread
http://www.nabble.com/OutOfMemory-Problems-Lucene-2.4---Tomcat-td20236834.html#a20236834
and tried to divide my index into smaller parts, but with the same result.
The index is about 1.5 GB, with around 2.7 million documents and approx.
30 fields.
/Rodrigo
--
View this message in context:
http://www.nabble.com/Memory-Eaten-up-by-TermInfo-Instances-in-Lucene-2.4-tp21913262p21913262.html
Sent from the Lucene - Java Users mailing list archive at
Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
--
View this message in context:
http://www.nabble.com/Memory-Eaten-up-by-TermInfo-Instances-in-Lucene-2.4-tp21913262p21932951.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.