IndexReader doc method performance troubles

G B Tue, 14 May 2013 13:19:43 -0700

Hi there,
We've been having performance trouble with IndexReader's
document(int docID) method:
http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/IndexReader.html#document(int)

In summary:
Why would document(int docID) take a few seconds? For some docIDs it takes
a millisecond and for others it takes up to a few seconds, whereas it used
to consistently take a millisecond for each doc fetched.

In depth:
We have a set of 6 app servers, and each serves around a million requests
per day.
Specs for these app servers: Ubuntu x86_64 GNU/Linux, 8 GB RAM, Java 6
with an Xmx setting of 4 GB, Lucene 4.1. Current index size is 2.1 GB.

We've started using Lucene more extensively in the last 6 months. Our index
size used to be ~1.5 GB, and back then we had no problems.
Recently we acquired more data and the index size jumped to 2.1 GB. When we
first pushed this new data to production, we got an Out of Memory exception
at
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:248).
The recommendation was to increase the memory, so we upped the Xmx setting
to 4 GB, but then we ran into
https://issues.apache.org/jira/browse/LUCENE-1566 (Lucene bug caused by the
JVM). Increasing the memory of the servers to 8 GB seems to have resolved
this problem. Now the system is stable and there are no more of those
exceptions, but the response times of Lucene searches have increased by
50%. Debugging the issue, I realized that IndexReader's document(int docID)
is taking an insane amount of time. It used to take a millisecond per call,
but now it can take a few seconds to fetch a single doc. Does the
readChunkSize setting of FSDirectory have anything to do with it (now that
our index size is larger than the default chunk size)?
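
One way to pin down which docIDs are slow is to time each fetch on its own
and collect the ones that exceed a threshold. The following is only a
minimal sketch of that idea, not code from the application described above.
SlowDocFinder and DocFetcher are hypothetical names, not part of Lucene, and
the fetcher is a stand-in for whatever call is being measured (here it would
wrap reader.document(docId)). It sticks to Java 6 syntax and the standard
library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: times each fetch and
// reports the docIDs whose fetch exceeded a threshold.
final class SlowDocFinder {

    /** Stand-in for the call being measured, e.g. reader.document(docId). */
    interface DocFetcher {
        Object fetch(int docId) throws Exception;
    }

    /** Returns the docIDs in [0, maxDoc) whose fetch took longer than thresholdNanos. */
    static List<Integer> findSlow(DocFetcher fetcher, int maxDoc, long thresholdNanos) {
        List<Integer> slow = new ArrayList<Integer>();
        for (int docId = 0; docId < maxDoc; docId++) {
            long start = System.nanoTime();
            try {
                fetcher.fetch(docId);
            } catch (Exception e) {
                // Wrap checked exceptions (such as IOException) so callers stay simple.
                throw new RuntimeException("fetch failed for docID " + docId, e);
            }
            long elapsed = System.nanoTime() - start;
            if (elapsed > thresholdNanos) {
                slow.add(docId);
            }
        }
        return slow;
    }
}
```

With a real index, the fetcher would be an anonymous DocFetcher whose fetch
method returns reader.document(docId), and maxDoc would come from
reader.maxDoc(). If the slow docIDs turn out to be different on every run,
that points away from specific documents and toward something environmental
such as disk reads or GC pauses, though this sketch alone cannot tell those
apart.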
