Hi there,

We've been having performance trouble with IndexReader's document(int docID) method <http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/IndexReader.html#document(int)>.
In summary: Why would document(int docID) <http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/index/IndexReader.html#document(int)> take a few seconds? For some docIDs it takes a millisecond and for others it takes up to a few seconds, whereas it used to consistently take a millisecond for each doc fetched.

In depth: We have a set of 6 app servers, and each serves around a million requests per day. Specs for these app servers are: Ubuntu x86_64 GNU/Linux, 8 GB RAM, Java 6 with an Xmx setting of 4 GB. We are using Lucene 4.1. The current index size is 2.1 GB.

We've started using Lucene more extensively in the last 6 months. Our index size used to be ~1.5 GB, and back then we had no problems. Recently we acquired more data and the index size jumped to 2.1 GB. When we first pushed this new data to production, we got an Out of Memory exception at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:248). The recommendation was to increase the memory, so we upped the Xmx setting to 4 GB, but then we hit https://issues.apache.org/jira/browse/LUCENE-1566 (a Lucene bug caused by the JVM). Increasing the memory of the servers to 8 GB seems to have resolved this problem.

Now the system is stable and we no longer see those exceptions, but the response times of Lucene searches have increased by 50%. While debugging the issue, I realized that IndexReader's document(int docID) call is taking an insane amount of time. It used to take a millisecond per doc, but now it can take a few seconds to fetch a single doc.

Does the readChunkSize setting of FSDirectory have anything to do with it (now that our index size is larger than the default chunk size)?
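
For reference, per-document fetch time of the kind described above can be measured with something like the sketch below. This is only an illustration, not our production code: the class name `DocFetchTimer`, the method `findSlowFetches`, and the threshold are made-up names, and `fetch` is a stand-in for a wrapper around `reader.document(docID)` (the real call throws IOException, so an actual lambda would have to catch it).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

public class DocFetchTimer {
    /**
     * Calls fetch once per doc ID and returns the IDs whose call took
     * longer than thresholdNanos, in the order they were tried.
     * fetch is a hypothetical stand-in for something like
     * id -> reader.document(id); the result of each call is discarded.
     */
    public static List<Integer> findSlowFetches(int[] docIds,
                                                IntFunction<?> fetch,
                                                long thresholdNanos) {
        List<Integer> slow = new ArrayList<>();
        for (int docId : docIds) {
            long start = System.nanoTime();   // monotonic clock, suited to intervals
            fetch.apply(docId);
            long elapsed = System.nanoTime() - start;
            if (elapsed > thresholdNanos) {
                slow.add(docId);
            }
        }
        return slow;
    }
}
```

Running this over the docIDs returned by a slow search, with a threshold of a few milliseconds, would show whether the slow fetches cluster on particular documents or are spread across the index.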