I am trying to track down an issue in Lucene 2.9.2 where, during highlighting, certain data causes rapid memory growth and an OutOfMemoryError in Java:
-------
java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.analysis.Token.growTermBuffer(Token.java:470)
        at org.apache.lucene.analysis.Token.setTermBuffer(Token.java:395)
        at org.apache.lucene.search.highlight.TokenSources.getTokenStream(TokenSources.java:200)
        at org.apache.lucene.search.highlight.TokenSources.getTokenStream(TokenSources.java:112)
        at org.apache.lucene.search.highlight.TokenSources.getTokenStream(TokenSources.java:249)
        at com.bmc.arsys.fts.impl.lucene.LuceneFTSService.doHighlight(LuceneFTSService.java:1871)
-------

doHighlight is our method that performs the highlighting. I have searched for this issue but so far have not found any hits on Google, etc. The beginning of doHighlight looks like this:

------
private String doHighlight(IndexReader indexReader, int docId, String strFieldName,
                           Query query, String strText, boolean isTitle,
                           String markupLeft, String markupRight)
{
    String strBestText = null;
    try
    {
        TokenStream tokenStream =
            TokenSources.getTokenStream(indexReader, docId, strFieldName);
        QueryScorer scorer = new QueryScorer(query, strFieldName);
        Fragmenter fragmenter = null;
------

It fails during the TokenSources.getTokenStream call. By "fails" I mean the memory shoots from 1.5 GB to beyond 8.0 GB, at which point we stopped experimenting with adding more heap. The entire collection directory is only 4.4 GB and the search strings are usually very simple, so it seems to be related to the data returned in certain cases: for example, a search for "db" triggers the blow-up on some data even when there are only a few (<10) hits. Has anyone seen this before?
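For what it's worth, one plausible mechanism (an assumption on my part, not verified against the 2.9.2 source) is that the term-vector path of TokenSources.getTokenStream rebuilds the stream by materializing one Token, each with its own term-buffer copy, per position of every term in the field. A short stdlib-only mimic of that amplification (the class and method names MiniToken and estimateReconstructionChars are hypothetical, not Lucene API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TokenBlowupDemo {
    // Hypothetical stand-in for org.apache.lucene.analysis.Token: each
    // reconstructed token copies the term text into its own char buffer.
    static final class MiniToken {
        final char[] termBuffer;
        MiniToken(String term) { termBuffer = term.toCharArray(); }
    }

    // Estimate the chars allocated when a token stream is rebuilt from a
    // term vector: one buffer copy per position of every term.
    static long estimateReconstructionChars(Map<String, Integer> termFreqs) {
        long chars = 0;
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            chars += (long) e.getKey().length() * e.getValue();
        }
        return chars;
    }

    public static void main(String[] args) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        // A short term repeated millions of times in one large document.
        freqs.put("db", 5_000_000);
        long chars = estimateReconstructionChars(freqs);
        // 10M chars is ~20 MB of raw char data alone, before the per-Token
        // object and buffer-growth overhead is counted.
        System.out.println(chars);
    }
}
```

If something like this is the mechanism, a workaround worth trying (again, just a guess) is to bypass the term-vector reconstruction and re-analyze the stored text instead, e.g. analyzer.tokenStream(strFieldName, new java.io.StringReader(strText)), trading CPU for bounded memory.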