Hi, I am using Lucene 3.5 and hitting an OutOfMemoryError while indexing 100M documents. Each document carries 3 UUIDs as separate field values. I use Store.YES on one of them and Store.NO on the other two, Index.NOT_ANALYZED_NO_NORMS on all three, explicitly call field.setIndexOptions(IndexOptions.DOCS_ONLY), and set indexWriterConfig.setTermIndexInterval(termIndexInterval) to 1024.
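In case it helps, here is roughly how I am setting things up (the paths, variable names, and the analyzer below are placeholders, not my exact code):

import java.io.File;
import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.FieldInfo.IndexOptions;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

Directory dir = FSDirectory.open(new File("/path/to/index"));
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_35, new KeywordAnalyzer());
iwc.setTermIndexInterval(1024);
IndexWriter writer = new IndexWriter(dir, iwc);

// One stored UUID field, two unstored ones; all NOT_ANALYZED_NO_NORMS and DOCS_ONLY.
Document doc = new Document();
Field stored = new Field("uuid1", uuid1, Field.Store.YES, Field.Index.NOT_ANALYZED_NO_NORMS);
stored.setIndexOptions(IndexOptions.DOCS_ONLY);
doc.add(stored);
Field unstored = new Field("uuid2", uuid2, Field.Store.NO, Field.Index.NOT_ANALYZED_NO_NORMS);
unstored.setIndexOptions(IndexOptions.DOCS_ONLY);
doc.add(unstored);
// ... same as uuid2 for the third UUID field ...
writer.addDocument(doc);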
Is there any reason FreqProxTermsWriterPerField.FreqProxPostingsArray needs to be constructed even though I have frequencies and positions suppressed? It seems that the reason I get an OutOfMemoryError is that 7 int[], each sized proportionally to the number of unique terms, are being constructed; at least some of them are probably wasteful given my indexing configuration. Any help is appreciated.

Thanks,
-Ken

[junit] Error:
[junit] Exception in thread "Thread-18" java.lang.OutOfMemoryError: Java heap space
[junit]     at org.apache.lucene.index.ParallelPostingsArray.<init>(ParallelPostingsArray.java:35)
[junit]     at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.<init>(FreqProxTermsWriterPerField.java:190)
[junit]     at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.newInstance(FreqProxTermsWriterPerField.java:204)
[junit]     at org.apache.lucene.index.ParallelPostingsArray.grow(ParallelPostingsArray.java:48)
[junit]     at org.apache.lucene.index.TermsHashPerField.growParallelPostingsArray(TermsHashPerField.java:137)
[junit]     at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:440)
[junit]     at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:94)
[junit]     at org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(DocFieldProcessorPerThread.java:278)
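P.S. My back-of-the-envelope math, in case I am misreading the code. The buffered-doc count below is hypothetical, and I am assuming the parallel arrays are sized by the unique terms buffered in RAM since the last flush:

// FreqProxPostingsArray appears to hold 7 parallel int[] per indexed field:
// textStarts, intStarts, byteStarts (from ParallelPostingsArray) plus
// docFreqs, lastDocIDs, lastDocCodes, lastPositions.
long docsBuffered = 10000000L; // hypothetical docs held in RAM before a flush
int fields = 3;                // my 3 UUID fields
int intsPerTerm = 7;
int bytesPerInt = 4;
// Every UUID value is unique, so unique terms per field ~= buffered doc count.
long bytes = docsBuffered * fields * intsPerTerm * bytesPerInt;
System.out.println("~" + (bytes / (1024 * 1024)) + " MB for the parallel arrays alone");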