Hello,

I think the documentation and example files for Solr 4.x need to be updated. If someone can confirm the details below, I'll be happy to fix the example, and perhaps someone with edit rights could fix the reference guide.

Due to dirty OCR and over 400 languages, we have over 2 billion unique terms in our index. In Solr 3.6 we set termIndexInterval to 1024 (8 times the default of 128) to reduce the size of the in-memory terms index. Before that, we used termIndexDivisor for a similar purpose.

We suspect that in Solr 4.10 (and probably earlier Solr 4.x versions) termIndexInterval and termIndexDivisor do not apply to the default codec and are probably unnecessary, since the default terms index now uses a much more efficient representation. According to the JavaDocs for IndexWriterConfig, the Lucene-level implementations of these settings do not apply to the default PostingsFormat implementation:

http://lucene.apache.org/core/4_10_0/core/org/apache/lucene/index/IndexWriterConfig.html#setReaderTermsIndexDivisor%28int%29

Despite this statement in the Lucene JavaDocs, example/solrconfig.xml contains the following:

    <!-- Expert: Controls how often Lucene loads terms into memory
         Default is 128 and is likely good for most everyone.
      -->
    <!-- <termIndexInterval>128</termIndexInterval> -->

The 4.10 reference manual (page 365) also has an example showing termIndexInterval.

Can someone please confirm that these two settings, termIndexInterval and termIndexDivisor, do not apply to the default PostingsFormat for Solr 4.10?

Tom
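
P.S. For anyone wondering why we raised the interval in 3.6, here is a rough sketch of the arithmetic. It assumes the 3.x behavior, where roughly every termIndexInterval-th term is held in memory and a divisor further subsamples those entries at load time. The function name is made up for illustration, and the flat 2 billion figure is a round number, not a measurement from our index. If our suspicion about 4.x is right, none of this applies to the default PostingsFormat there.

```python
def estimated_in_memory_entries(unique_terms, term_index_interval=128, divisor=1):
    """Rough count of terms-index entries held in memory (3.x-style model).

    Assumption: every `term_index_interval`-th term is written to the
    terms index, and every `divisor`-th of those is loaded by the reader.
    """
    if term_index_interval < 1 or divisor < 1:
        raise ValueError("interval and divisor must be >= 1")
    return unique_terms // (term_index_interval * divisor)


unique_terms = 2_000_000_000  # illustrative round figure

default_entries = estimated_in_memory_entries(unique_terms, 128)
tuned_entries = estimated_in_memory_entries(unique_terms, 1024)

print(default_entries)                   # 15625000
print(tuned_entries)                     # 1953125
print(default_entries // tuned_entries)  # 8
```

Under this model, going from 128 to 1024 cuts the number of in-memory entries by the same factor of 8, which is why the setting mattered so much to us in 3.6.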