Hello,

We are in the process of moving to SolrCloud on Solr 9, but my team is still maintaining a *large Solr 6 farm (Solr/Lucene 6.4.2)*.
In the last 6 months we have noticed a good number of CorruptIndexException errors. Most, if not all, of them we relate to an event on vSphere where a large number of VMs lost the ability to write to disk (not good!). As we encounter these errors, we have re-indexed the affected Solr cores to fix them. A good bit of work! We have tried to be proactive and use the Lucene index checker tool (CheckIndex) to detect corrupt cores; however, it adds a lot of overhead and takes a long time to run.

I am interested to learn more about what may cause index corruption (wondering if other circumstances, beyond the temporary loss-of-disk event, might be causing these errors in our Solr farm).

- Is it always a problem with the Linux VM file system or storage, or is it possibly an issue with Lucene?
- Is there some misbehavior (or the handling of a specific scenario, such as a high number of atomic updates) within Lucene/Solr that can result in corruption?
- Is there a size limit for Solr cores that, when exceeded, makes them more vulnerable to corruption?

Below is one of the types of corruption-related error messages that we have observed:

Caused by: org.apache.lucene.index.CorruptIndexException: codec header mismatch: actual header=-1527899865 vs expected header=1071082519 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/data/solr/instance-1/C23491/content/index/_vj4.cfs") [slice=_vj4.fnm]))
    at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:196)
    at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255)
    at org.apache.lucene.codecs.lucene60.Lucene60FieldInfosFormat.read(Lucene60FieldInfosFormat.java:117)
    at org.apache.lucene.index.IndexWriter.readFieldInfos(IndexWriter.java:1063)
    at org.apache.lucene.index.IndexWriter.getFieldNumberMap(IndexWriter.java:1079)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:968)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:125)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:240)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:114)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1852)
    ... 40 more
    Suppressed: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=-548541180 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/data/solr/instance-1/C23491/content/index/_vj4.cfs") [slice=_vj4.fnm]))
        at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:499)
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:411)
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:459)
        at org.apache.lucene.codecs.lucene60.Lucene60FieldInfosFormat.read(Lucene60FieldInfosFormat.java:171)
        ... 48 more

Thanks,
Matt
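
P.S. For anyone reading the trace: the "expected header" value 1071082519 is Lucene's codec magic number (0x3FD76C17), and the "expected footer" value -1071082520 is its bitwise complement. Below is a rough sketch of a lightweight pre-check built on that, which might be cheaper than a full index check. It is not a Lucene tool and not a replacement for the index checker. It assumes the Lucene 6.x on-disk layout as I understand it: a big-endian 4-byte magic at the start of each top-level index file, and a 16-byte footer (magic, algorithm id 0, CRC32 stored as an 8-byte long) whose checksum covers every byte before the checksum itself. The function name quick_check and the chunk size are made up for illustration.

```python
import os
import struct
import zlib

CODEC_MAGIC = 0x3FD76C17      # 1071082519, the "expected header" in the trace
FOOTER_MAGIC = ~CODEC_MAGIC   # -1071082520, the "expected footer" in the trace
FOOTER_LENGTH = 16            # magic (4) + algorithm id (4) + CRC32 as a long (8)


def quick_check(path, chunk_size=1 << 20):
    """Return a list of problems for one index file (hypothetical helper).

    Assumes the big-endian header/footer layout described above;
    an empty list means the header, footer and checksum all look sane.
    """
    size = os.path.getsize(path)
    if size < 4 + FOOTER_LENGTH:
        return ["file too short to hold a codec header and footer"]

    problems = []
    with open(path, "rb") as f:
        head = f.read(4)
        (header,) = struct.unpack(">i", head)
        if header != CODEC_MAGIC:
            problems.append("header mismatch: actual=%d" % header)

        # CRC32 over everything except the trailing 8 checksum bytes.
        crc = zlib.crc32(head)
        remaining = size - 8 - len(head)
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
            remaining -= len(chunk)

        f.seek(size - FOOTER_LENGTH)
        footer_magic, algorithm_id, stored_crc = struct.unpack(
            ">iiq", f.read(FOOTER_LENGTH)
        )

    if footer_magic != FOOTER_MAGIC:
        problems.append("footer mismatch (file truncated?): actual=%d" % footer_magic)
    if algorithm_id != 0:
        problems.append("unexpected checksum algorithm id: %d" % algorithm_id)
    if stored_crc != crc:
        problems.append("checksum mismatch: stored=%d computed=%d" % (stored_crc, crc))
    return problems
```

The idea would be to run this over each file in a core's index directory (skipping write.lock, which carries no header) and only fall back to the full checker or a re-index for cores that report problems. It reads every byte once but does no decoding of postings or stored fields, so it cannot catch logical corruption that still passes the checksum.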