Hello,

We are in the process of moving to SolrCloud on Solr 9, but my team is still
maintaining a large Solr 6 farm (Solr/Lucene 6.4.2).

In the last 6 months we have noticed a good number of CorruptIndexException
errors.  We attribute most, if not all, of them to an event on vSphere where
a large number of VMs lost the ability to write to disk (not good!).  As we
encounter these errors, we re-index the affected Solr cores to fix them,
which is a good bit of work!  We have tried to be proactive and use the
Lucene index checker tool (CheckIndex) to detect corrupt cores, but it takes
a lot of time and overhead to run.
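
As a possible lighter-weight pre-screen before a full CheckIndex run, the
sketch below only verifies the codec header magic, the footer magic, and the
CRC32 stored in the footer of each file in an index directory. It assumes the
Lucene 5+/6 on-disk layout (big-endian values, a 16-byte footer made of magic,
algorithm id, and checksum); the function names check_file and scan_index are
made up for illustration. It does not validate anything inside the files, so
it is not a replacement for CheckIndex.

```python
import struct
import zlib
from pathlib import Path

CODEC_MAGIC = 0x3FD76C17                  # 1071082519, the "expected header" value
FOOTER_MAGIC = ~CODEC_MAGIC & 0xFFFFFFFF  # -1071082520 viewed as unsigned 32-bit
FOOTER_LENGTH = 16                        # magic (4) + algorithm id (4) + checksum (8)


def check_file(path, chunk_size=1 << 20):
    """Return None if the file passes the checks, else a short description."""
    size = path.stat().st_size
    if size < 4 + FOOTER_LENGTH:
        return "too short to hold a codec header and footer"
    with path.open("rb") as f:
        (header,) = struct.unpack(">I", f.read(4))
        if header != CODEC_MAGIC:
            return "codec header mismatch"
        f.seek(size - FOOTER_LENGTH)
        magic, algorithm, stored_crc = struct.unpack(">IIQ", f.read(FOOTER_LENGTH))
        if magic != FOOTER_MAGIC or algorithm != 0:
            return "codec footer mismatch (file truncated?)"
        # The stored CRC32 covers every byte before the 8-byte checksum itself.
        f.seek(0)
        crc, remaining = 0, size - 8
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                return "file shrank while reading"
            crc = zlib.crc32(chunk, crc)
            remaining -= len(chunk)
    if crc != stored_crc:
        return "checksum mismatch"
    return None


def scan_index(index_dir):
    """Map file name -> problem for every file that fails check_file."""
    problems = {}
    for path in sorted(Path(index_dir).iterdir()):
        # write.lock is an empty lock file, not a codec file (assumption).
        if not path.is_file() or path.name == "write.lock":
            continue
        problem = check_file(path)
        if problem:
            problems[path.name] = problem
    return problems
```

Calling scan_index on a core's index directory returns an empty dict when
every file passes. Files that are still being written have no footer yet, so
on a core that is actively indexing this would likely report false positives.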

I am interested in learning more about what may cause index corruption
(I wonder whether other circumstances, beyond the temporary loss of disk
access, might be causing these errors in our Solr farm).
- Is it always a problem with the Linux VM file system or storage, or a
possible issue with Lucene?
- Is there some misbehavior (or the handling of a specific scenario - high
number of atomic updates?) within Lucene/Solr that can result in corruption?
- Is there a size limit for Solr cores that, when exceeded, makes them more
vulnerable to corruption?

Below is one of the types of corruption-related error messages that we
have observed:

Caused by: org.apache.lucene.index.CorruptIndexException: codec header mismatch: actual header=-1527899865 vs expected header=1071082519 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/data/solr/instance-1/C23491/content/index/_vj4.cfs") [slice=_vj4.fnm]))
    at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:196)
    at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255)
    at org.apache.lucene.codecs.lucene60.Lucene60FieldInfosFormat.read(Lucene60FieldInfosFormat.java:117)
    at org.apache.lucene.index.IndexWriter.readFieldInfos(IndexWriter.java:1063)
    at org.apache.lucene.index.IndexWriter.getFieldNumberMap(IndexWriter.java:1079)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:968)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:125)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:240)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:114)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1852)
    ... 40 more
    Suppressed: org.apache.lucene.index.CorruptIndexException: codec footer mismatch (file truncated?): actual footer=-548541180 vs expected footer=-1071082520 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/var/data/solr/instance-1/C23491/content/index/_vj4.cfs") [slice=_vj4.fnm]))
        at org.apache.lucene.codecs.CodecUtil.validateFooter(CodecUtil.java:499)
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:411)
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:459)
        at org.apache.lucene.codecs.lucene60.Lucene60FieldInfosFormat.read(Lucene60FieldInfosFormat.java:171)
        ... 48 more
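
For anyone reading the numbers in the trace: the two "expected" values are
fixed constants (Lucene's codec magic and its bitwise complement), while the
"actual" values are simply whatever four bytes were found at that position.
A small sketch for converting between the signed ints shown in the message
and raw bytes, assuming big-endian ints as in Lucene 6 (the helper names are
made up for illustration):

```python
import struct

CODEC_MAGIC = 0x3FD76C17  # Lucene's CodecUtil.CODEC_MAGIC


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed Java int."""
    return struct.unpack(">i", struct.pack(">I", value & 0xFFFFFFFF))[0]


def as_bytes(java_int):
    """Hex of the four big-endian bytes behind a signed Java int."""
    return struct.pack(">i", java_int).hex()


print(to_signed32(CODEC_MAGIC))   # the "expected header" value
print(to_signed32(~CODEC_MAGIC))  # the "expected footer" value
print(as_bytes(-1527899865))      # bytes found where the header should be
print(as_bytes(-548541180))       # bytes found where the footer magic should be
```

Looking at the actual bytes this way may help distinguish, for example,
zero-filled blocks from content that belongs to some other part of the file.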

Thanks,
Matt
