Hi Robert,

Our sysadmins installed a later Java version (info below). I redid the merge and then ran CheckIndex, both using Java 7. Same error (appended below).
I suppose I could try merging 2 indexes, run CheckIndex, and if it's OK merge 3 indexes, and so on up to 12, to find the point where the problem occurs. But these are 45GB indexes, so merging all 12 takes about a day, and running CheckIndex feels like it takes a day, although it's probably only a few hours.

Any hints on an easier way to troubleshoot, or ideas about what might be causing the problem?

Tom

java version "1.7.0_09"
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.5-b02, mixed mode)

Opening index @ bigramsJava7

Segments file=segments_1 numSegments=1 version=3.6 format=FORMAT_3_1 [Lucene 3.1+]
  1 of 1: name=_c docCount=865870
    compound=false
    hasProx=true
    numFiles=8
    size (MB)=309,357.885
    diagnostics = {mergeFactor=12, os.version=2.6.18-308.24.1.el5, os=Linux, lucene.version=3.6-SNAPSHOT exported - tom - 2012-11-06 14:16:41, source=merge, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_09, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: fields..............OK [87 fields]
    test: field norms.........OK [43 fields]
    test: terms, freq, prox...ERROR [133157597]
java.lang.ArrayIndexOutOfBoundsException: 133157597
        at org.apache.lucene.index.TermInfosReaderIndex.compareField(TermInfosReaderIndex.java:249)
        at org.apache.lucene.index.TermInfosReaderIndex.compareTo(TermInfosReaderIndex.java:225)
        at org.apache.lucene.index.TermInfosReaderIndex.getIndexOffset(TermInfosReaderIndex.java:156)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
        at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:66)
        at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:715)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:578)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1064)
    test: stored fields.......OK [32361128 total field count; avg 37.374 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Term Index test failed
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:591)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1064)

WARNING: 1 broken segments (containing 865870 documents) detected
WARNING: would write new segments file, and 865870 documents would be lost, if -fix were specified

On Wed, Dec 5, 2012 at 5:29 PM, Robert Muir <rcm...@gmail.com> wrote:

> On Wed, Dec 5, 2012 at 2:27 PM, Tom Burton-West <tburt...@umich.edu> wrote:
>
> > Thanks Robert,
> >
> > I've asked our sysadmins to install a more recent Java version for testing.
> > I'll report back if it fails with the newer Java version.
>
> Please let us know either way!
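
Regarding the incremental-merge idea in the message above (merge 2, check, merge 3, check, and so on): if one assumes that a failure, once it appears at some number of merged indexes, also appears for every larger number, the search could be done as a binary search instead of a linear one, which would need roughly 4 merge-and-check runs rather than up to 10. The sketch below is only an illustration under that assumption. The class name, the PrefixCheck interface, and the stand-in check are hypothetical; a real check would have to merge the first n indexes and run CheckIndex on the result, and none of that is shown here.

```java
public class MergeBisect {

    /** Hypothetical hook: returns true if merging the first n indexes yields a broken index. */
    public interface PrefixCheck {
        boolean fails(int n);
    }

    /**
     * Finds the smallest n in [lo, hi] for which check.fails(n) is true.
     * Assumes the caller already knows that hi fails (here: all 12 merged),
     * and that failure is monotone: if n fails, every larger n fails too.
     */
    public static int firstFailingPrefix(int lo, int hi, PrefixCheck check) {
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (check.fails(mid)) {
                hi = mid;       // failure is at mid or earlier
            } else {
                lo = mid + 1;   // failure is strictly after mid
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Stand-in check for illustration only: pretends the problem
        // first shows up once 7 indexes have been merged.
        PrefixCheck standIn = new PrefixCheck() {
            public boolean fails(int n) {
                return n >= 7;
            }
        };
        int n = firstFailingPrefix(2, 12, standIn);
        System.out.println("First failing merge size: " + n);
    }
}
```

If the failure turns out not to be monotone (for example, if it depends on which particular indexes are combined rather than on how many), this search would give a misleading answer and the linear approach would be the safer one.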