Hi Robert,
Our sysadmins installed a later Java version (info below). I redid the
merge and then ran CheckIndex, both using Java 7, and got the same error
(appended below).
I suppose I could try merging 2 indexes, run CheckIndex, and if it's OK
merge 3 indexes, and so on up to 12, to find the point where the problem
occurs. But these are 45GB indexes, so merging all 12 takes about a day, and
running CheckIndex feels like it takes a day, although it's probably only a
few hours.
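
One way to cut down the number of day-long trial merges is to bisect over the prefix length instead of stepping 2, 3, ... 12. The sketch below is a variation on the plan above, not something from the thread. It assumes the failure is monotonic (if merging the first k indexes fails CheckIndex, merging more than k also fails), which may not hold if the cause is something other than a single bad index or total size. The class name, method name, and the predicate standing in for "merge the first k indexes and run CheckIndex" are all hypothetical placeholders.

```java
import java.util.function.IntPredicate;

class MergeBisect {
    /**
     * Finds the smallest k in [2, n] whose merged prefix is broken.
     * Assumes the merge of all n indexes is already known to be broken
     * and that brokenness is monotonic in k (an assumption, see above).
     * The predicate is a hypothetical stand-in for "merge first k, run CheckIndex".
     */
    static int firstBrokenPrefix(int n, IntPredicate mergeOfFirstKIsBroken) {
        int lo = 2;   // smallest prefix worth merging
        int hi = n;   // invariant: the prefix of length hi is broken
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (mergeOfFirstKIsBroken.test(mid)) {
                hi = mid;       // failure already present in the first mid indexes
            } else {
                lo = mid + 1;   // first mid indexes are fine, look further right
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        // Fake predicate for illustration only: pretend the 7th index introduces the failure.
        IntPredicate fake = k -> k >= 7;
        System.out.println(firstBrokenPrefix(12, fake));
    }
}
```

With 12 indexes this needs roughly log2(n) trial merges rather than up to 11, at the cost of relying on the monotonicity assumption.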
Any hints on an easier way to troubleshoot or ideas about what might be
causing the problem?
Tom
java version "1.7.0_09"
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.5-b02, mixed mode)
Opening index @ bigramsJava7
Segments file=segments_1 numSegments=1 version=3.6 format=FORMAT_3_1 [Lucene 3.1+]
1 of 1: name=_c docCount=865870
compound=false
hasProx=true
numFiles=8
size (MB)=309,357.885
diagnostics = {mergeFactor=12, os.version=2.6.18-308.24.1.el5, os=Linux, lucene.version=3.6-SNAPSHOT exported - tom - 2012-11-06 14:16:41, source=merge, os.arch=amd64, mergeMaxNumSegments=1, java.version=1.7.0_09, java.vendor=Oracle Corporation}
no deletions
test: open reader.........OK
test: fields..............OK [87 fields]
test: field norms.........OK [43 fields]
test: terms, freq, prox...ERROR [133157597]
java.lang.ArrayIndexOutOfBoundsException: 133157597
    at org.apache.lucene.index.TermInfosReaderIndex.compareField(TermInfosReaderIndex.java:249)
    at org.apache.lucene.index.TermInfosReaderIndex.compareTo(TermInfosReaderIndex.java:225)
    at org.apache.lucene.index.TermInfosReaderIndex.getIndexOffset(TermInfosReaderIndex.java:156)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
    at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java:66)
    at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:715)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:578)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1064)
test: stored fields.......OK [32361128 total field count; avg 37.374 fields per doc]
test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
FAILED
WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Term Index test failed
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:591)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1064)
WARNING: 1 broken segments (containing 865870 documents) detected
WARNING: would write new segments file, and 865870 documents would be lost, if -fix were specified
On Wed, Dec 5, 2012 at 5:29 PM, Robert Muir <[email protected]> wrote:
> On Wed, Dec 5, 2012 at 2:27 PM, Tom Burton-West <[email protected]>
> wrote:
>
> > Thanks Robert,
> >
> > I've asked our sysadmins to install a more recent Java version for
> > testing.
> > I'll report back if it fails with the newer Java version.
> >
>
> Please let us know either way!
>