Hi experts,

One of our customers has recently been seeing a recurring 
ArrayIndexOutOfBoundsException thrown from Lucene.

Our product is a full-text search engine built on top of Lucene. The stack 
trace follows. The customer says they can reproduce the issue even after 
re-indexing everything from scratch.

Caused by: java.lang.ArrayIndexOutOfBoundsException: 110
                at org.apache.lucene.codecs.lucene41.ForUtil.readBlock(ForUtil.java:196)
                at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.refillPositions(Lucene41PostingsReader.java:1284)
                at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.skipPositions(Lucene41PostingsReader.java:1505)
                at org.apache.lucene.codecs.lucene41.Lucene41PostingsReader$EverythingEnum.nextPosition(Lucene41PostingsReader.java:1548)
                at org.apache.lucene.search.spans.TermSpans.skipTo(TermSpans.java:82)
                at org.apache.lucene.search.spans.SpanScorer.advance(SpanScorer.java:63)
                at org.apache.lucene.search.ConjunctionScorer.doNext(ConjunctionScorer.java:69)
                at org.apache.lucene.search.ConjunctionScorer.nextDoc(ConjunctionScorer.java:100)
                at org.apache.lucene.search.Scorer.score(Scorer.java:64)
                at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:627)
                at com.xhive.lucene.executor.f.a(xdb:158)
                at com.xhive.lucene.executor.f.search(xdb:145)
                at com.xhive.lucene.subpath.e.a(xdb:313)
                at com.xhive.lucene.subpath.e.a(xdb:264)
                at com.xhive.lucene.subpath.e.a(xdb:183)
                at com.xhive.lucene.executor.v.executeExternally(xdb:253)
                at com.xhive.kernel.ay.externalIndexExecute(xdb:2791)
                at com.xhive.core.index.ExternalIndex.executeExternally(xdb:485)
                at com.xhive.core.index.XhiveMultiPathIndex.a(xdb:306)
                at com.xhive.xquery.pathexpr.v$a.ci(xdb:124)
                at com.xhive.xquery.pathexpr.ad$a.cp(xdb:104)
                at com.xhive.xquery.pathexpr.ax.awP(xdb:39)
                at com.xhive.xquery.pathexpr.ax.<init>(xdb:32)
                at com.xhive.xquery.pathexpr.av.a(xdb:424)
                at com.xhive.xquery.pathexpr.al$a.awk(xdb:61)
                at com.xhive.xquery.pathexpr.ag.awj(xdb:28)
                at com.xhive.xquery.pathexpr.al.Xo(xdb:26)
                at com.xhive.xquery.pathexpr.aj.<init>(xdb:33)
                at com.xhive.xquery.pathexpr.al.<init>(xdb:20)
                at com.xhive.xquery.pathexpr.av.a(xdb:462)
                at com.xhive.xquery.pathexpr.av.a(xdb:413)
                at com.xhive.xquery.pathexpr.av.a(xdb:276)
                at com.xhive.xquery.pathexpr.av.a(xdb:220)


==============================================================
The CheckIndex output for the corrupted segment follows. The full output is 
attached.


Checking consistency of: [CHECK_INDEXES_CONSISTENCY]
Library child /dpwprd/dsearch/Data/Collection2 is not in consistent state, errors report:
============================================================
Library child name=/dpwprd/dsearch/Data/Collection2 indexes consistency report.
============================================================
check external index consistency [database name: xhivedb; index name: dmftdoc; segment id: EI-0ab89c0c-2a9d-4fe2-97b9-5f0c96678f13-510173395289107-master; xhive index id id: 510173395289107]
check lucene indices
fail: lucene index LI-0001cd61-342c-4cfe-9898-c293eb1c8c09 is not consistent; Segments file=segments_2 numSegments=5 version=4.5.1 format=
  1 of 5: name=_0 docCount=8341939
    codec=Lucene45
    compound=false
    numFiles=26
    size (MB)=16,446.152
    diagnostics = {timestamp=1514627603337, mergeFactor=6, os.version=6.1, os=Windows Server 2008 R2, lucene.version=4.5.1 1533280 - mark - 2013-10-17 21:37:01, source=merge, os.arch=amd64, mergeMaxNumSegments=5, java.version=1.7.0_80, java.vendor=Oracle Corporation}
    has deletions [delGen=70]
    test: open reader.........OK [2295 deleted docs]
    test: fields..............OK [268 fields]
    test: field norms.........OK [3 fields]
    test: terms, freq, prox...ERROR: java.lang.ArrayIndexOutOfBoundsException
java.lang.ArrayIndexOutOfBoundsException
    test: stored fields.......OK [16679288 total field count; avg 2 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]
FAILED
    WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Term Index test failed
                at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:638)
                at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:372)
                at com.xhive.lucene.executor.j.a(xdb:1190)
                at com.xhive.lucene.executor.j.aY(xdb:1166)
                at com.xhive.lucene.executor.v.checkIndexConsistency(xdb:370)
                at com.xhive.kernel.ay.externalIndexCheckConsistency(xdb:2523)
                at com.xhive.kernel.bn.handleRequest(xdb:2544)
                at com.xhive.kernel.bn.run(xdb:222)
                at java.lang.Thread.run(Thread.java:745)

==============================================================

The corrupted payload stores a serialized hashmap containing several 
configurable metadata values that are used for sorting by condition.
The field holding the corrupted payload is a single-term field, so its 
postings look like a sequence of payloads.
We also store a freshness boost value in a payload on another field, which 
has no issues.
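
To make the payload layout concrete, here is a minimal sketch of the kind of 
payload bytes described above. It assumes plain Java serialization of a 
HashMap<String, String>; the actual serialization format may differ. The 
class name, method names, and map keys are hypothetical and are not taken 
from the product. It uses only the JDK, so the step that attaches the bytes 
to a Lucene token is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

public class PayloadMapSketch {

    // Serialize the metadata map into the raw bytes that would be stored
    // as one payload (assumption: standard Java object serialization).
    static byte[] toPayloadBytes(HashMap<String, String> metadata) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(metadata);
        }
        return bytes.toByteArray();
    }

    // Decode payload bytes back into the metadata map, e.g. at sort time.
    @SuppressWarnings("unchecked")
    static HashMap<String, String> fromPayloadBytes(byte[] payload)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return (HashMap<String, String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        HashMap<String, String> metadata = new HashMap<String, String>();
        metadata.put("author", "example");      // hypothetical key
        metadata.put("modified", "2018-01-01"); // hypothetical key

        byte[] payload = toPayloadBytes(metadata);
        HashMap<String, String> decoded = fromPayloadBytes(payload);

        System.out.println("payload bytes: " + payload.length);
        System.out.println("round trip ok: " + metadata.equals(decoded));
    }
}
```

Because the field has a single term, every document contributes one position 
with one such byte array, which is why the postings for that term amount to a 
long run of payloads.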

This is the first corruption report from a customer in the many years since 
we adopted Lucene 4.5.1 and released our product.

Please let me know if you have any ideas about this issue.

Thanks,
Tony Ma(马江)

  2 of 5: name=_1 docCount=2190595
    codec=Lucene45
    compound=false
    numFiles=25
    size (MB)=4,417.281
    diagnostics = {timestamp=1514771642294, mergeFactor=6, os.version=6.1, os=Windows Server 2008 R2, lucene.version=4.5.1 1533280 - mark - 2013-10-17 21:37:01, source=merge, os.arch=amd64, mergeMaxNumSegments=5, java.version=1.7.0_80, java.vendor=Oracle Corporation}
    has deletions [delGen=53]
    test: open reader.........OK [1312 deleted docs]
    test: fields..............OK [262 fields]
    test: field norms.........OK [3 fields]
    test: terms, freq, prox...OK [54401729 terms; 787177718 terms/docs pairs; 624929292 tokens]
    test (ignoring deletes): terms, freq, prox...OK [54429967 terms; 787657042 terms/docs pairs; 625313964 tokens]
    test: stored fields.......OK [4378566 total field count; avg 2 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  3 of 5: name=_2 docCount=851764
    codec=Lucene45
    compound=false
    numFiles=25
    size (MB)=1,711.874
    diagnostics = {timestamp=1514814824171, mergeFactor=9, os.version=6.1, os=Windows Server 2008 R2, lucene.version=4.5.1 1533280 - mark - 2013-10-17 21:37:01, source=merge, os.arch=amd64, mergeMaxNumSegments=5, java.version=1.7.0_80, java.vendor=Oracle Corporation}
    has deletions [delGen=63]
    test: open reader.........OK [764 deleted docs]
    test: fields..............OK [254 fields]
    test: field norms.........OK [3 fields]
    test: terms, freq, prox...OK [21881507 terms; 307102495 terms/docs pairs; 243816854 tokens]
    test (ignoring deletes): terms, freq, prox...OK [21899561 terms; 307376274 terms/docs pairs; 244037071 tokens]
    test: stored fields.......OK [1702000 total field count; avg 2 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  4 of 5: name=_3 docCount=553102
    codec=Lucene45
    compound=false
    numFiles=25
    size (MB)=1,122.343
    diagnostics = {timestamp=1514858042321, mergeFactor=3, os.version=6.1, os=Windows Server 2008 R2, lucene.version=4.5.1 1533280 - mark - 2013-10-17 21:37:01, source=merge, os.arch=amd64, mergeMaxNumSegments=5, java.version=1.7.0_80, java.vendor=Oracle Corporation}
    has deletions [delGen=70]
    test: open reader.........OK [4955 deleted docs]
    test: fields..............OK [254 fields]
    test: field norms.........OK [3 fields]
    test: terms, freq, prox...OK [14773729 terms; 197409595 terms/docs pairs; 156425802 tokens]
    test (ignoring deletes): terms, freq, prox...OK [14890759 terms; 199279426 terms/docs pairs; 157925779 tokens]
    test: stored fields.......OK [1096294 total field count; avg 2 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

  5 of 5: name=_4 docCount=805370
    codec=Lucene45
    compound=false
    numFiles=11
    size (MB)=1,651.581
    diagnostics = {timestamp=1515667638318, mergeFactor=7, os.version=6.1, os=Windows Server 2008 R2, lucene.version=4.5.1 1533280 - mark - 2013-10-17 21:37:01, source=merge, os.arch=amd64, mergeMaxNumSegments=5, java.version=1.7.0_80, java.vendor=Oracle Corporation}
    no deletions
    test: open reader.........OK
    test: fields..............OK [254 fields]
    test: field norms.........OK [3 fields]
    test: terms, freq, prox...OK [21490057 terms; 290045363 terms/docs pairs; 230248681 tokens]
    test: stored fields.......OK [1610740 total field count; avg 2 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [0 docvalues fields; 0 BINARY; 0 NUMERIC; 0 SORTED; 0 SORTED_SET]

WARNING: 1 broken segments (containing 8339644 documents) detected

FAILED

