I manually did CheckIndex in all indexes and found two with issues:
first
Segments file=segments_42w numSegments=21 version=FORMAT_HAS_PROX [Lucene 2.4]
1 of 21: name=_109 docCount=10410
compound=true
hasProx=true
numFiles=1
size (MB)=55,789
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [170764 terms; 6214606 terms/docs
pairs; 29694242 tokens]
test: stored fields.......OK [10410 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
2 of 21: name=_160 docCount=2178
compound=true
hasProx=true
numFiles=1
size (MB)=11,676
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [92275 terms; 1264593 terms/docs
pairs; 5807908 tokens]
test: stored fields.......OK [2178 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
3 of 21: name=_1i9 docCount=2225
compound=true
hasProx=true
numFiles=1
size (MB)=10,872
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [92284 terms; 1193562 terms/docs
pairs; 5253891 tokens]
test: stored fields.......OK [2225 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
4 of 21: name=_1qx docCount=2654
compound=true
hasProx=true
numFiles=1
size (MB)=14,374
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [95153 terms; 1623058 terms/docs
pairs; 7151784 tokens]
test: stored fields.......OK [2654 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
5 of 21: name=_248 docCount=2165
compound=true
hasProx=true
numFiles=1
size (MB)=11,446
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [92177 terms; 1243801 terms/docs
pairs; 5632974 tokens]
test: stored fields.......OK [2165 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
6 of 21: name=_2a4 docCount=2508
compound=true
hasProx=true
numFiles=1
size (MB)=13,917
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [94222 terms; 1566966 terms/docs
pairs; 6967989 tokens]
test: stored fields.......OK [2508 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
7 of 21: name=_2pu docCount=2092
compound=true
hasProx=true
numFiles=1
size (MB)=10,217
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [91592 terms; 1105091 terms/docs
pairs; 4865932 tokens]
test: stored fields.......OK [2092 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
8 of 21: name=_35q docCount=3285
compound=true
hasProx=true
numFiles=1
size (MB)=16,378
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...ERROR [term
attachments:c7a0e2a47ebad4214e933621f6c9dac7: doc 426 <= lastDoc 426]
java.lang.RuntimeException: term
attachments:c7a0e2a47ebad4214e933621f6c9dac7: doc 426 <= lastDoc 426
at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:644)
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319)
at org.getopt.luke.Luke$6.run(Unknown Source)
test: stored fields.......OK [3285 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
FAILED
WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Term Index test failed
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319)
at org.getopt.luke.Luke$6.run(Unknown Source)
9 of 21: name=_3g7 docCount=2016
compound=true
hasProx=true
numFiles=1
size (MB)=11,813
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [91094 terms; 1286115 terms/docs
pairs; 5942307 tokens]
test: stored fields.......OK [2016 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
10 of 21: name=_3o8 docCount=1702
compound=true
hasProx=true
numFiles=1
size (MB)=10,008
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [89179 terms; 1081620 terms/docs
pairs; 4960310 tokens]
test: stored fields.......OK [1702 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
11 of 21: name=_3wq docCount=2249
compound=true
hasProx=true
numFiles=1
size (MB)=13,441
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [92829 terms; 1491635 terms/docs
pairs; 6851263 tokens]
test: stored fields.......OK [2249 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
12 of 21: name=_3zd docCount=2264
compound=true
hasProx=true
numFiles=1
size (MB)=14,153
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [92747 terms; 1602790 terms/docs
pairs; 7248272 tokens]
test: stored fields.......OK [2264 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
13 of 21: name=_44r docCount=2743
compound=true
hasProx=true
numFiles=1
size (MB)=16,56
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [95135 terms; 1817602 terms/docs
pairs; 8534778 tokens]
test: stored fields.......OK [2743 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
14 of 21: name=_4fl docCount=2105
compound=true
hasProx=true
numFiles=1
size (MB)=13,041
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [91940 terms; 1447522 terms/docs
pairs; 6565862 tokens]
test: stored fields.......OK [2105 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
15 of 21: name=_4gf docCount=1891
compound=true
hasProx=true
numFiles=1
size (MB)=10,88
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [90339 terms; 1186203 terms/docs
pairs; 5392055 tokens]
test: stored fields.......OK [1891 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
16 of 21: name=_54z docCount=2007
compound=true
hasProx=true
numFiles=1
size (MB)=10,848
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [91154 terms; 1199551 terms/docs
pairs; 5284819 tokens]
test: stored fields.......OK [2007 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
17 of 21: name=_5ao docCount=530
compound=true
hasProx=true
numFiles=1
size (MB)=3,743
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [76964 terms; 311259 terms/docs
pairs; 1569162 tokens]
test: stored fields.......OK [530 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
18 of 21: name=_5am docCount=113
compound=true
hasProx=true
numFiles=1
size (MB)=2,118
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [66266 terms; 142915 terms/docs
pairs; 711223 tokens]
test: stored fields.......OK [113 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
19 of 21: name=_5ap docCount=117
compound=true
hasProx=true
numFiles=1
size (MB)=0,525
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [15930 terms; 50518 terms/docs pairs;
203306 tokens]
test: stored fields.......OK [117 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
20 of 21: name=_5ar docCount=165
compound=true
hasProx=true
numFiles=1
size (MB)=0,73
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [22756 terms; 67573 terms/docs pairs;
273697 tokens]
test: stored fields.......OK [165 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
21 of 21: name=_5at docCount=115
compound=true
hasProx=true
numFiles=1
size (MB)=0,409
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [14805 terms; 40657 terms/docs pairs;
139527 tokens]
test: stored fields.......OK [115 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
WARNING: 1 broken segments (containing 3285 documents) detected
===============================================================================================
second:
Segments file=segments_7s4 numSegments=12 version=FORMAT_HAS_PROX [Lucene 2.4]
1 of 12: name=_3hm docCount=17119
compound=true
hasProx=true
numFiles=1
size (MB)=88,715
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...ERROR [term body:mas: doc 13174 <= lastDoc 13174]
java.lang.RuntimeException: term body:mas: doc 13174 <= lastDoc 13174
at org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:644)
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319)
at org.getopt.luke.Luke$6.run(Unknown Source)
test: stored fields.......OK [17119 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
FAILED
WARNING: fixIndex() would remove reference to this segment; full exception:
java.lang.RuntimeException: Term Index test failed
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:319)
at org.getopt.luke.Luke$6.run(Unknown Source)
2 of 12: name=_4bl docCount=3302
compound=true
hasProx=true
numFiles=1
size (MB)=19,256
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [98534 terms; 2136125 terms/docs
pairs; 9970629 tokens]
test: stored fields.......OK [3302 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
3 of 12: name=_4w9 docCount=4380
compound=true
hasProx=true
numFiles=1
size (MB)=27,348
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [103306 terms; 3081204 terms/docs
pairs; 14235295 tokens]
test: stored fields.......OK [4380 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
4 of 12: name=_5jh docCount=4372
compound=true
hasProx=true
numFiles=1
size (MB)=28,867
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [103363 terms; 3187068 terms/docs
pairs; 15176139 tokens]
test: stored fields.......OK [4372 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
5 of 12: name=_6nn docCount=6526
compound=true
hasProx=true
numFiles=1
size (MB)=32,316
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [111787 terms; 3720002 terms/docs
pairs; 16710886 tokens]
test: stored fields.......OK [6526 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
6 of 12: name=_82m docCount=6889
compound=true
hasProx=true
numFiles=1
size (MB)=36,753
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [113120 terms; 4284180 terms/docs
pairs; 19188993 tokens]
test: stored fields.......OK [6889 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
7 of 12: name=_8kv docCount=2286
compound=true
hasProx=true
numFiles=1
size (MB)=13,367
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [93101 terms; 1437797 terms/docs
pairs; 6844358 tokens]
test: stored fields.......OK [2286 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
8 of 12: name=_8i0 docCount=141
compound=true
hasProx=true
numFiles=1
size (MB)=1,618
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [63287 terms; 85330 terms/docs pairs;
414225 tokens]
test: stored fields.......OK [141 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
9 of 12: name=_8kt docCount=275
compound=true
hasProx=true
numFiles=1
size (MB)=2,549
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [73391 terms; 173116 terms/docs
pairs; 847099 tokens]
test: stored fields.......OK [275 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
10 of 12: name=_8n3 docCount=103
compound=true
hasProx=true
numFiles=1
size (MB)=0,618
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [18925 terms; 50696 terms/docs pairs;
248460 tokens]
test: stored fields.......OK [103 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
11 of 12: name=_8n4 docCount=3
compound=true
hasProx=true
numFiles=1
size (MB)=1,209
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [49084 terms; 49234 terms/docs pairs;
267700 tokens]
test: stored fields.......OK [3 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
12 of 12: name=_8n5 docCount=122
compound=true
hasProx=true
numFiles=1
size (MB)=0,66
no deletions
test: open reader.........OK
test: fields..............OK [6 fields]
test: field norms.........OK [6 fields]
test: terms, freq, prox...OK [20211 terms; 54518 terms/docs pairs;
267390 tokens]
test: stored fields.......OK [122 total field count; avg 1 fields per doc]
test: term vectors........OK [0 total vector count; avg 0
term/freq vector fields per doc]
WARNING: 1 broken segments (containing 17119 documents) detected
I have not been able to reproduce this in my env.
javier
On Fri, Nov 27, 2009 at 12:23 PM, jm <[email protected]> wrote:
> Ok, I got the index from the production machine, but I am having some
> problem to find the index..., our process deals with multiple indexes,
> in the current exception I cannot see any indication about the index
> having the issue. I opened all my indexes with luke and old opened
> succesfully, some had around 4000 docs (so I can discard those I
> think), and some had around 40.000 docs, I was hoping one that had
> around 24658 docs, but I guess this is the number of docs in one
> segment, not the whole index right?
>
> Any idea to help find the index?
>
> In the codebase it would be nice to give any hint about the index
> (path in case of a fsdirectory etc) when found an exception, dont
> assume there is a single index in the application.
>
> javier
>
> On Fri, Nov 27, 2009 at 11:39 AM, Michael McCandless
> <[email protected]> wrote:
>> Also, if you're able to reproduce this, can you call
>> writer.setInfoStream and capture & post the resulting output leading
>> up to the exception?
>>
>> Mike
>>
>> On Thu, Nov 26, 2009 at 7:12 AM, jm <[email protected]> wrote:
>>> The process is still running and ops dont want to stop it. As soon as
>>> stops I'll try checkindex.
>>>
>>> Its created brand new with 2.4.1
>>>
>>> On Thu, Nov 26, 2009 at 12:42 PM, Michael McCandless
>>> <[email protected]> wrote:
>>>> I think you're using a JRE that has the fix for the issue found in
>>>> LUCENE-1282.
>>>>
>>>> Can you run CheckIndex on your index and post the output?
>>>>
>>>> Was this index created from scratch on Lucene 2.4.1? Or, created from
>>>> an earlier Lucene version?
>>>>
>>>> Mike
>>>>
>>>> On Thu, Nov 26, 2009 at 6:03 AM, jm <[email protected]> wrote:
>>>>> or are we really? I think we are on 1.6 update 14 right??
>>>>>
>>>>> sorry Im lost right now on jdk version numbering
>>>>>
>>>>> On Thu, Nov 26, 2009 at 12:01 PM, jm <[email protected]> wrote:
>>>>>> on second thought...I hadnt noticed the jdk numbers properly, we are
>>>>>> using using b28, and JDK 6 Update 10 (b28) is the one fixing this...
>>>>>>
>>>>>> ok forget this then
>>>>>> thanks!
>>>>>>
>>>>>> On Thu, Nov 26, 2009 at 11:55 AM, jm <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Dont know if this should be here or in java-dev, posting to this one
>>>>>>> first. In one of our installations, we have encountered an exception:
>>>>>>>
>>>>>>> Exception in thread "Lucene Merge Thread #0"
>>>>>>> org.apache.lucene.index.MergePolicy$MergeException:
>>>>>>> org.apache.lucene.index.CorruptIndexException: docs out of order
>>>>>>> (24658 <= 24658 )
>>>>>>> at
>>>>>>> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:309)
>>>>>>> at
>>>>>>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:286)
>>>>>>> Caused by: org.apache.lucene.index.CorruptIndexException: docs out of
>>>>>>> order (24658 <= 24658 )
>>>>>>> at
>>>>>>> org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:641)
>>>>>>> at
>>>>>>> org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:586)
>>>>>>> at
>>>>>>> org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:547)
>>>>>>> at
>>>>>>> org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:500)
>>>>>>> at
>>>>>>> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:140)
>>>>>>> at
>>>>>>> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4231)
>>>>>>> at
>>>>>>> org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3884)
>>>>>>> at
>>>>>>> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:205)
>>>>>>> at
>>>>>>> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:260)
>>>>>>>
>>>>>>> I have investigated in the list and it looked like
>>>>>>> https://issues.apache.org/jira/browse/LUCENE-1282. But we are using
>>>>>>> 2.4.1, and
>>>>>>> C:\Documents and Settings\Administrator>java -version
>>>>>>> java version "1.6.0_14"
>>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
>>>>>>> Java HotSpot(TM) Client VM (build 14.0-b16, mixed mode)
>>>>>>>
>>>>>>> java process launched like this:
>>>>>>> java -server -Xmx1536m ...
>>>>>>>
>>>>>>> So I understand this bug should not be happening??
>>>>>>>
>>>>>>> any idea?
>>>>>>> thanks
>>>>>>> javi
>>>>>>>
>>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: [email protected]
>>>>> For additional commands, e-mail: [email protected]
>>>>>
>>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: [email protected]
>>>> For additional commands, e-mail: [email protected]
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]