Hi,

I tried the CheckIndex tool (from the latest svn code) and was surprised to see that it reports all of my production indexes (around 30) as corrupt. This seems highly unlikely: they have been running for about a year, and I have not seen any exceptions during search so far.
One recurring pattern I observed is that the tool reports the segments with deleted docs as corrupt. The ones without deleted docs are fine. Here is some sample output.

Index 1:

  6 of 7: name=_wxlp docCount=1001
    compound=true
    numFiles=1
    size (MB)=0.213
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [12 fields]
    test: terms, freq, prox...OK [4142 terms; 8004 terms/docs pairs; 8006 tokens]
    test: stored fields.......OK [12012 total field count; avg 12 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

  7 of 7: name=_wxqg docCount=178
    compound=true
    numFiles=1
    size (MB)=0.039
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [12 fields]
    test: terms, freq, prox...OK [819 terms; 1417 terms/docs pairs; 1417 tokens]
    test: stored fields.......OK [2136 total field count; avg 12 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

Index 2:

  6 of 7: name=_10hr docCount=1978
    compound=true
    numFiles=2
    size (MB)=3.601
    has deletions [delFileName=_10hr_5.del]
    test: open reader.........OK [17 deleted docs]
    test: fields, norms.......OK [10 fields]
    test: terms, freq, prox...FAILED
    WARNING: would remove reference to this segment (-fix was not specified); full exception:
    java.lang.RuntimeException: term ASIN:342678033X docFreq=5 != num docs seen 4
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:217)

  7 of 7: name=_10i0 docCount=196
    compound=true
    numFiles=1
    size (MB)=0.44
    no deletions
    test: open reader.........OK
    test: fields, norms.......OK [10 fields]
    test: terms, freq, prox...OK [8611 terms; 24307 terms/docs pairs; 32841 tokens]
    test: stored fields.......OK [1960 total field count; avg 10 fields per doc]
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]

Is this a known issue, or are my indexes really corrupt?

Regards,
Bogdan