[ https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242531#comment-16242531 ]

Yonik Seeley commented on LUCENE-8043:
--------------------------------------

At first I thought this might be a transient issue: a reopen using the 
IndexWriter could observe intermediate state that was over the limit.  Often 
one would get "too many documents" exceptions, but once merges finished and 
the IndexWriter was closed, the index would be back under the limit.  But not 
always.  Sometimes the index is still over the limit after all indexing 
threads have stopped and we have called commit and close on the IndexWriter.  
Below is a stack trace of that case:

{code}
DONE: time in sec:6 Docs indexed:20000 ramBytesUsed: sizeInBytes:220160
FAIL: unexpected exception:
org.apache.lucene.index.CorruptIndexException: Too many documents: an index 
cannot exceed 10000 but readers have total maxDoc=10010 
(resource=BufferedChecksumIndexInput(RAMInputStream(name=segments_4)))
        at 
org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:399)
        at 
org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)
        at 
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:59)
        at 
org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:56)
        at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:667)
        at 
org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:79)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
        at YCS_IndexTest7.main(YCS_IndexTest7.java:262)
{code}
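The failure mode above (a limit check that concurrent indexing threads can slip past) is the classic check-then-act race. Below is a minimal, hypothetical model in plain Java, not Lucene code: the class name, methods, and the 10,000 limit are illustrative only. If the check and the increment are separate steps, two threads can both pass the check and together exceed the limit; a compare-and-set loop makes the reservation atomic.

{code}
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a document-count limit (not Lucene's actual code).
public class DocLimitRace {
    static final int LIMIT = 10_000;
    static final AtomicInteger numDocs = new AtomicInteger();

    // Broken: the check and the increment are separate steps, so two
    // threads can both pass the check and together push numDocs past LIMIT.
    static boolean reserveBroken(int n) {
        if (numDocs.get() + n > LIMIT) return false;  // check
        numDocs.addAndGet(n);                         // act (too late)
        return true;
    }

    // Safer: a compare-and-set loop covers check and act in one atomic
    // step, so the reservation either fully succeeds or is retried/refused.
    static boolean reserveAtomic(int n) {
        while (true) {
            int cur = numDocs.get();
            if (cur + n > LIMIT) return false;
            if (numDocs.compareAndSet(cur, cur + n)) return true;
        }
    }
}
{code}

Any accounting of this shape also has to roll the reservation back if the add later aborts; otherwise a failed add still consumes part of the budget.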

> Attempting to add documents past limit can corrupt index
> --------------------------------------------------------
>
>                 Key: LUCENE-8043
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8043
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.10
>            Reporter: Yonik Seeley
>         Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10, and we've seen this manifest in 
> 4.10).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
