[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-12-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277490#comment-16277490
 ] 

ASF subversion and git services commented on LUCENE-8043:

Commit 65a716911f35c304ae9da6d4ebb865509787548e in lucene-solr's branch 
refs/heads/branch_7_1 from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=65a7169 ]

LUCENE-8043: Fix document accounting in IndexWriter

The IndexWriter check for too many documents does not always work, resulting in
going over the limit. Once this happens, Lucene refuses to open the index and
throws a CorruptIndexException: Too many documents.
This change also fixes document accounting if the index writer hits an aborting
exception and/or the writer is rolled back. Pending document counts are now
consistent with the latest SegmentInfos once the writer has been rolled back.
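
For readers following along, the accounting described in this commit message can be pictured with a small standalone model: room for new documents is reserved against a hard limit before they are added, and the reservation has to be released again if a flush aborts or the writer is rolled back, otherwise the writer's view drifts away from what the segments actually contain. This is only an illustrative sketch; the class and method names below are invented and are not Lucene's internal API.

{code}
// Toy model of the pending-document accounting described above (not Lucene code).
import java.util.concurrent.atomic.AtomicLong;

final class DocLimitAccountant {
  private final long maxDocs;                       // hard per-index document limit
  private final AtomicLong pendingNumDocs = new AtomicLong();

  DocLimitAccountant(long maxDocs) {
    this.maxDocs = maxDocs;
  }

  /** Reserve room for addedDocs, failing up front instead of corrupting the index later. */
  void reserve(long addedDocs) {
    long reserved = pendingNumDocs.addAndGet(addedDocs);
    if (reserved > maxDocs) {
      pendingNumDocs.addAndGet(-addedDocs);         // undo the failed reservation
      throw new IllegalArgumentException("number of documents in the index cannot exceed " + maxDocs);
    }
  }

  /** Release a reservation, e.g. when a flush hits an aborting exception or the writer rolls back. */
  void release(long droppedDocs) {
    long now = pendingNumDocs.addAndGet(-droppedDocs);
    assert now >= 0 : "pendingNumDocs went negative: " + now;
  }

  long pendingNumDocs() {
    return pendingNumDocs.get();
  }
}
{code}

In terms of this model, the bug tracked in this issue was essentially a release that happened more than once for the same segment, leaving the pending count lower than the documents actually on disk, so later reservations were allowed past the limit and the problem only surfaced as a CorruptIndexException at open time.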


> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Fix For: master (8.0), 7.2, 7.1.1
>
> Attachments: LUCENE-8043.patch, LUCENE-8043.patch, LUCENE-8043.patch, 
> LUCENE-8043.patch, LUCENE-8043.patch, YCS_IndexTest7a.java
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-12-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277489#comment-16277489
 ] 

ASF subversion and git services commented on LUCENE-8043:

Commit 0bc07bc02a2bb5253f85bbca97041c76e4509f5f in lucene-solr's branch 
refs/heads/branch_7x from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0bc07bc ]

LUCENE-8043: Fix document accounting in IndexWriter

The IndexWriter check for too many documents does not always work, resulting in
going over the limit. Once this happens, Lucene refuses to open the index and
throws a CorruptIndexException: Too many documents.
This change also fixes document accounting if the index writer hits an aborting
exception and/or the writer is rolled back. Pending document counts are now
consistent with the latest SegmentInfos once the writer has been rolled back.


> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Fix For: master (8.0), 7.2, 7.1.1
>
> Attachments: LUCENE-8043.patch, LUCENE-8043.patch, LUCENE-8043.patch, 
> LUCENE-8043.patch, LUCENE-8043.patch, YCS_IndexTest7a.java
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-12-04 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277472#comment-16277472
 ] 

ASF subversion and git services commented on LUCENE-8043:

Commit b7d8731bbf2a9278c22efa5a7fb43285236c90ba in lucene-solr's branch 
refs/heads/master from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b7d8731 ]

LUCENE-8043: Fix document accounting in IndexWriter

The IndexWriter check for too many documents does not always work, resulting in
going over the limit. Once this happens, Lucene refuses to open the index and
throws a CorruptIndexException: Too many documents.
This change also fixes document accounting if the index writer hits an aborting
exception and/or the writer is rolled back. Pending document counts are now
consistent with the latest SegmentInfos once the writer has been rolled back.


> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Fix For: master (8.0), 7.2, 7.1.1
>
> Attachments: LUCENE-8043.patch, LUCENE-8043.patch, LUCENE-8043.patch, 
> LUCENE-8043.patch, LUCENE-8043.patch, YCS_IndexTest7a.java
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-12-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16277397#comment-16277397
 ] 

Michael McCandless commented on LUCENE-8043:


+1 to the patch!  Phew, that was tricky; thanks @simonw.

I beasted all Lucene tests 113 times and only hit 3 failures, all from LUCENE-8073.

+1 to push!

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Fix For: master (8.0), 7.2, 7.1.1
>
> Attachments: LUCENE-8043.patch, LUCENE-8043.patch, LUCENE-8043.patch, 
> LUCENE-8043.patch, LUCENE-8043.patch, YCS_IndexTest7a.java
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-12-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274607#comment-16274607
 ] 

Michael McCandless commented on LUCENE-8043:


Thanks [~simonw]; I love the new assert, and the patch looks correct to me.

I beasted all Lucene tests 33 times and hit this failure, twice:

{noformat}
ant test -Dtestcase=TestIndexWriter -Dtestmethod=testThreadInterruptDeadlock -Dtests.seed=55197CA38E8C827B

java.lang.AssertionError: pendingNumDocs 0 != 11 totalMaxDoc
    at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1277)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1319)
    at org.apache.lucene.index.TestIndexWriter$IndexerThreadInterrupt.run(TestIndexWriter.java:902)
{noformat}

But it does not reproduce for me.

I hit two other unrelated failures; they look like Similarity issues ... I'll open 
separate issues for those.

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Fix For: master (8.0), 7.2, 7.1.1
>
> Attachments: LUCENE-8043.patch, LUCENE-8043.patch, LUCENE-8043.patch, 
> LUCENE-8043.patch, YCS_IndexTest7a.java
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-12-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274259#comment-16274259
 ] 

Michael McCandless commented on LUCENE-8043:


Thanks [~simonw]; I'll look and beast the patch.

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Fix For: master (8.0), 7.2, 7.1.1
>
> Attachments: LUCENE-8043.patch, LUCENE-8043.patch, LUCENE-8043.patch, 
> YCS_IndexTest7a.java
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-30 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273711#comment-16273711
 ] 

Michael McCandless commented on LUCENE-8043:


Wow, what an evil test :)  +1 to the patch; thanks @simonw and 
[~ysee...@gmail.com]!

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Fix For: master (8.0), 7.2, 7.1.1
>
> Attachments: LUCENE-8043.patch, LUCENE-8043.patch, 
> YCS_IndexTest7a.java
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-29 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272238#comment-16272238
 ] 

Simon Willnauer commented on LUCENE-8043:

{quote}Turns out the test code that failed with a small number of updates, even after my 
attempted fix, was for 4.10.3 / 4.10.4. I forward-ported that code to master and things no 
longer fail... so I think this patch is good for recent Lucene versions. Thanks!{quote}

[~yo...@apache.org] do you have a test case I can use to verify the patch going forward? 
Can you share it? I will also try to turn your reproduction into a test case, but maybe we 
should push the fix first so it doesn't hold up the release, WDYT?

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272010#comment-16272010
 ] 

Yonik Seeley commented on LUCENE-8043:

Turns out the test code that failed with a small number of updates, even after my attempted 
fix, was for 4.10.3 / 4.10.4. I forward-ported that code to master and things no longer 
fail... so I think this patch is good for recent Lucene versions. Thanks!

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-29 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271874#comment-16271874
 ] 

Yonik Seeley commented on LUCENE-8043:

I had worked on tracking this down for a bit before I got pulled off onto something else...
I remember adding the boolean to drop() just as this patch does, but when using it I only 
put the conditional around the pendingNumDocs decrement (in multiple places).  Perhaps 
that's why it didn't fix the issue for me...

I also exposed pendingNumDocs for testing purposes and then tested it against expected 
values, and was able to get tests that reliably failed after a handful of updates.  I'll 
try digging that up and see if it passes with this patch.

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271782#comment-16271782
 ] 

Michael McCandless commented on LUCENE-8043:


Wow, nice find [~simonw]!  It is normal for drop to be called more than once, I think, so 
your fix is the right approach!  Thanks.

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
>Assignee: Simon Willnauer
> Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-29 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271767#comment-16271767
 ] 

Simon Willnauer commented on LUCENE-8043:

[~jpountz] [~yo...@apache.org] [~mikemccand] I think I found the issue. It seems like we 
try to drop the same segment reader from the reader pool multiple times while applying 
deletes, which I am not 100% sure is expected or not. Yet, because of that we also 
decrement the counter for that segment multiple times. With this patch I can run the test 
1k times without a failure. I am happy to provide a proper patch for it, but I wonder if 
this is an expected state? [~mikemccand], can you tell?

{code}
diff --git a/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java b/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
index 7f47e42d45..586a294915 100644
--- a/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
+++ b/lucene/core/src/java/org/apache/lucene/index/IndexWriter.java
@@ -551,13 +551,15 @@ public class IndexWriter implements Closeable, TwoPhaseCommit, Accountable {
       return true;
     }
 
-    public synchronized void drop(SegmentCommitInfo info) throws IOException {
+    public synchronized boolean drop(SegmentCommitInfo info) throws IOException {
       final ReadersAndUpdates rld = readerMap.get(info);
       if (rld != null) {
         assert info == rld.info;
         readerMap.remove(info);
         rld.dropReaders();
+        return true;
       }
+      return false;
     }
 
     public synchronized long ramBytesUsed() {
@@ -1616,10 +1618,9 @@ public class IndexWriter implements Closeable, TwoPhaseCommit, Accountable {
         // segment, we leave it in the readerPool; the
         // merge will skip merging it and will then drop
         // it once it's done:
-        if (mergingSegments.contains(info) == false) {
+        if (mergingSegments.contains(info) == false && readerPool.drop(info)) {
           segmentInfos.remove(info);
           pendingNumDocs.addAndGet(-info.info.maxDoc());
-          readerPool.drop(info);
         }
       }
{code}
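
To make the effect of this change concrete, here is a deliberately simplified, self-contained model of the pattern in the diff: drop stays safe to call more than once, but it reports whether it actually removed something, and the caller only adjusts the pending document count when it did. The names below are invented for illustration and are not Lucene's API.

{code}
// Simplified model of the "drop reports whether it removed anything" pattern (not Lucene code).
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

final class ReaderPoolModel {
  private final Map<String, Integer> readerMap = new HashMap<>();   // segment name -> maxDoc
  private final AtomicLong pendingNumDocs = new AtomicLong();

  synchronized void add(String segment, int maxDoc) {
    readerMap.put(segment, maxDoc);
    pendingNumDocs.addAndGet(maxDoc);
  }

  /** Safe to call repeatedly for the same segment; only the first call removes it. */
  synchronized boolean drop(String segment) {
    return readerMap.remove(segment) != null;
  }

  /** Caller-side pattern from the patch: decrement only when drop() really removed the reader. */
  synchronized void dropFullyDeletedSegment(String segment, int maxDoc) {
    if (drop(segment)) {
      pendingNumDocs.addAndGet(-maxDoc);
    }
    // Before the fix the decrement was unconditional, so dropping the same segment twice
    // pushed pendingNumDocs below the number of documents actually in the index.
  }

  long pendingNumDocs() {
    return pendingNumDocs.get();
  }
}
{code}

With this model, calling dropFullyDeletedSegment twice for the same segment is a no-op the second time, which mirrors why guarding both the segmentInfos removal and the counter update on the boolean return value closes the accounting hole.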

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
> Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-29 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270441#comment-16270441
 ] 

Simon Willnauer commented on LUCENE-8043:

[~jpountz] I can look at this later and try to reproduce it.

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
> Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-28 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269209#comment-16269209
 ] 

Adrien Grand commented on LUCENE-8043:

I can reproduce this, but I'm not familiar enough with IndexWriter to understand what 
causes it. At first I thought that maybe this was due to the fact that we were giving 
documents back too early after merges, but actually we do that after updating the list of 
segment infos, so that looks ok to me. Yet this doesn't prevent the list of segment infos 
from reaching more than MAX_DOCS documents in {{IndexWriter.publishFlushedSegment}} during 
the test. [~simonwillnauer] or [~mikemccand], do you know why this may occur?

I wanted to look at the IW info stream to better understand what is happening, but 
unfortunately that probably slows things down enough to prevent the issue from 
reproducing. It reproduces with assertions enabled ({{-ea}}), but no assertion trips.

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10, 7.0, master (8.0)
>Reporter: Yonik Seeley
> Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 






[jira] [Commented] (LUCENE-8043) Attempting to add documents past limit can corrupt index

2017-11-07 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16242531#comment-16242531
 ] 

Yonik Seeley commented on LUCENE-8043:

At first I thought it might be more of a transient issue with reopening via the IW and 
seeing intermediate state that could be over the limit.  It was often the case that one 
could get exceptions about too many docs, but then after merges finished and the IW was 
closed, we would be back under the limit.  But not always.  Sometimes we are still over 
the limit after all threads have been stopped and we've called commit and close on the 
IndexWriter.  Below is a stack trace of that case:

{code}
DONE: time in sec:6 Docs indexed:2 ramBytesUsed: sizeInBytes:220160
FAIL: unexpected exception:
org.apache.lucene.index.CorruptIndexException: Too many documents: an index cannot exceed 1 but readers have total maxDoc=10010 (resource=BufferedChecksumIndexInput(RAMInputStream(name=segments_4)))
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:399)
    at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:288)
    at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:59)
    at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:56)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:667)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:79)
    at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
    at YCS_IndexTest7.main(YCS_IndexTest7.java:262)
{code}
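
The stack trace above comes from the open-time check: when a commit point is read, the per-segment maxDoc values are summed and the open is rejected if the total exceeds the limit, which is why an index that slipped past the writer-side check can no longer be opened at all. The snippet below is only a rough sketch of that kind of validation, not the actual SegmentInfos.readCommit code.

{code}
// Rough sketch of an open-time "too many documents" validation (not the actual SegmentInfos code).
import java.util.List;

final class CommitDocCountCheck {
  /** Throws if the segments listed in a commit hold more documents than the index may contain. */
  static void checkTotalMaxDoc(List<Integer> perSegmentMaxDoc, long maxDocsLimit) {
    long totalMaxDoc = 0;
    for (int maxDoc : perSegmentMaxDoc) {
      totalMaxDoc += maxDoc;
    }
    if (totalMaxDoc > maxDocsLimit) {
      // Lucene reports this case as a CorruptIndexException ("Too many documents: ...").
      throw new IllegalStateException(
          "Too many documents: an index cannot exceed " + maxDocsLimit
              + " but readers have total maxDoc=" + totalMaxDoc);
    }
  }
}
{code}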

> Attempting to add documents past limit can corrupt index
> 
>
> Key: LUCENE-8043
> URL: https://issues.apache.org/jira/browse/LUCENE-8043
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.10
>Reporter: Yonik Seeley
> Attachments: LUCENE-8043.patch
>
>
> The IndexWriter check for too many documents does not always work, resulting 
> in going over the limit.  Once this happens, Lucene refuses to open the index 
> and throws a CorruptIndexException: Too many documents.
> This appears to affect all versions of Lucene/Solr (the check was first 
> implemented in LUCENE-5843 in v4.9.1/4.10 and we've seen this manifest in 
> 4.10) 


