[jira] [Comment Edited] (LUCENE-9621) pendingNumDocs doesn't match totalMaxDoc if tragedy on flush()
[ https://issues.apache.org/jira/browse/LUCENE-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248133#comment-17248133 ] Michael Froh edited comment on LUCENE-9621 at 12/11/20, 7:21 PM: - Regarding the assertion failure, it looks like the call to {{adjustPendingNumDocs}} in {{rollbackInternalNoCommit}} is being called with 0 (as {{totalMaxDoc}} and {{rollbackMaxDoc}} are both 0). It feels to me like when we roll back on tragedy, the {{IndexWriter}} is known to be in a bad state, so it's not really surprising that {{pendingNumDocs}} and {{segmentInfos.totalMaxDoc()}} are out of sync. Maybe the fix is to skip that assertion when called from {{maybeCloseOnTragicEvent}}, so that it doesn't mask the real tragedy? was (Author: msfroh): Regarding the assertion failure, it looks like the call to {{adjustPendingNumDocs}} in {{rollbackInternalNoCommit}} is being called with 0 (as {{totalMaxDoc}} and {{rollbackMaxDoc}} are both 0). It feels to me like when we roll back on tragedy, the {{IndexWriter}} is known to be in a bad state, so it's not really surprising that {{pendingNumDocs}} and {{segmentInfos.totalMaxDoc()}} are out of sync.
Maybe the fix is to skip that assertion when called from {{maybeCloseOnTragicEvent}}, so that it doesn't mask the real tragedy? > pendingNumDocs doesn't match totalMaxDoc if tragedy on flush() > -- > > Key: LUCENE-9621 > URL: https://issues.apache.org/jira/browse/LUCENE-9621 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 8.6.3 >Reporter: Michael Froh >Priority: Major > > While implementing a test to trigger an OutOfMemoryError on flush() in > https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was > followed by an assertion failure on rollback with the following stacktrace: > {code:java} > java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc > at > __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) > at > org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) > at > org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) > at > org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) > at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) > at > org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496) > {code} > We should probably look into how exactly we behave with this kind of tragedy > on flush(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
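The proposal above — skip the doc-count assertion when rollback is driven by a tragic event, so the original exception isn't masked — can be sketched with a toy model. This is a hypothetical simplification, not actual Lucene code; the class and flag names are invented for illustration:

```java
// Toy model of the proposed behavior: on a "clean" rollback the
// pendingNumDocs == totalMaxDoc invariant is enforced, but after a
// tragedy the writer is already known to be inconsistent, so the
// assertion is skipped rather than masking the real failure.
class MiniWriter {
    long pendingNumDocs = 1; // docs were buffered when the tragedy struck
    long totalMaxDoc = 0;    // but no segment was ever written

    void rollback(boolean fromTragicEvent) {
        if (!fromTragicEvent) {
            assert pendingNumDocs == totalMaxDoc
                : "pendingNumDocs " + pendingNumDocs + " != " + totalMaxDoc + " totalMaxDoc";
        }
        // discard all pending state either way
        pendingNumDocs = 0;
        totalMaxDoc = 0;
    }
}

public class TragicRollbackSketch {
    public static void main(String[] args) {
        MiniWriter w = new MiniWriter();
        w.rollback(true); // simulates maybeCloseOnTragicEvent: must not trip the assertion
        System.out.println("rolled back after tragedy: pendingNumDocs=" + w.pendingNumDocs);
    }
}
```

The point of the sketch is only that the caller, not the rollback path itself, decides whether the invariant is checkable.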
[jira] [Commented] (LUCENE-9621) pendingNumDocs doesn't match totalMaxDoc if tragedy on flush()
[ https://issues.apache.org/jira/browse/LUCENE-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248133#comment-17248133 ] Michael Froh commented on LUCENE-9621: -- Regarding the assertion failure, it looks like the call to {{adjustPendingNumDocs}} in {{rollbackInternalNoCommit}} is being called with 0 (as {{totalMaxDoc}} and {{rollbackMaxDoc}} are both 0). It feels to me like when we roll back on tragedy, the {{IndexWriter}} is known to be in a bad state, so it's not really surprising that {{pendingNumDocs}} and {{segmentInfos.totalMaxDoc()}} are out of sync. Maybe the fix is to skip that assertion when called from {{maybeCloseOnTragicEvent}}, so that it doesn't mask the real tragedy? > pendingNumDocs doesn't match totalMaxDoc if tragedy on flush() > -- > > Key: LUCENE-9621 > URL: https://issues.apache.org/jira/browse/LUCENE-9621 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 8.6.3 >Reporter: Michael Froh >Priority: Major > > While implementing a test to trigger an OutOfMemoryError on flush() in > https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was > followed by an assertion failure on rollback with the following stacktrace: > {code:java} > java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc > at > __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) > at > org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) > at > org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) > at > org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) > at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) > at > org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496) > {code} > We should probably look 
into how exactly we behave with this kind of tragedy > on flush(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-9621) pendingNumDocs doesn't match totalMaxDoc if tragedy on flush()
[ https://issues.apache.org/jira/browse/LUCENE-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248119#comment-17248119 ] Michael Froh edited comment on LUCENE-9621 at 12/11/20, 6:55 PM: - I added a {{printStackTrace}} to {{onTragicEvent}} and got the following: {code:java} java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.index.FieldInfos.(FieldInfos.java:125) at org.apache.lucene.index.FieldInfos$Builder.finish(FieldInfos.java:645) at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:291) at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:480) at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:660) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3899) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:499) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:942) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:978) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:992) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:819) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:470) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:951) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:887) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:898) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) {code} This is the leak that I called out and fixed in https://issues.apache.org/jira/browse/LUCENE-9617. If we add documents and call {{deleteAll}} on the same {{IndexWriter}} repeatedly, it leaks field numbers and tries allocating a huge array in {{FieldInfos}} to accommodate the largest known field number. 
was (Author: msfroh): I added a {{printStackTrace}} to {{onTragicEvent}} and got the following: {code:java} java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.index.FieldInfos.(FieldInfos.java:125) at org.apache.lucene.index.FieldInfos$Builder.finish(FieldInfos.java:645) at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:291) at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:480) at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:660) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3899) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:499) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at
[jira] [Commented] (LUCENE-9621) pendingNumDocs doesn't match totalMaxDoc if tragedy on flush()
[ https://issues.apache.org/jira/browse/LUCENE-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17248119#comment-17248119 ] Michael Froh commented on LUCENE-9621: -- I added a {{printStackTrace}} to {{onTragicEvent}} and got the following: {code:java} java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.index.FieldInfos.(FieldInfos.java:125) at org.apache.lucene.index.FieldInfos$Builder.finish(FieldInfos.java:645) at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:291) at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:480) at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:660) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3899) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:499) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:942) at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:978) at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:992) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:370) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:819) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:470) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:951) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:887) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:898) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) {code} This is the leak that I called out and fixed in https://issues.apache.org/jira/browse/LUCENE-9617. If we call {{deleteAll}} on the same {{IndexWriter}} repeatedly, it leaks field numbers and tries allocating a huge array in {{FieldInfos}} to accommodate the largest known field number. 
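To see why leaked field numbers translate into an OOME, note that (per the description above) {{FieldInfos}} allocates a dense array indexed by field number, so the allocation is driven by the *largest* field number ever assigned, not by the count of live fields. The arithmetic can be sketched as follows; the method name is hypothetical and the 8-bytes-per-slot figure assumes one reference per slot on a 64-bit JVM:

```java
// Illustrative arithmetic only -- not the real FieldInfos internals.
// A dense number -> FieldInfo table needs one slot per possible field
// number up to the largest one ever assigned.
public class FieldNumberArraySketch {
    static long bytesForByNumberTable(int largestFieldNumber) {
        // one slot per field number 0..largestFieldNumber, 8 bytes per reference
        return (long) (largestFieldNumber + 1) * 8;
    }

    public static void main(String[] args) {
        // 10 live fields numbered 0..9: a tiny table
        System.out.println(bytesForByNumberTable(9)); // 80

        // After many add/deleteAll cycles the counter never resets, so the
        // same 10 live fields might carry numbers around one billion:
        System.out.println(bytesForByNumberTable(1_000_000_009)); // 8000000080 (~8 GB)
    }
}
```

That ~8 GB allocation attempt is consistent with the {{OutOfMemoryError: Java heap space}} thrown from the {{FieldInfos}} constructor in the stacktrace above.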
> pendingNumDocs doesn't match totalMaxDoc if tragedy on flush() > -- > > Key: LUCENE-9621 > URL: https://issues.apache.org/jira/browse/LUCENE-9621 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 8.6.3 >Reporter: Michael Froh >Priority: Major > > While implementing a test to trigger an OutOfMemoryError on flush() in > https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was > followed by an assertion failure on rollback with the following stacktrace: > {code:java} > java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc > at > __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) > at > org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) > at > org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) > at > org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) > at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) > at
[jira] [Updated] (LUCENE-9621) pendingNumDocs doesn't match totalMaxDoc if tragedy on flush()
[ https://issues.apache.org/jira/browse/LUCENE-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Froh updated LUCENE-9621: - Description: While implementing a test to trigger an OutOfMemoryError on flush() in https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was followed by an assertion failure on rollback with the following stacktrace: {{java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc at __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) at org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) at org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496)}} We should probably look into how exactly we behave with this kind of tragedy on flush(). 
was: While implementing a test to trigger an OutOfMemoryError on flush() in https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was followed by an assertion failure on rollback with the following stacktrace: {{ java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc at __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) at org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) at org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496) }} We should probably look into how exactly we behave with this kind of tragedy on flush(). 
> pendingNumDocs doesn't match totalMaxDoc if tragedy on flush() > -- > > Key: LUCENE-9621 > URL: https://issues.apache.org/jira/browse/LUCENE-9621 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 8.6.3 >Reporter: Michael Froh >Priority: Major > > While implementing a test to trigger an OutOfMemoryError on flush() in > https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was > followed by an assertion failure on rollback with the following stacktrace: > {{java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc > at > __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) > at > org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) > at > org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) > at > org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) > at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) > at > org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496)}} > We should probably look into how exactly we behave with this kind of tragedy > on flush(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9621) pendingNumDocs doesn't match totalMaxDoc if tragedy on flush()
[ https://issues.apache.org/jira/browse/LUCENE-9621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Froh updated LUCENE-9621: - Description: While implementing a test to trigger an OutOfMemoryError on flush() in https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was followed by an assertion failure on rollback with the following stacktrace: {code:java} java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc at __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) at org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) at org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496) {code} We should probably look into how exactly we behave with this kind of tragedy on flush(). 
was: While implementing a test to trigger an OutOfMemoryError on flush() in https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was followed by an assertion failure on rollback with the following stacktrace: {{java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc at __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) at org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) at org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496)}} We should probably look into how exactly we behave with this kind of tragedy on flush(). 
> pendingNumDocs doesn't match totalMaxDoc if tragedy on flush() > -- > > Key: LUCENE-9621 > URL: https://issues.apache.org/jira/browse/LUCENE-9621 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: 8.6.3 >Reporter: Michael Froh >Priority: Major > > While implementing a test to trigger an OutOfMemoryError on flush() in > https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was > followed by an assertion failure on rollback with the following stacktrace: > {code:java} > java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc > at > __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) > at > org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) > at > org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) > at > org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) > at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) > at > org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496) > {code} > We should probably look into how exactly we behave with this kind of tragedy > on flush(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9621) pendingNumDocs doesn't match totalMaxDoc if tragedy on flush()
Michael Froh created LUCENE-9621: Summary: pendingNumDocs doesn't match totalMaxDoc if tragedy on flush() Key: LUCENE-9621 URL: https://issues.apache.org/jira/browse/LUCENE-9621 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 8.6.3 Reporter: Michael Froh While implementing a test to trigger an OutOfMemoryError on flush() in https://github.com/apache/lucene-solr/pull/2088, I noticed that the OOME was followed by an assertion failure on rollback with the following stacktrace: {{ java.lang.AssertionError: pendingNumDocs 1 != 0 totalMaxDoc at __randomizedtesting.SeedInfo.seed([ABBF17C4E0FCDEE5:DDC8E99910AFC8FF]:0) at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2398) at org.apache.lucene.index.IndexWriter.maybeCloseOnTragicEvent(IndexWriter.java:5196) at org.apache.lucene.index.IndexWriter.tragicEvent(IndexWriter.java:5186) at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3932) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3874) at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3853) at org.apache.lucene.index.TestIndexWriterDelete.testDeleteAllRepeated(TestIndexWriterDelete.java:496) }} We should probably look into how exactly we behave with this kind of tragedy on flush(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9617) FieldNumbers.clear() should reset lowestUnassignedFieldNumber
Michael Froh created LUCENE-9617: Summary: FieldNumbers.clear() should reset lowestUnassignedFieldNumber Key: LUCENE-9617 URL: https://issues.apache.org/jira/browse/LUCENE-9617 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 8.7 Reporter: Michael Froh A call to IndexWriter.deleteAll() should completely reset the state of the index. Part of that is a call to globalFieldNumbersMap.clear(), which purges all knowledge of fields by clearing name -> number and number -> name maps. However, it does not reset lowestUnassignedFieldNumber. If we have a loop that adds some documents, calls deleteAll(), adds more documents, etc., lowestUnassignedFieldNumber keeps counting up. Since FieldInfos allocates an array for number -> FieldInfo, this array will get larger and larger, effectively leaking memory. We can fix this by resetting lowestUnassignedFieldNumber to -1 in FieldNumbers.clear(). I'll write a unit test and attach a patch. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
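The bug and fix described above can be modeled with a few lines. This is a simplified sketch, not the real {{FieldNumbers}} class inside {{FieldInfos}} (the real one also tracks number -> name and per-field schema state); the one-line reset in {{clear()}} is the fix:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the leak: field numbers are handed out from a
// monotonically increasing counter, and clear() used to forget the
// names but not the counter.
class FieldNumbers {
    private final Map<String, Integer> nameToNumber = new HashMap<>();
    private int lowestUnassignedFieldNumber = -1;

    int addOrGet(String fieldName) {
        // assign the next number the first time we see a field name
        return nameToNumber.computeIfAbsent(fieldName, k -> ++lowestUnassignedFieldNumber);
    }

    void clear() {
        nameToNumber.clear();
        lowestUnassignedFieldNumber = -1; // the fix: without this, numbers keep growing forever
    }
}

public class FieldNumbersResetSketch {
    public static void main(String[] args) {
        FieldNumbers fn = new FieldNumbers();
        for (int cycle = 0; cycle < 1000; cycle++) {
            fn.addOrGet("title");
            fn.addOrGet("body");
            fn.clear(); // stands in for IndexWriter.deleteAll()
        }
        // With the reset, numbering restarts at 0 after every clear();
        // without it, "title" would now get number 2000.
        System.out.println(fn.addOrGet("title")); // 0
    }
}
```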
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17184699#comment-17184699 ] Michael Froh commented on LUCENE-8962: -- 8.6 added the ability to merge small segments on commit. The more recent changes add the ability to merge on getReader calls (which is what the original issue was asking for -- merging on commit was a slightly easier step on the way there). > Can we merge small segments during refresh, for faster searching? > - > > Key: LUCENE-8962 > URL: https://issues.apache.org/jira/browse/LUCENE-8962 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Michael McCandless >Priority: Major > Fix For: master (9.0), 8.6 > > Attachments: LUCENE-8962_demo.png, failed-tests.patch, > failure_log.txt, test.diff > > Time Spent: 31h > Remaining Estimate: 0h > > With near-real-time search we ask {{IndexWriter}} to write all in-memory > segments to disk and open an {{IndexReader}} to search them, and this is > typically a quick operation. > However, when you use many threads for concurrent indexing, {{IndexWriter}} > will accumulate many small segments during {{refresh}} and this then > adds search-time cost as searching must visit all of these tiny segments. > The merge policy would normally quickly coalesce these small segments if > given a little time ... so, could we somehow improve {{IndexWriter}}'s > refresh to optionally kick off merge policy to merge segments below some > threshold before opening the near-real-time reader? It'd be a bit tricky > because while we are waiting for merges, indexing may continue, and new > segments may be flushed, but those new segments shouldn't be included in the > point-in-time segments returned by refresh ... 
> One could almost do this on top of Lucene today, with a custom merge policy, > and some hackity logic to have the merge policy target small segments just > written by refresh, but it's tricky to then open a near-real-time reader, > excluding newly flushed but including newly merged segments since the refresh > originally finished ... > I'm not yet sure how best to solve this, so I wanted to open an issue for > discussion! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053640#comment-17053640 ] Michael Froh commented on LUCENE-8962: -- bq. With a slightly refactored IW we can share the merge logic and let the reader re-write itself since we are talking about very small segments the overhead is very small. This would in turn mean that we are doing the work twice ie. the IW would do its normal work and might merge later etc. Just to provide a bit more context, for the case where my team uses this change, we're replicating the index (think Solr master/slave) from "writers" to many "searchers", so we're avoiding doing the work many times. An earlier (less invasive) approach I tried to address the small flushed segments problem was roughly: call commit on writer, hard link the commit files to another filesystem directory to "clone" the index, open an IW on that directory, merge small segments on the clone, let searchers replicate from the clone. That approach does mean that the merging work happens twice (since the "real" index doesn't benefit from the merge on the clone), but it doesn't involve any changes in Lucene. Maybe that less-invasive approach is a better way to address this. It's certainly more consistent with [~simonw]'s suggestion above. > Can we merge small segments during refresh, for faster searching? > - > > Key: LUCENE-8962 > URL: https://issues.apache.org/jira/browse/LUCENE-8962 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: Michael McCandless >Priority: Major > Fix For: 8.5 > > Attachments: LUCENE-8962_demo.png > > Time Spent: 9.5h > Remaining Estimate: 0h > > With near-real-time search we ask {{IndexWriter}} to write all in-memory > segments to disk and open an {{IndexReader}} to search them, and this is > typically a quick operation. 
> However, when you use many threads for concurrent indexing, {{IndexWriter}} > will accumulate write many small segments during {{refresh}} and this then > adds search-time cost as searching must visit all of these tiny segments. > The merge policy would normally quickly coalesce these small segments if > given a little time ... so, could we somehow improve {{IndexWriter'}}s > refresh to optionally kick off merge policy to merge segments below some > threshold before opening the near-real-time reader? It'd be a bit tricky > because while we are waiting for merges, indexing may continue, and new > segments may be flushed, but those new segments shouldn't be included in the > point-in-time segments returned by refresh ... > One could almost do this on top of Lucene today, with a custom merge policy, > and some hackity logic to have the merge policy target small segments just > written by refresh, but it's tricky to then open a near-real-time reader, > excluding newly flushed but including newly merged segments since the refresh > originally finished ... > I'm not yet sure how best to solve this, so I wanted to open an issue for > discussion! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
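The "clone by hard link" step in the less-invasive approach described above (commit, hard-link the commit files into a second directory, then merge and replicate from the clone) can be sketched with plain {{java.nio.file}}. This is an illustrative sketch, not code from the PR: it assumes both directories are on the same filesystem (a hard-link requirement) and that the index is not being modified while linking — in practice you'd link only the files of a specific {{IndexCommit}}:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Clone an index directory by hard-linking its files: no data is copied,
// each linked name shares the same inode as the original, so the clone
// can be merged independently without touching the source commit's bytes.
public class HardLinkCloneSketch {
    static void cloneCommit(Path indexDir, Path cloneDir) throws IOException {
        Files.createDirectories(cloneDir);
        try (Stream<Path> files = Files.list(indexDir)) {
            for (Path src : (Iterable<Path>) files::iterator) {
                if (Files.isRegularFile(src)) {
                    Files.createLink(cloneDir.resolve(src.getFileName()), src);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempDirectory("index");
        Files.writeString(src.resolve("_0.cfs"), "segment data"); // stand-in for a real segment file
        Path clone = src.resolveSibling(src.getFileName() + "-clone");
        cloneCommit(src, clone);
        // the clone sees the same bytes without any copy having happened
        System.out.println(Files.readString(clone.resolve("_0.cfs")));
    }
}
```

Because segment files are write-once, merging on the clone then deleting the merged-away links never disturbs the source index, which is what makes this approach possible without Lucene changes.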
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051571#comment-17051571 ] Michael Froh commented on LUCENE-8962: -- I updated https://github.com/apache/lucene-solr/pull/1313 with that proposed fix (adding a {{boolean}} field to {{OneMerge}} that gets set once a merge is successfully committed).

> Can we merge small segments during refresh, for faster searching?
> -----------------------------------------------------------------
>
> Key: LUCENE-8962
> URL: https://issues.apache.org/jira/browse/LUCENE-8962
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Reporter: Michael McCandless
> Priority: Major
> Fix For: 8.5
>
> Attachments: LUCENE-8962_demo.png
>
> Time Spent: 6.5h
> Remaining Estimate: 0h
>
> With near-real-time search we ask {{IndexWriter}} to write all in-memory segments to disk and open an {{IndexReader}} to search them, and this is typically a quick operation.
> However, when you use many threads for concurrent indexing, {{IndexWriter}} will accumulate many small segments during {{refresh}}, and this then adds search-time cost as searching must visit all of these tiny segments.
> The merge policy would normally quickly coalesce these small segments if given a little time ... so, could we somehow improve {{IndexWriter}}'s refresh to optionally kick off the merge policy to merge segments below some threshold before opening the near-real-time reader? It'd be a bit tricky because while we are waiting for merges, indexing may continue, and new segments may be flushed, but those new segments shouldn't be included in the point-in-time segments returned by refresh ...
> One could almost do this on top of Lucene today, with a custom merge policy, and some hackity logic to have the merge policy target small segments just written by refresh, but it's tricky to then open a near-real-time reader, excluding newly flushed but including newly merged segments since the refresh originally finished ...
> I'm not yet sure how best to solve this, so I wanted to open an issue for discussion!

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051558#comment-17051558 ] Michael Froh edited comment on LUCENE-8962 at 3/4/20, 7:32 PM: --- It's not immediately obvious to me how to fix the failure on {{TestIndexWriterExceptions2}}. A merge on commit fails (because it's using {{CrankyCodec}}), closing the merge readers, which calls the custom {{mergeFinished}} override, which assumes the merge completed (since it wasn't aborted) and tries to reference the files for the merged segment (to increment their reference counts). That triggers an {{IllegalStateException}} because the files weren't set (we didn't get that far in the merge). Unfortunately, stepping through the debugger, I don't see a clear way of telling in {{mergeFinished}} that a merge failed. Obviously, I could wrap the call to {{SegmentCommitInfo.files()}} in a try-catch and assume that the {{IllegalStateException}} means the merge failed, but that would fail to properly handle the case where, say, an {{IOException}} occurred when committing the merge (after {{SegmentInfo.setFiles()}} was called, but before the files were actually written to disk). I'm thinking of adding a {{boolean}} field to {{OneMerge}} that gets set once a merge is successfully committed (e.g. just before the call to {{closeMergeReaders}} in {{IndexWriter.commitMerge()}}), which the {{mergeFinished}} override can use to determine whether the merge completed successfully.
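The flag-on-successful-commit idea can be sketched without Lucene on the classpath. This is a minimal stand-in, not the actual {{OneMerge}} API; the class and method names below are illustrative only:

```java
import java.util.List;

// Minimal stand-in for the proposed OneMerge change: a flag that is set only
// once the merged segment has been committed, so a mergeFinished() hook can
// distinguish a completed merge from one that failed part-way through.
class OneMergeSketch {
    private volatile boolean committed = false; // set just before closing merge readers
    private List<String> mergedFiles = null;    // null until setFiles() is called

    void setFiles(List<String> files) { this.mergedFiles = files; }

    // Called by the writer once the merged segment is durably committed
    // (in the PR's terms: just before closeMergeReaders() in IndexWriter.commitMerge()).
    void markCommitted() { this.committed = true; }

    boolean hasCommittedSuccessfully() { return committed; }

    // files() throws if the merge never got far enough to set them, which is
    // exactly the IllegalStateException the test tripped over.
    List<String> files() {
        if (mergedFiles == null) {
            throw new IllegalStateException("files were never set");
        }
        return mergedFiles;
    }
}
```

The point of the flag over a try-catch is that it also stays false when an I/O failure happens after {{setFiles}} but before the commit lands, which the exception-based check would miss.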
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17050607#comment-17050607 ] Michael Froh commented on LUCENE-8962: -- I ended up splitting testMergeOnCommit into two test cases. One runs through the basic invariants on a single thread and confirms that everything behaves as expected. The other tries indexing and committing from multiple threads, but doesn't really make any assumptions about the segment topology in the end (since randomness and concurrency can lead to all kinds of possible valid segment counts). Instead it just verifies that it doesn't fail and doesn't lose any documents. https://github.com/apache/lucene-solr/pull/1313
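The multithreaded test's core invariant (no documents lost, regardless of how the work was batched or merged) can be illustrated with a plain-Java sketch; the executor and counter here are stand-ins for the real writer threads, not the test's actual code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Each "writer" thread adds documents concurrently; at the end we assert only
// the total count, not the final segment topology.
class NoDocsLostSketch {
    static long run(int threads, int docsPerThread) {
        AtomicLong indexed = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    indexed.incrementAndGet(); // stands in for writer.addDocument(...)
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writer threads did not finish");
            }
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return indexed.get();
    }
}
```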
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049700#comment-17049700 ] Michael Froh commented on LUCENE-8962: -- Posted a PR with fixes for the above test failures: [https://github.com/apache/lucene-solr/pull/1307]
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049666#comment-17049666 ] Michael Froh commented on LUCENE-8962: -- I was able to reproduce the {{testMergeOnCommit}} failure on master sometimes with the following options: {{-Dtestcase=TestIndexWriterMergePolicy -Dtests.method=testMergeOnCommit -Dtests.seed=F8DD5AD20994FDDF -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=fi-FI -Dtests.timezone=America/Danmarkshavn -Dtests.asserts=true -Dtests.file.encoding=UTF-8}}
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049651#comment-17049651 ] Michael Froh commented on LUCENE-8962: -- Regarding {{TestIndexWriter.testThreadInterruptDeadlock}}, I think that's a bug in the implementation. When waiting for merges to complete, I added a {{catch}} for {{InterruptedException}} that sets the interrupt flag and throws an {{IOException}}. The documented behavior of {{IndexWriter}} is to clear the interrupt flag and throw {{ThreadInterruptedException}}. Again, not sure why the tests on master didn't fail. Maybe we just got lucky with the branch_8x tests.
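The documented convention (surface the interruption as an unchecked {{ThreadInterruptedException}} with the interrupt status cleared, rather than re-setting the flag and throwing {{IOException}}) looks roughly like this; {{ThreadInterruptedException}} is stubbed below so the sketch stands alone, and the method name is illustrative:

```java
// Stand-in for org.apache.lucene.util.ThreadInterruptedException (unchecked).
class ThreadInterruptedExceptionSketch extends RuntimeException {
    ThreadInterruptedExceptionSketch(InterruptedException cause) { super(cause); }
}

class InterruptHandlingSketch {
    // Wait for "merges" on a monitor. Object.wait() has already cleared the
    // thread's interrupt status by the time it throws InterruptedException,
    // so rethrowing as an unchecked exception leaves the flag cleared --
    // matching the documented IndexWriter behavior.
    static void waitForMerges(Object lock, long millis) {
        synchronized (lock) {
            try {
                lock.wait(millis);
            } catch (InterruptedException ie) {
                throw new ThreadInterruptedExceptionSketch(ie);
            }
        }
    }
}
```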
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049641#comment-17049641 ] Michael Froh commented on LUCENE-8962: -- I think the failure in {{testMergeOnCommit}} occurs because of a difference in the random behavior of the test. Specifically, sometimes the last writing thread happens to choose to {{commit()}} at the end, so there are no pending changes by the time we do the last {{commit()}} which should merge all segments (or abandon the merge, if it takes too long). If we add one more doc before that last commit (ensuring that the {{anyChanges}} check in {{IndexWriter.prepareCommitInternal()}} is {{true}}), the test passes consistently. I'm not sure why we don't see the same failure sometimes on master, though.
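The race can be shown with a toy writer model: the merge-on-commit hook only fires when the commit actually has pending changes, so a final {{commit()}} that follows another thread's {{commit()}} can silently skip it. The class below is an illustrative simplification of the {{anyChanges}} check, not real Lucene code:

```java
// Toy model of the anyChanges check in prepareCommitInternal(): the
// merge-on-commit hook only runs when the commit has pending changes.
class CommitSketch {
    private boolean anyChanges = false;
    int mergeOnCommitCalls = 0;

    void addDocument() { anyChanges = true; }

    void commit() {
        if (anyChanges) {
            mergeOnCommitCalls++; // stands in for invoking the merge policy
            anyChanges = false;
        }
        // with no pending changes, the commit is a no-op and no merge runs
    }
}
```

In the flaky run, the last writer thread happened to commit first, so the test's final commit saw no pending changes; adding one more document before it restores the invariant the test relies on.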
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049564#comment-17049564 ] Michael Froh commented on LUCENE-8962: -- I'm looking into the {{branch_8x}} failures. I'm able to reproduce on my machine and will step through to see what's different.
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028727#comment-17028727 ] Michael Froh commented on LUCENE-8962: -- bq. Yeah I think you are right! That would be a nice simplification. Probably this can just be folded into the existing MergePolicy API as a different MergeTrigger. Though then I wonder why e.g. forceMerge or expungeDeletes are not also simply different triggers ... Michael Froh what do you think?

As I was first writing this, I added a {{MergeTrigger.COMMIT}} value and used that, rather than adding a dedicated method. Then I realized that any time I've ever written a custom implementation of {{MergePolicy.findMerges()}}, I've ignored the {{MergeTrigger}} value, because I didn't really care what triggered the merge -- I just wanted to define the {{MergeSpecification}}. Even {{TieredMergePolicy.findMerges()}} doesn't look at the {{MergeTrigger}} parameter. If I had made {{IndexWriter}} call {{findMerges}} with a {{MergeTrigger.COMMIT}} trigger, anyone with a similar {{MergePolicy}} would have probably ended up running (and blocking on) some pretty expensive merges on commit. The best way I could think of to be backwards compatible with the "old" behavior by default was to add a no-op method to the base class. Looking through the history, it looks like {{forceMerge}} and {{expungeDeletes}} predate {{MergeTrigger}}, so that could explain them. I really like the idea of controlling this with a {{MergeTrigger}}, but I'm concerned about breaking existing {{MergePolicy}} implementations that ignore the {{MergeTrigger}} (which I suspect may be most of them).
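The backward-compatibility argument above comes down to giving the base class a no-op default, so subclasses that never override the new hook keep the old behavior. A simplified sketch (the method name and signature here are illustrative, not the real Lucene API):

```java
import java.util.List;

// Simplified MergePolicy-like base class. The new merge-on-commit hook gets a
// no-op default, so policies written before it existed are unaffected.
abstract class MergePolicySketch {
    // Existing API that every policy already implements.
    abstract List<List<String>> findMerges(List<String> segments);

    // New hook: null means "nothing to merge on commit" unless a subclass opts in.
    List<List<String>> findFullFlushMerges(List<String> segments) {
        return null;
    }
}

// A pre-existing policy that knows nothing about merge-on-commit.
class LegacyPolicySketch extends MergePolicySketch {
    @Override
    List<List<String>> findMerges(List<String> segments) {
        return List.of(segments); // merge everything into one group, for illustration
    }
}
```

Had the same selection been routed through {{findMerges}} with a new trigger value instead, a legacy policy like this one would have scheduled (and blocked commit on) its usual, potentially expensive merges.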
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027019#comment-17027019 ] Michael Froh commented on LUCENE-8962: -- [~dsmiley] – in your test, the merge executes after the commit updates the {{IndexWriter}}'s live {{SegmentInfos}}. When you call {{DirectoryReader.open}}, it takes another clone of that live SegmentInfos (which has 1 segment). However, the clone of the {{SegmentInfos}} that was written in the commit is from before the merge. If you were to open a fresh {{DirectoryReader}} from the on-disk directory, I believe you would still see 9 segments. With the approach I took, the cheap merge (or merges) asynchronously updates the commit's {{SegmentInfos}} clone before the commit happens.
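The clone semantics can be shown with a toy snapshot: a commit writes a point-in-time copy, so changes to the live state after the copy (like a merge) don't appear in what was written. Plain lists stand in for {{SegmentInfos}} here; segment names are made up:

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotSketch {
    static void demo() {
        // "Live" SegmentInfos inside the writer: nine small segments.
        List<String> live = new ArrayList<>(List.of(
            "_0", "_1", "_2", "_3", "_4", "_5", "_6", "_7", "_8"));

        // commit() clones the live state and writes the clone to disk.
        List<String> committed = new ArrayList<>(live);

        // A merge then replaces the nine segments with one -- but only in the
        // live state, after the commit's clone was taken.
        live.clear();
        live.add("_9");

        // DirectoryReader.open(writer) sees the live state (1 segment); a
        // reader opened from the on-disk commit would still see 9.
        if (live.size() != 1) throw new AssertionError();
        if (committed.size() != 9) throw new AssertionError();
    }
}
```

Updating the commit's clone before the commit happens, as the comment describes, is what lets the on-disk commit reflect the cheap merges too.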
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025313#comment-17025313 ] Michael Froh commented on LUCENE-8962: -- Thanks [~msoko...@gmail.com] for the feedback on the PR! I've updated it to incorporate your suggestions.
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17019012#comment-17019012 ] Michael Froh commented on LUCENE-8962: -- Here's a before-and-after comparison of the average number of segments searched per request since I applied this change (with a TieredMergePolicy subclass that tries to merge all segments smaller than 100MB into a single segment on commit, with floorSegmentMB of 500). It lowers the overall count and, in particular, significantly reduces the variance. !LUCENE-8962_demo.png!
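The selection behavior of the TieredMergePolicy subclass described in that comment can be modeled outside of Lucene. The sketch below is a simplified plain-Java model under assumed names (SmallSegmentSelector, selectSmallSegments are hypothetical), not the actual policy from the PR: given segment sizes, it collects every segment below a size threshold into one combined merge candidate.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the merge selection described above: every segment
// smaller than the threshold becomes part of a single "commit merge".
// This mirrors the idea of the TieredMergePolicy subclass, not Lucene's API.
public class SmallSegmentSelector {
    // Returns the indices of segments below thresholdMb, i.e. the segments
    // that would be merged into one segment before the commit finishes.
    public static List<Integer> selectSmallSegments(double[] segmentSizesMb, double thresholdMb) {
        List<Integer> merge = new ArrayList<>();
        for (int i = 0; i < segmentSizesMb.length; i++) {
            if (segmentSizesMb[i] < thresholdMb) {
                merge.add(i);
            }
        }
        // A real policy would skip the merge when fewer than two segments qualify.
        return merge.size() >= 2 ? merge : List.of();
    }

    public static void main(String[] args) {
        double[] sizes = {512.0, 42.0, 3.5, 250.0, 7.25};
        System.out.println(selectSmallSegments(sizes, 100.0)); // prints [1, 2, 4]
    }
}
```

With a 100MB threshold, the three small segments (42MB, 3.5MB, 7.25MB) are combined while the two large ones are left alone, which is why the demo graph shows both a lower segment count and much lower variance.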
[jira] [Updated] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Froh updated LUCENE-8962: - Attachment: LUCENE-8962_demo.png
[jira] [Commented] (LUCENE-8962) Can we merge small segments during refresh, for faster searching?
[ https://issues.apache.org/jira/browse/LUCENE-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17010210#comment-17010210 ] Michael Froh commented on LUCENE-8962: -- I ended up needing something like this, not for NRT readers, but rather on commit. I added a mechanism to compute cheap "commit merges" from within the prepareCommitInternal() call, and block until they complete (updating the "toCommit" SegmentInfos as they finish). I posted a PR for that here: [https://github.com/apache/lucene-solr/pull/1155] I think we could do something similar from IndexWriter.getReader() to handle the NRT case, but I haven't tried working on that yet.
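The "block until they complete" step from that comment can be sketched as a small concurrency model. This is a hedged illustration with hypothetical names (CommitMergeWait, runMergesAndWait), not Lucene's internal code: the commit path starts the selected merges, then waits on a latch with a timeout so a slow merge cannot stall the commit indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch of the "block until commit merges complete" step described above:
// prepareCommit kicks off the selected merges, then waits (with a timeout)
// before freezing the to-commit segment list. Names here are hypothetical;
// the real logic lives inside Lucene's IndexWriter.
public class CommitMergeWait {
    // Starts mergeCount background "merges" and waits up to timeoutMs for
    // all of them to finish. Returns true if every merge completed in time.
    public static boolean runMergesAndWait(int mergeCount, long timeoutMs) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(mergeCount);
        for (int i = 0; i < mergeCount; i++) {
            Thread merger = new Thread(() -> {
                // ... perform one merge, then mark it finished ...
                done.countDown();
            });
            merger.start();
        }
        // If the merges don't all finish in time, the commit would simply
        // proceed with the unmerged segments rather than wait forever.
        return done.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runMergesAndWait(4, 5000));
    }
}
```

The timeout is the key design choice: it bounds the extra latency a commit can pay for the merges, trading a possibly-unmerged commit for a predictable worst case.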