[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440956#comment-16440956 ]

ASF subversion and git services commented on LUCENE-8253:
---------------------------------------------------------

Commit 330fd18f200dae0892b3aa0882668435730c4319 in lucene-solr's branch refs/heads/branch_7x from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=330fd18 ]

LUCENE-8253: Don't create ReadersAndUpdates for foreign segments

IndexWriter#numDeletesToMerge was creating a ReadersAndUpdates for every incoming SegmentCommitInfo, even if that info wasn't private to the IndexWriter. This is an illegal use of this API, but since it's transitively public via MergePolicy#findMerges we have to be conservative about registering ReadersAndUpdates. In IndexWriter#numDeletesToMerge we can only use existing ones. This means that for soft deletes we need to react earlier in order to produce accurate numbers.

This change partially rolls back the changes in LUCENE-8253. Instead of registering the readers once they are pulled via IndexWriter#numDeletesToMerge, we now check on flush whether segments are fully deleted. That case is very unlikely, and the check can be done lazily, i.e. it only pays the extra cost of opening a reader and checking all soft deletes if soft deletes are used and present in the flushed segment. This has the side effect that flushed segments that are 100% hard-deleted are also cleaned up right after they are flushed; previously these segments stuck around until they got picked for a merge or received another delete.

This also closes LUCENE-8256

> ForceMergeDeletes does not merge soft-deleted segments
> ------------------------------------------------------
>
> Key: LUCENE-8253
> URL: https://issues.apache.org/jira/browse/LUCENE-8253
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 7.4, master (8.0)
> Reporter: Nhat Nguyen
> Assignee: Simon Willnauer
> Priority: Major
> Attachments: LUCENE-8253.patch, test-merge.patch
>
>
> IndexWriter#forceMergeDeletes should merge segments having soft-deleted
> documents as hard-deleted documents if we configured "softDeletesField" in an
> IndexWriterConfig.
> Attached is a failed test.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
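The first paragraph of the commit message above (look up pooled state, never create it for a segment the writer does not own) can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToySegment, ToyPooledState and ToyWriter are hypothetical stand-ins written for this note, not Lucene's SegmentCommitInfo, ReadersAndUpdates or IndexWriter, and the real bookkeeping is more involved.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a segment's commit metadata (not Lucene's SegmentCommitInfo). */
final class ToySegment {
  final String name;
  final int committedDeletes; // deletes already recorded for this segment

  ToySegment(String name, int committedDeletes) {
    this.name = name;
    this.committedDeletes = committedDeletes;
  }
}

/** Hypothetical stand-in for pooled per-segment state (not Lucene's ReadersAndUpdates). */
final class ToyPooledState {
  int pendingDeletes; // deletes applied in memory but not yet written
}

/** Hypothetical writer that only consults pooled state it already owns. */
final class ToyWriter {
  private final Map<ToySegment, ToyPooledState> pool = new HashMap<>();

  /** Registers pooled state for a segment that belongs to this writer. */
  ToyPooledState register(ToySegment segment) {
    return pool.computeIfAbsent(segment, s -> new ToyPooledState());
  }

  /** Pure lookup: a foreign segment never gets a pool entry as a side effect. */
  int numDeletesToMerge(ToySegment segment) {
    ToyPooledState state = pool.get(segment);
    int pending = (state == null) ? 0 : state.pendingDeletes;
    return segment.committedDeletes + pending;
  }

  int pooledSegments() {
    return pool.size();
  }
}
```

In this model a merge policy may pass any segment to numDeletesToMerge; for a segment the writer never registered, the answer falls back to the committed count and the pool stays unchanged.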
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440954#comment-16440954 ]

Steve Rowe commented on LUCENE-8253:
------------------------------------

Thanks [~simonw]!
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440951#comment-16440951 ]

Simon Willnauer commented on LUCENE-8253:
-----------------------------------------

[~steve_rowe] I fixed the issue and re-enabled the test. Sorry for the noise.
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440944#comment-16440944 ]

ASF subversion and git services commented on LUCENE-8253:
---------------------------------------------------------

Commit d904112428184ce9c1726313add5d184f4014a72 in lucene-solr's branch refs/heads/master from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d904112 ]

LUCENE-8253: Don't create ReadersAndUpdates for foreign segments

IndexWriter#numDeletesToMerge was creating a ReadersAndUpdates for every incoming SegmentCommitInfo, even if that info wasn't private to the IndexWriter. This is an illegal use of this API, but since it's transitively public via MergePolicy#findMerges we have to be conservative about registering ReadersAndUpdates. In IndexWriter#numDeletesToMerge we can only use existing ones. This means that for soft deletes we need to react earlier in order to produce accurate numbers.

This change partially rolls back the changes in LUCENE-8253. Instead of registering the readers once they are pulled via IndexWriter#numDeletesToMerge, we now check on flush whether segments are fully deleted. That case is very unlikely, and the check can be done lazily, i.e. it only pays the extra cost of opening a reader and checking all soft deletes if soft deletes are used and present in the flushed segment. This has the side effect that flushed segments that are 100% hard-deleted are also cleaned up right after they are flushed; previously these segments stuck around until they got picked for a merge or received another delete.

This also closes LUCENE-8256
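The lazy on-flush check described in the second paragraph of this commit message might look roughly like the following. It is a minimal sketch, not the committed code: ToyFlushCheck and its parameters are hypothetical, and the IntSupplier stands in for "open a reader and count the soft-deleted documents that are not already hard-deleted".

```java
import java.util.function.IntSupplier;

/** Hypothetical helper; the names are illustrative and not part of Lucene. */
final class ToyFlushCheck {
  private ToyFlushCheck() {}

  /**
   * Decides whether a freshly flushed segment is fully deleted.
   *
   * @param maxDoc number of documents in the flushed segment
   * @param hardDeletes number of hard-deleted documents
   * @param softDeletesInUse true if a soft-deletes field is configured and present in the segment
   * @param softDeletesAmongLiveDocs expensive count, assumed to exclude hard-deleted documents
   */
  static boolean isFullyDeleted(
      int maxDoc, int hardDeletes, boolean softDeletesInUse, IntSupplier softDeletesAmongLiveDocs) {
    if (hardDeletes == maxDoc) {
      return true; // 100% hard-deleted: cheap answer, no reader needed
    }
    if (!softDeletesInUse) {
      return false; // soft deletes cannot contribute, so skip the expensive count
    }
    // Only now pay for opening a reader and counting soft deletes.
    return hardDeletes + softDeletesAmongLiveDocs.getAsInt() == maxDoc;
  }
}
```

The supplier is only invoked on the last branch, which mirrors the "only paying the extra cost ... if soft deletes are used and present in the flushed segment" wording.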
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440745#comment-16440745 ]

ASF subversion and git services commented on LUCENE-8253:
---------------------------------------------------------

Commit 94adf9d2ff42cc4133354f7ab09ed32c496250b9 in lucene-solr's branch refs/heads/branch_7x from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=94adf9d ]

LUCENE-8253: Mute test while a fix is worked on
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440746#comment-16440746 ]

ASF subversion and git services commented on LUCENE-8253:
---------------------------------------------------------

Commit f7f12a51f313bf406f0fa3d48e74864268338c6d in lucene-solr's branch refs/heads/master from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f7f12a5 ]

LUCENE-8253: Mute test while a fix is worked on
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440073#comment-16440073 ]

Steve Rowe commented on LUCENE-8253:
------------------------------------

{{git bisect}} blames commit {{c70ccea}} on this issue for reproducing Solr {{SegmentsInfoRequestHandlerTest}} failures, e.g. from [https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/21840/]:

{noformat}
[junit4]   2> NOTE: reproduce with: ant test -Dtestcase=SegmentsInfoRequestHandlerTest -Dtests.method=testSegmentInfosData -Dtests.seed=D8FA27F4CB25E126 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=it -Dtests.timezone=Europe/Kaliningrad -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] FAILURE 0.00s J2 | SegmentsInfoRequestHandlerTest.testSegmentInfosData <<<
[junit4]    > Throwable #1: java.lang.AssertionError
[junit4]    >    at __randomizedtesting.SeedInfo.seed([D8FA27F4CB25E126:A45305AF7DCD56B9]:0)
[junit4]    >    at org.apache.lucene.index.IndexWriter$ReaderPool.noDups(IndexWriter.java:867)
[junit4]    >    at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:857)
[junit4]    >    at org.apache.lucene.index.IndexWriter.numDeletesToMerge(IndexWriter.java:5233)
[junit4]    >    at org.apache.lucene.index.LogMergePolicy.sizeDocs(LogMergePolicy.java:153)
[junit4]    >    at org.apache.lucene.index.LogDocMergePolicy.size(LogDocMergePolicy.java:44)
[junit4]    >    at org.apache.lucene.index.LogMergePolicy.findMerges(LogMergePolicy.java:469)
[junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.getMergeCandidatesNames(SegmentsInfoRequestHandler.java:100)
[junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.getSegmentsInfo(SegmentsInfoRequestHandler.java:59)
[junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.handleRequestBody(SegmentsInfoRequestHandler.java:48)
[junit4]    >    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
[junit4]    >    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2508)
[junit4]    >    at org.apache.solr.util.TestHarness.query(TestHarness.java:337)
[junit4]    >    at org.apache.solr.util.TestHarness.query(TestHarness.java:319)
[junit4]    >    at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:890)
[junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandlerTest.testSegmentInfosData(SegmentsInfoRequestHandlerTest.java:75)
[junit4]    >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4]    >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4]    >    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4]    >    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
[junit4]    >    at java.base/java.lang.Thread.run(Thread.java:844)
[...]
[junit4]   2> NOTE: reproduce with: ant test -Dtestcase=SegmentsInfoRequestHandlerTest -Dtests.method=testSegmentInfos -Dtests.seed=D8FA27F4CB25E126 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=it -Dtests.timezone=Europe/Kaliningrad -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
[junit4] FAILURE 0.00s J2 | SegmentsInfoRequestHandlerTest.testSegmentInfos <<<
[junit4]    > Throwable #1: java.lang.AssertionError
[junit4]    >    at __randomizedtesting.SeedInfo.seed([D8FA27F4CB25E126:94ADB04B01293976]:0)
[junit4]    >    at org.apache.lucene.index.IndexWriter$ReaderPool.noDups(IndexWriter.java:867)
[junit4]    >    at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:857)
[junit4]    >    at org.apache.lucene.index.IndexWriter.numDeletesToMerge(IndexWriter.java:5233)
[junit4]    >    at org.apache.lucene.index.LogMergePolicy.sizeDocs(LogMergePolicy.java:153)
[junit4]    >    at org.apache.lucene.index.LogDocMergePolicy.size(LogDocMergePolicy.java:44)
[junit4]    >    at org.apache.lucene.index.LogMergePolicy.findMerges(LogMergePolicy.java:469)
[junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.getMergeCandidatesNames(SegmentsInfoRequestHandler.java:100)
[junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.getSegmentsInfo(SegmentsInfoRequestHandler.java:59)
[junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.handleRequestBody(SegmentsInfoRequestHandler.java:48)
[junit4]    >    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
[junit4]    >    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2508)
[junit4]
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439579#comment-16439579 ]

Simon Willnauer commented on LUCENE-8253:
-----------------------------------------

[~erickerickson] I agree this doesn't affect your work.
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439570#comment-16439570 ]

Erick Erickson commented on LUCENE-8253:
----------------------------------------

Simon:

I'm deep in the guts of TieredMergePolicy for LUCENE-7976, so I wanted to check something. I took a _very_ quick scan through the patch and it doesn't look like this affects TieredMergePolicy at all. Do you agree? All the work in TMP is just around creating a list of segments to merge and returning them to the serious merging code...
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439517#comment-16439517 ]

ASF subversion and git services commented on LUCENE-8253:
---------------------------------------------------------

Commit aeac55a602980c92ffee25602c6450e40eab6e6f in lucene-solr's branch refs/heads/branch_7x from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=aeac55a ]

LUCENE-8253: Account for soft-deletes before they are flushed to disk

Inside the IndexWriter, buffers are only written to disk if it's needed or "worth it", which doesn't guarantee that soft deletes are accounted for in time. This is not necessarily a problem, since they are eventually collected and segments that have soft deletes will be merged eventually, but for tests and for parity with hard deletes this behavior is tricky. This change cuts over to accounting in place, just like hard deletes. This results in accurate delete numbers for soft deletes at any given point in time once the reader is loaded or a pending soft delete occurs.

This change also fixes an issue where all updates to a DV field were allowed even if the field was unknown. Now this only works if the field is equal to the soft-deletes field. This behavior was never released.
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439495#comment-16439495 ]

ASF subversion and git services commented on LUCENE-8253:
---------------------------------------------------------

Commit c70cceaee56cecf35875cd2b5c8d5700f2b3cedb in lucene-solr's branch refs/heads/master from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c70ccea ]

LUCENE-8253: Account for soft-deletes before they are flushed to disk

Inside the IndexWriter, buffers are only written to disk if it's needed or "worth it", which doesn't guarantee that soft deletes are accounted for in time. This is not necessarily a problem, since they are eventually collected and segments that have soft deletes will be merged eventually, but for tests and for parity with hard deletes this behavior is tricky. This change cuts over to accounting in place, just like hard deletes. This results in accurate delete numbers for soft deletes at any given point in time once the reader is loaded or a pending soft delete occurs.

This change also fixes an issue where all updates to a DV field were allowed even if the field was unknown. Now this only works if the field is equal to the soft-deletes field. This behavior was never released.
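To make "accounting in place" concrete, here is a small toy model in which a soft delete is counted the moment it is applied, in the same way as a hard delete, instead of when buffered doc-values updates are eventually written. ToySegmentDeletes is a hypothetical class written for this note; it does not mirror Lucene's internal classes.

```java
import java.util.BitSet;

/** Hypothetical per-segment delete tracker; not a Lucene class. */
final class ToySegmentDeletes {
  private final int maxDoc;
  private final BitSet hardDeleted;
  private final BitSet softDeleted;

  ToySegmentDeletes(int maxDoc) {
    this.maxDoc = maxDoc;
    this.hardDeleted = new BitSet(maxDoc);
    this.softDeleted = new BitSet(maxDoc);
  }

  /** Records a hard delete immediately. */
  void hardDelete(int doc) {
    checkDoc(doc);
    hardDeleted.set(doc);
  }

  /** Records a soft delete immediately, without waiting for any flush. */
  void softDelete(int doc) {
    checkDoc(doc);
    softDeleted.set(doc);
  }

  /** Documents a merge could drop: hard- or soft-deleted, each counted once. */
  int numDeletesToMerge() {
    BitSet union = (BitSet) hardDeleted.clone();
    union.or(softDeleted);
    return union.cardinality();
  }

  private void checkDoc(int doc) {
    if (doc < 0 || doc >= maxDoc) {
      throw new IndexOutOfBoundsException("doc " + doc + " is outside [0, " + maxDoc + ")");
    }
  }
}
```

Taking the union of the two bit sets keeps a document that is both soft- and hard-deleted from being counted twice.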
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439279#comment-16439279 ]

Michael McCandless commented on LUCENE-8253:
--------------------------------------------

+1, I left some small comments on GH.
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439141#comment-16439141 ]

Simon Willnauer commented on LUCENE-8253:
-----------------------------------------

Here is a review link: https://github.com/s1monw/lucene-solr/pull/11/
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439135#comment-16439135 ]

Simon Willnauer commented on LUCENE-8253:
-----------------------------------------

Thanks [~dnhatn], good catch! I have attached a patch. [~mikemccand], can you take a look? I also optimized a couple of things along the way that were necessary due to test changes. Here is my commit message:

{noformat}
LUCENE-8253: Account for soft-deletes before they are flushed to disk

Inside the IndexWriter, buffers are only written to disk if it's needed or "worth it", which doesn't guarantee that soft deletes are accounted for in time. This is not necessarily a problem, since they are eventually collected and segments that have soft deletes will be merged eventually, but for tests and for parity with hard deletes this behavior is tricky. This change cuts over to accounting in place, just like hard deletes. This results in accurate delete numbers for soft deletes at any given point in time once the reader is loaded or a pending soft delete occurs.

This change also fixes an issue where all updates to a DV field were allowed even if the field was unknown. Now this only works if the field is equal to the soft-deletes field. This behavior was never released.
{noformat}
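The last paragraph of that commit message describes a validation rule: an update to a doc-values field that the index does not know is only accepted when the field is the configured soft-deletes field. A minimal sketch of such a rule follows; ToyDocValuesUpdateCheck, its parameters and its error message are hypothetical and are not taken from the patch.

```java
import java.util.Set;

/** Hypothetical validator; the names and the message are illustrative only. */
final class ToyDocValuesUpdateCheck {
  private ToyDocValuesUpdateCheck() {}

  /**
   * Throws if {@code field} is neither an already known doc-values field
   * nor the configured soft-deletes field.
   *
   * @param softDeletesField the configured soft-deletes field, or null if none is configured
   */
  static void checkUpdate(String field, Set<String> knownDocValuesFields, String softDeletesField) {
    boolean isSoftDeletesField = softDeletesField != null && softDeletesField.equals(field);
    if (!isSoftDeletesField && !knownDocValuesFields.contains(field)) {
      throw new IllegalArgumentException(
          "can only update an existing doc-values field or the soft-deletes field, got: " + field);
    }
  }
}
```

With this rule, a typo in a field name fails fast instead of silently creating updates for a field that no document carries.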
[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments
[ https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438823#comment-16438823 ]

Nhat Nguyen commented on LUCENE-8253:
-------------------------------------

FYI [~simonw]

> ForceMergeDeletes does not merge soft-deleted segments
> ------------------------------------------------------
>
> Key: LUCENE-8253
> URL: https://issues.apache.org/jira/browse/LUCENE-8253
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 7.4, master (8.0)
> Reporter: Nhat Nguyen
> Priority: Major
> Attachments: test-merge.patch
>
>
> IndexWriter#forceMergeDeletes should merge segments having soft-deleted
> documents as hard-deleted documents if we configured "softDeletesField" in an
> IndexWriterConfig.
> Attached is a failed test.
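The behavior requested in the issue description can be pictured with a toy selection routine: when a soft-deletes field is configured, soft-deleted documents count like hard-deleted ones when deciding which segments a forced merge of deletes should pick up. This is a deliberately simplified sketch. ToySegmentStats and ToyForceMergeDeletes are hypothetical, and "any segment with at least one delete" is an assumption made for brevity, not the selection rule of a real merge policy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-segment counters; not a Lucene class. */
final class ToySegmentStats {
  final String name;
  final int hardDeletes;
  final int softDeletes; // assumed to exclude documents that are also hard-deleted

  ToySegmentStats(String name, int hardDeletes, int softDeletes) {
    this.name = name;
    this.hardDeletes = hardDeletes;
    this.softDeletes = softDeletes;
  }
}

/** Hypothetical selection logic for a forced merge of deletes. */
final class ToyForceMergeDeletes {
  private ToyForceMergeDeletes() {}

  /** Returns the names of segments that carry deletes worth merging away. */
  static List<String> select(List<ToySegmentStats> segments, String softDeletesField) {
    List<String> selected = new ArrayList<>();
    for (ToySegmentStats segment : segments) {
      // Soft deletes only count when a soft-deletes field is configured.
      int soft = (softDeletesField != null) ? segment.softDeletes : 0;
      if (segment.hardDeletes + soft > 0) {
        selected.add(segment.name);
      }
    }
    return selected;
  }
}
```

In this model a segment whose only deletes are soft deletes is skipped when no soft-deletes field is configured and selected when one is, which is the behavior the report asks for.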