[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440956#comment-16440956
 ] 

ASF subversion and git services commented on LUCENE-8253:
-

Commit 330fd18f200dae0892b3aa0882668435730c4319 in lucene-solr's branch 
refs/heads/branch_7x from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=330fd18 ]

LUCENE-8253: Don't create ReadersAndUpdates for foreign segments

IndexWriter#numDeletesToMerge was creating a ReadersAndUpdates
for all incoming SegmentCommitInfo even if that info wasn't private
to the IndexWriter. This is an illegal use of this API but since it's
transitively public via MergePolicy#findMerges we have to be conservative
with registering ReadersAndUpdates. In IndexWriter#numDeletesToMerge we
can only use existing ones. This means for soft-deletes we need to react
earlier in order to produce accurate numbers.
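
A minimal sketch of the "only use existing ones" rule, as a toy model rather than Lucene's actual code. ToySegment, ToyReaderPool and their methods are hypothetical names made up for illustration; the point is only that computing a delete count looks up a pooled entry and never registers a new one for a segment the writer does not own.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical stand-ins, not Lucene's real classes.
final class ToySegment {
    final String name;
    final int committedDeletes; // deletes already recorded for the segment

    ToySegment(String name, int committedDeletes) {
        this.name = name;
        this.committedDeletes = committedDeletes;
    }
}

final class ToyReaderPool {
    // segment name -> pending deletes tracked by an already pooled entry
    private final Map<String, Integer> pendingDeletes = new HashMap<>();

    // Only the owning writer registers entries, for its own segments.
    void register(ToySegment segment, int pending) {
        pendingDeletes.put(segment.name, pending);
    }

    int size() {
        return pendingDeletes.size();
    }

    // Lookup without create: a foreign segment never gets a pool entry here,
    // so the answer for it falls back to the committed delete count.
    int numDeletesToMerge(ToySegment segment) {
        Integer pending = pendingDeletes.get(segment.name);
        return pending == null
            ? segment.committedDeletes
            : segment.committedDeletes + pending;
    }

    public static void main(String[] args) {
        ToyReaderPool pool = new ToyReaderPool();
        ToySegment own = new ToySegment("_0", 2);
        ToySegment foreign = new ToySegment("_9", 5);
        pool.register(own, 3);
        System.out.println("own: " + pool.numDeletesToMerge(own));
        System.out.println("foreign: " + pool.numDeletesToMerge(foreign));
        System.out.println("pool entries: " + pool.size());
    }
}
```

The trade-off in this model is that a segment without a pooled entry can report a stale count, which is why the counts have to be made accurate earlier.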

This change partially rolls back the changes in LUCENE-8253. Instead of
registering the readers once they are pulled via IndexWriter#numDeletesToMerge,
we now check on flush whether segments are fully deleted. That case is very
unlikely and the check can be done lazily, i.e. we only pay the extra cost of
opening a reader and checking all soft-deletes if soft deletes are used and
present in the flushed segment.
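
The lazy check on flush could be sketched roughly as follows. This is a toy model with hypothetical names (ToyFlushedSegment, ToyFlushCheck), not the code from the commit. It assumes hard and soft deletes never overlap, and it uses an IntSupplier as a stand-in for the expensive step of opening a reader and counting soft deletes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Toy model only: hypothetical names, not Lucene's real classes or methods.
final class ToyFlushedSegment {
    final String name;
    final int maxDoc;                    // total documents in the segment
    final int hardDeletes;               // known without opening a reader
    final boolean hasSoftDeletes;        // soft deletes are used and present here
    final IntSupplier softDeleteCounter; // expensive: stands in for opening a reader

    ToyFlushedSegment(String name, int maxDoc, int hardDeletes,
                      boolean hasSoftDeletes, IntSupplier softDeleteCounter) {
        this.name = name;
        this.maxDoc = maxDoc;
        this.hardDeletes = hardDeletes;
        this.hasSoftDeletes = hasSoftDeletes;
        this.softDeleteCounter = softDeleteCounter;
    }
}

final class ToyFlushCheck {
    // Assumes hard and soft deletes are disjoint sets of documents.
    static boolean isFullyDeleted(ToyFlushedSegment segment) {
        if (segment.hardDeletes == segment.maxDoc) {
            return true;  // cheap path: 100% hard deleted
        }
        if (!segment.hasSoftDeletes) {
            return false; // never pay for the expensive count
        }
        return segment.hardDeletes + segment.softDeleteCounter.getAsInt() == segment.maxDoc;
    }

    // Returns the names of the segments that survive the flush.
    static List<String> publish(List<ToyFlushedSegment> flushed) {
        List<String> kept = new ArrayList<>();
        for (ToyFlushedSegment segment : flushed) {
            if (!isFullyDeleted(segment)) {
                kept.add(segment.name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<ToyFlushedSegment> flushed = new ArrayList<>();
        flushed.add(new ToyFlushedSegment("_0", 5, 1, false, () -> 0));
        flushed.add(new ToyFlushedSegment("_1", 4, 4, true, () -> 0));
        flushed.add(new ToyFlushedSegment("_2", 3, 0, true, () -> 3));
        System.out.println("kept after flush: " + publish(flushed));
    }
}
```

In this sketch the expensive counter runs only for segments that are not already fully hard deleted and that actually carry soft deletes.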

This has the side-effect that flushed segments that are 100% hard deleted are
also cleaned up right after they are flushed. Previously these segments were
sticking around for a while until they got picked for a merge or received
another delete.

This also closes LUCENE-8256


> ForceMergeDeletes does not merge soft-deleted segments
> --
>
> Key: LUCENE-8253
> URL: https://issues.apache.org/jira/browse/LUCENE-8253
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 7.4, master (8.0)
>Reporter: Nhat Nguyen
>Assignee: Simon Willnauer
>Priority: Major
> Attachments: LUCENE-8253.patch, test-merge.patch
>
>
> IndexWriter#forceMergeDeletes should merge segments having soft-deleted 
> documents as hard-deleted documents if we configured "softDeletesField" in an 
> IndexWriterConfig.
> Attached is a failed test.
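
To illustrate the reported behavior, here is a toy model of how a force-merge-deletes pass might pick segments. All names (ToySegmentStats, ToyForceMergeDeletes) and the percentage threshold are assumptions of this sketch, not Lucene's implementation. The flag countSoftDeletes switches between ignoring soft-deleted documents and treating them like hard-deleted ones, which is what the issue asks for.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not Lucene's merge policy code.
final class ToySegmentStats {
    final String name;
    final int maxDoc;
    final int hardDeletes;
    final int softDeletes;

    ToySegmentStats(String name, int maxDoc, int hardDeletes, int softDeletes) {
        this.name = name;
        this.maxDoc = maxDoc;
        this.hardDeletes = hardDeletes;
        this.softDeletes = softDeletes;
    }
}

final class ToyForceMergeDeletes {
    // Selects segments whose deleted percentage exceeds pctAllowed.
    static List<String> candidates(List<ToySegmentStats> segments,
                                   double pctAllowed,
                                   boolean countSoftDeletes) {
        List<String> selected = new ArrayList<>();
        for (ToySegmentStats segment : segments) {
            int deleted = segment.hardDeletes
                + (countSoftDeletes ? segment.softDeletes : 0);
            double pctDeleted = 100.0 * deleted / segment.maxDoc;
            if (pctDeleted > pctAllowed) {
                selected.add(segment.name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<ToySegmentStats> segments = new ArrayList<>();
        segments.add(new ToySegmentStats("_0", 100, 20, 0));
        segments.add(new ToySegmentStats("_1", 100, 0, 50));
        segments.add(new ToySegmentStats("_2", 100, 0, 0));
        System.out.println("ignoring soft deletes: " + candidates(segments, 10.0, false));
        System.out.println("counting soft deletes: " + candidates(segments, 10.0, true));
    }
}
```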



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-17 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440954#comment-16440954
 ] 

Steve Rowe commented on LUCENE-8253:


Thanks [~simonw]!




[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-17 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440951#comment-16440951
 ] 

Simon Willnauer commented on LUCENE-8253:
-

[~steve_rowe] I fixed the issue and re-enabled the test. Sorry for the noise.




[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440944#comment-16440944
 ] 

ASF subversion and git services commented on LUCENE-8253:
-

Commit d904112428184ce9c1726313add5d184f4014a72 in lucene-solr's branch 
refs/heads/master from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d904112 ]

LUCENE-8253: Don't create ReadersAndUpdates for foreign segments

IndexWriter#numDeletesToMerge was creating a ReadersAndUpdates
for all incoming SegmentCommitInfo even if that info wasn't private
to the IndexWriter. This is an illegal use of this API but since it's
transitively public via MergePolicy#findMerges we have to be conservative
with registering ReadersAndUpdates. In IndexWriter#numDeletesToMerge we
can only use existing ones. This means for soft-deletes we need to react
earlier in order to produce accurate numbers.

This change partially rolls back the changes in LUCENE-8253. Instead of
registering the readers once they are pulled via IndexWriter#numDeletesToMerge,
we now check on flush whether segments are fully deleted. That case is very
unlikely and the check can be done lazily, i.e. we only pay the extra cost of
opening a reader and checking all soft-deletes if soft deletes are used and
present in the flushed segment.

This has the side-effect that flushed segments that are 100% hard deleted are
also cleaned up right after they are flushed. Previously these segments were
sticking around for a while until they got picked for a merge or received
another delete.

This also closes LUCENE-8256





[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440745#comment-16440745
 ] 

ASF subversion and git services commented on LUCENE-8253:
-

Commit 94adf9d2ff42cc4133354f7ab09ed32c496250b9 in lucene-solr's branch 
refs/heads/branch_7x from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=94adf9d ]

LUCENE-8253: Mute test while a fix is worked on





[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440746#comment-16440746
 ] 

ASF subversion and git services commented on LUCENE-8253:
-

Commit f7f12a51f313bf406f0fa3d48e74864268338c6d in lucene-solr's branch 
refs/heads/master from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f7f12a5 ]

LUCENE-8253: Mute test while a fix is worked on





[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-16 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440073#comment-16440073
 ] 

Steve Rowe commented on LUCENE-8253:


{{git bisect}} blames commit {{c70ccea}} on this issue for reproducing Solr 
{{SegmentsInfoRequestHandlerTest}} failures, e.g. from 
[https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/21840/]:

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=SegmentsInfoRequestHandlerTest -Dtests.method=testSegmentInfosData -Dtests.seed=D8FA27F4CB25E126 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=it -Dtests.timezone=Europe/Kaliningrad -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
   [junit4] FAILURE 0.00s J2 | SegmentsInfoRequestHandlerTest.testSegmentInfosData <<<
   [junit4]    > Throwable #1: java.lang.AssertionError
   [junit4]    >    at __randomizedtesting.SeedInfo.seed([D8FA27F4CB25E126:A45305AF7DCD56B9]:0)
   [junit4]    >    at org.apache.lucene.index.IndexWriter$ReaderPool.noDups(IndexWriter.java:867)
   [junit4]    >    at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:857)
   [junit4]    >    at org.apache.lucene.index.IndexWriter.numDeletesToMerge(IndexWriter.java:5233)
   [junit4]    >    at org.apache.lucene.index.LogMergePolicy.sizeDocs(LogMergePolicy.java:153)
   [junit4]    >    at org.apache.lucene.index.LogDocMergePolicy.size(LogDocMergePolicy.java:44)
   [junit4]    >    at org.apache.lucene.index.LogMergePolicy.findMerges(LogMergePolicy.java:469)
   [junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.getMergeCandidatesNames(SegmentsInfoRequestHandler.java:100)
   [junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.getSegmentsInfo(SegmentsInfoRequestHandler.java:59)
   [junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.handleRequestBody(SegmentsInfoRequestHandler.java:48)
   [junit4]    >    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
   [junit4]    >    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2508)
   [junit4]    >    at org.apache.solr.util.TestHarness.query(TestHarness.java:337)
   [junit4]    >    at org.apache.solr.util.TestHarness.query(TestHarness.java:319)
   [junit4]    >    at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:890)
   [junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandlerTest.testSegmentInfosData(SegmentsInfoRequestHandlerTest.java:75)
   [junit4]    >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   [junit4]    >    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   [junit4]    >    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   [junit4]    >    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
   [junit4]    >    at java.base/java.lang.Thread.run(Thread.java:844)
[...]
   [junit4]   2> NOTE: reproduce with: ant test -Dtestcase=SegmentsInfoRequestHandlerTest -Dtests.method=testSegmentInfos -Dtests.seed=D8FA27F4CB25E126 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=it -Dtests.timezone=Europe/Kaliningrad -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
   [junit4] FAILURE 0.00s J2 | SegmentsInfoRequestHandlerTest.testSegmentInfos <<<
   [junit4]    > Throwable #1: java.lang.AssertionError
   [junit4]    >    at __randomizedtesting.SeedInfo.seed([D8FA27F4CB25E126:94ADB04B01293976]:0)
   [junit4]    >    at org.apache.lucene.index.IndexWriter$ReaderPool.noDups(IndexWriter.java:867)
   [junit4]    >    at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:857)
   [junit4]    >    at org.apache.lucene.index.IndexWriter.numDeletesToMerge(IndexWriter.java:5233)
   [junit4]    >    at org.apache.lucene.index.LogMergePolicy.sizeDocs(LogMergePolicy.java:153)
   [junit4]    >    at org.apache.lucene.index.LogDocMergePolicy.size(LogDocMergePolicy.java:44)
   [junit4]    >    at org.apache.lucene.index.LogMergePolicy.findMerges(LogMergePolicy.java:469)
   [junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.getMergeCandidatesNames(SegmentsInfoRequestHandler.java:100)
   [junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.getSegmentsInfo(SegmentsInfoRequestHandler.java:59)
   [junit4]    >    at org.apache.solr.handler.admin.SegmentsInfoRequestHandler.handleRequestBody(SegmentsInfoRequestHandler.java:48)
   [junit4]    >    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
   [junit4]    >    at org.apache.solr.core.SolrCore.execute(SolrCore.java:2508)
   [junit4]  

[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-16 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439579#comment-16439579
 ] 

Simon Willnauer commented on LUCENE-8253:
-

[~erickerickson] I agree this doesn't affect your work.




[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-16 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439570#comment-16439570
 ] 

Erick Erickson commented on LUCENE-8253:


Simon:

I'm deep in the guts of TieredMergePolicy for LUCENE-7976, so I wanted to check
something. I took a _very_ quick scan through the patch and it doesn't look
like this affects TieredMergePolicy at all. Do you agree? All the work in TMP
is just around creating a list of segments to merge and returning them to the
serious merging code...




[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439517#comment-16439517
 ] 

ASF subversion and git services commented on LUCENE-8253:
-

Commit aeac55a602980c92ffee25602c6450e40eab6e6f in lucene-solr's branch 
refs/heads/branch_7x from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=aeac55a ]

LUCENE-8253: Account for soft-deletes before they are flushed to disk

Inside the IndexWriter, buffers are only written to disk if it's needed
or "worth it", which doesn't guarantee that soft deletes are accounted for
in time. This is not necessarily a problem since they are eventually
collected and segments that have soft-deletes will be merged eventually,
but for tests and for parity with hard deletes this behavior
is tricky.
This change cuts over to accounting in-place just like hard-deletes. This
results in accurate delete numbers for soft deletes at any given point in time
once the reader is loaded or a pending soft delete occurs.

This change also fixes an issue where all updates to a DV field are allowed
even if the field is unknown. Now this only works if the field is equal
to the soft deletes field. This behavior was never released.
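
The difference between counting soft deletes only after a flush and counting them in place can be shown with a small toy model. ToySoftDeleteTracker and its methods are hypothetical names for illustration, not IndexWriter internals; the sketch only contrasts the two ways of counting.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: hypothetical names, not Lucene's real bookkeeping.
final class ToySoftDeleteTracker {
    private final Set<Integer> buffered = new HashSet<>(); // soft-deleted, not yet flushed
    private final Set<Integer> flushed = new HashSet<>();  // soft-deleted and written out

    void softDelete(int docId) {
        if (!flushed.contains(docId)) {
            buffered.add(docId); // a set, so repeated deletes of one doc count once
        }
    }

    void flush() {
        flushed.addAll(buffered);
        buffered.clear();
    }

    // Old behavior in this model: deletes are invisible until a flush happens.
    int countFlushedOnly() {
        return flushed.size();
    }

    // In-place accounting: pending deletes are counted immediately.
    int countInPlace() {
        return flushed.size() + buffered.size();
    }

    public static void main(String[] args) {
        ToySoftDeleteTracker tracker = new ToySoftDeleteTracker();
        tracker.softDelete(1);
        tracker.softDelete(2);
        System.out.println("before flush: flushedOnly=" + tracker.countFlushedOnly()
            + " inPlace=" + tracker.countInPlace());
        tracker.flush();
        System.out.println("after flush: flushedOnly=" + tracker.countFlushedOnly()
            + " inPlace=" + tracker.countInPlace());
    }
}
```

With in-place accounting the two counts agree only after a flush; before it, only the in-place count reflects the pending soft deletes.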





[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439495#comment-16439495
 ] 

ASF subversion and git services commented on LUCENE-8253:
-

Commit c70cceaee56cecf35875cd2b5c8d5700f2b3cedb in lucene-solr's branch 
refs/heads/master from [~simonw]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c70ccea ]

LUCENE-8253: Account for soft-deletes before they are flushed to disk

Inside the IndexWriter, buffers are only written to disk if it's needed
or "worth it", which doesn't guarantee that soft deletes are accounted for
in time. This is not necessarily a problem since they are eventually
collected and segments that have soft-deletes will be merged eventually,
but for tests and for parity with hard deletes this behavior
is tricky.
This change cuts over to accounting in-place just like hard-deletes. This
results in accurate delete numbers for soft deletes at any given point in time
once the reader is loaded or a pending soft delete occurs.

This change also fixes an issue where all updates to a DV field are allowed
even if the field is unknown. Now this only works if the field is equal
to the soft deletes field. This behavior was never released.





[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-16 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439279#comment-16439279
 ] 

Michael McCandless commented on LUCENE-8253:


+1, I left some small comments on GH.




[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-16 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439141#comment-16439141
 ] 

Simon Willnauer commented on LUCENE-8253:
-

Here is a review link: https://github.com/s1monw/lucene-solr/pull/11/




[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-16 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439135#comment-16439135
 ] 

Simon Willnauer commented on LUCENE-8253:
-

Thanks [~dnhatn], good catch! I have attached a patch. [~mikemccand], can you
take a look? I also optimized a couple of things along the way that were
necessary due to test changes. Here is my commit message:


{noformat}
 LUCENE-8253: Account for soft-deletes before they are flushed to disk

Inside the IndexWriter, buffers are only written to disk if it's needed
or "worth it", which doesn't guarantee that soft deletes are accounted for
in time. This is not necessarily a problem since they are eventually
collected and segments that have soft-deletes will be merged eventually,
but for tests and for parity with hard deletes this behavior
is tricky.
This change cuts over to accounting in-place just like hard-deletes. This
results in accurate delete numbers for soft deletes at any given point in time
once the reader is loaded or a pending soft delete occurs.

This change also fixes an issue where all updates to a DV field are allowed
even if the field is unknown. Now this only works if the field is equal
to the soft deletes field. This behavior was never released.
{noformat}





[jira] [Commented] (LUCENE-8253) ForceMergeDeletes does not merge soft-deleted segments

2018-04-15 Thread Nhat Nguyen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438823#comment-16438823
 ] 

Nhat Nguyen commented on LUCENE-8253:
-

FYI [~simonw]
