[
https://issues.apache.org/jira/browse/LUCENE-8293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462508#comment-16462508
]
Erick Erickson commented on LUCENE-8293:
----------------------------------------
Related question: Does this have any implications for TieredMergePolicy? In
particular TMP relies on:
IndexWriter.numDeletesToMerge(info);
SegmentCommitInfo.info.maxDoc()
to score segments before handing them off to the merging code. I'm not worried
about the nuts and bolts of the merging you're addressing here; I mostly want
to know whether IndexWriter.numDeletesToMerge(info) will continue to reflect
the number of docs that will be merged away.
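
To make the dependency concrete, here is a minimal sketch of how a merge policy could combine those two values into a deletion-adjusted segment size. It is an illustration only, not TieredMergePolicy's actual code: the class and method names (LiveSizeSketch, liveSizeInBytes) are hypothetical, and the plain parameters stand in for the segment's byte size, SegmentCommitInfo.info.maxDoc() and IndexWriter.numDeletesToMerge(info).

```java
public final class LiveSizeSketch {
    private LiveSizeSketch() {}

    /**
     * Pro-rates a segment's byte size by the fraction of documents that are
     * expected to survive a merge. All three inputs are plain values here;
     * in a real merge policy they would come from the segment metadata.
     */
    public static long liveSizeInBytes(long sizeInBytes, int maxDoc, int numDeletesToMerge) {
        if (numDeletesToMerge < 0 || numDeletesToMerge > maxDoc) {
            throw new IllegalArgumentException(
                "numDeletesToMerge must be in [0, maxDoc], got " + numDeletesToMerge);
        }
        // Fraction of the segment that a merge would drop.
        double delRatio = maxDoc <= 0 ? 0.0 : (double) numDeletesToMerge / maxDoc;
        return (long) (sizeInBytes * (1.0 - delRatio));
    }
}
```

If the reported delete count stopped matching the documents that are really dropped, a size computed this way would over- or under-estimate what the merged segment will contain, which is the concern raised above.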
> Ensure only hard deletes are carried over in a merge
> ----------------------------------------------------
>
> Key: LUCENE-8293
> URL: https://issues.apache.org/jira/browse/LUCENE-8293
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 7.4, master (8.0)
> Reporter: Simon Willnauer
> Priority: Major
> Fix For: 7.4, master (8.0)
>
> Attachments: LUCENE-8293.patch
>
>
> Today we carry over hard deletes based on the SegmentReader's liveDocs.
> This is not correct if soft deletes are used, especially with retention
> policies. If a soft delete is added while a segment is being merged, the
> document might end up hard-deleted in the target segment. This isn't
> necessarily a correctness issue, but it causes unnecessary writes of hard
> deletes. The biggest issue here is that we assert that previously deleted
> documents are still deleted in the live docs we apply, and that assertion
> might be violated by the retention policy.
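
The difference described in the issue can be illustrated with plain bit sets. This is a simplified model under stated assumptions, not Lucene code: the class HardDeleteCarryOverSketch and its methods are hypothetical, a set bit means "document is live", and soft deletes are modeled as a separate bit set that is subtracted from the hard live docs to obtain the reader's combined view.

```java
import java.util.BitSet;

public final class HardDeleteCarryOverSketch {
    private HardDeleteCarryOverSketch() {}

    /** Combined view: live means neither hard-deleted nor soft-deleted. */
    public static BitSet combinedLiveDocs(BitSet hardLiveDocs, BitSet softDeleted) {
        BitSet live = (BitSet) hardLiveDocs.clone();
        live.andNot(softDeleted);
        return live;
    }

    /** Documents that would be carried over as deletes, given a live-docs view. */
    public static BitSet deletesToCarryOver(BitSet liveDocs, int maxDoc) {
        BitSet deleted = (BitSet) liveDocs.clone();
        deleted.flip(0, maxDoc); // every doc in [0, maxDoc) that is not live
        return deleted;
    }

    public static void main(String[] args) {
        int maxDoc = 4;
        BitSet hardLive = new BitSet(maxDoc);
        hardLive.set(0, 3);            // docs 0..2 are live, doc 3 is hard-deleted
        BitSet softDeleted = new BitSet(maxDoc);
        softDeleted.set(1);            // doc 1 was soft-deleted during the merge

        // Based on the combined view, the soft-deleted doc is carried over too.
        System.out.println(deletesToCarryOver(combinedLiveDocs(hardLive, softDeleted), maxDoc));
        // Based on hard live docs only, just the hard delete is carried over.
        System.out.println(deletesToCarryOver(hardLive, maxDoc));
    }
}
```

In this model, deriving the carried-over deletes from the combined view turns the soft delete on doc 1 into a hard delete in the target segment, while deriving them from the hard live docs alone carries over only doc 3.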