Simon Willnauer created LUCENE-8255:
---------------------------------------
Summary: Can we make index sorting work for soft deletes
Key: LUCENE-8255
URL: https://issues.apache.org/jira/browse/LUCENE-8255
Project: Lucene - Core
Issue Type: Improvement
Reporter: Simon Willnauer

I phrased this as a question since it's mainly a discussion. I spoke to
[~rcmuir] on a couple of occasions about making index sorting work for soft
deletes. The issue that prevents this is that soft deletes use updatable doc
values (DV) to mark docs as deleted. This means that a sorted segment is no
longer guaranteed to be sorted once it has received any updates. It also means
that re-sorting such a segment on merge has a significant overhead. (I hope
[~jimczi] can shed some light on how much overhead we should expect.) We would
also need to add some special casing, since we use "merge sorting" and can't go
backwards in doc ID, an assumption that is violated if a segment has received
updates. (cc [~jpountz])

The main purpose of doing this is to place "soft deleted" documents together at
either the beginning or the end of the segment, so that compression is better
when these docs are kept around under longer retention policies.
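
To make the sortedness problem concrete, here is a minimal sketch in plain Java
that does not use Lucene at all. It assumes a segment can be modelled as an
array of sort values indexed by doc ID, where 0 means live and 1 means soft
deleted. The class and method names (SoftDeleteSortSketch, isSorted,
updateInPlace) are hypothetical and only serve the illustration.

```java
import java.util.Arrays;

public class SoftDeleteSortSketch {

    // Returns true if the sort values are non-decreasing in doc ID order,
    // i.e. the (modelled) segment still honours its index sort.
    static boolean isSorted(long[] valuesByDocId) {
        for (int i = 1; i < valuesByDocId.length; i++) {
            if (valuesByDocId[i - 1] > valuesByDocId[i]) {
                return false;
            }
        }
        return true;
    }

    // Models a doc-values update: the value changes, but the document keeps
    // its doc ID, so nothing is re-sorted. Returns a modified copy.
    static long[] updateInPlace(long[] valuesByDocId, int docId, long newValue) {
        long[] copy = Arrays.copyOf(valuesByDocId, valuesByDocId.length);
        copy[docId] = newValue;
        return copy;
    }

    public static void main(String[] args) {
        // Four live docs, sorted on the soft-delete flag at flush time.
        long[] flushed = {0, 0, 0, 0};
        // Soft-delete doc 1 by updating its flag in place.
        long[] updated = updateInPlace(flushed, 1, 1);

        System.out.println("flushed sorted: " + isSorted(flushed));
        System.out.println("updated sorted: " + isSorted(updated));
    }
}
```

In this model the updated segment would have to be fully re-sorted before a
merge could consume it in doc ID order, which is where the extra overhead
mentioned above would come from.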