Hi Stéphane,

Lucene's segments are made of an immutable "core", on top of which deleted
docs and doc-value updates may be applied. These deleted docs and doc-value
updates may change on every reopen (`DirectoryReader#open(IndexWriter)`)
while the segment's core (inverted index, points, etc.) is guaranteed to be
fully immutable. So if the query cache took deleted docs into account,
cached entries would be invalidated more frequently: at the next reopen
that brings new deletions to the segment, rather than at the next merge
that merges this segment away.
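
To make the difference concrete, here is a toy model in plain Java. It is
not Lucene code: `CacheKeySketch`, `CountingCache` and `missesAcrossReopens`
are made-up names, and `java.util.BitSet` stands in for cached doc ID sets.
It only counts how often an entry has to be recomputed when the cache key
is the segment core alone vs. the core plus the current set of deletions.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Toy model, not Lucene code: every name below is hypothetical.
final class CacheKeySketch {

    // A cache that counts how often it has to recompute an entry.
    static final class CountingCache {
        final Map<Object, BitSet> entries = new HashMap<>();
        int misses = 0;

        BitSet get(Object key, Supplier<BitSet> compute) {
            BitSet hit = entries.get(key);
            if (hit == null) {
                misses++;
                hit = compute.get();
                entries.put(key, hit);
            }
            return hit;
        }
    }

    // Simulates `reopens` reopens of one segment, each bringing one new
    // deletion, and returns how many times the cached entry was recomputed.
    static int missesAcrossReopens(int reopens, boolean keyIncludesDeletes) {
        Object coreKey = new Object();      // identity of the immutable core
        BitSet coreMatches = new BitSet();  // matches computed without deletes
        coreMatches.set(0, 100);
        BitSet deleted = new BitSet();
        CountingCache cache = new CountingCache();

        for (int gen = 0; gen < reopens; gen++) {
            deleted.set(gen);               // this reopen deletes one more doc
            Object key = keyIncludesDeletes
                ? List.of(coreKey, deleted.clone()) // changes on every reopen
                : coreKey;                          // stable until a merge
            cache.get(key, () -> (BitSet) coreMatches.clone());
        }
        return cache.misses;
    }
}
```

In this model, `missesAcrossReopens(5, false)` computes the entry once,
while `missesAcrossReopens(5, true)` recomputes it on each of the 5 reopens.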

One could argue that applying deletes when building cache entries is fine
because deletes may only accrue over time: a doc that is deleted now stays
deleted in every later view. But this is a bit dangerous, as nothing
prevents users from running a search on an older point-in-time view of the
index after a refresh occurs, where some of these docs are still live, and
the benefits would likely not be worth the extra trappiness. So we're
sticking to the simplest approach: the query cache
completely ignores deleted docs, and deleted docs need to be applied on top
of the cache, typically by a `BulkScorer`.
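
As an illustration of the point-in-time problem, here is a small sketch in
plain Java. Again, this is not Lucene code: `LiveDocsOnTopSketch`,
`applyLiveDocs` and `bits` are hypothetical names, and `java.util.BitSet`
stands in for both the cached entry and the live docs. It assumes a set bit
in `liveDocs` means the doc is live in that view.

```java
import java.util.BitSet;

// Toy model, not Lucene code: every name below is hypothetical.
final class LiveDocsOnTopSketch {

    // Intersects a deletes-agnostic cached match set with the live docs of
    // one particular point-in-time view.
    static BitSet applyLiveDocs(BitSet cachedMatches, BitSet liveDocs) {
        BitSet result = (BitSet) cachedMatches.clone(); // keep the shared entry intact
        result.and(liveDocs);
        return result;
    }

    // Convenience helper to build a bit set from doc IDs.
    static BitSet bits(int... docs) {
        BitSet set = new BitSet();
        for (int doc : docs) {
            set.set(doc);
        }
        return set;
    }
}
```

Say the cached entry is `bits(0, 1, 2, 3)`, an older view has all four docs
live, and a newer view has doc 2 deleted. Applying each view's live docs on
top gives `{0, 1, 2, 3}` for the older view and `{0, 1, 3}` for the newer
one, so both are correct. Had the entry been built as `{0, 1, 3}` with the
newer view's deletes baked in, the older view would silently lose doc 2.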


On Thu, Jul 24, 2025 at 11:28 AM Stéphane Campinas <
stephane.campi...@gmail.com> wrote:

> Hello,
>
> In the bitset cache implementations (in LRUQueryCache), I am wondering
> why livedocs aren't passed to the `score` methods used to create
> bitsets. See [0] for an example.
> My understanding would be that passing livedocs would benefit the
> creation of the cached bitset:
>
> - faster search since some docs are deleted
> - potentially smaller memory usage, e.g., with the roaring bitset
> - potentially smaller cost of the `FixedBitSet`
>
> I have checked the mailing list archive, as well as the commit history,
> but found no explanation. Could you explain the reason why livedocs
> aren't passed to the `score` method?
>
> Best,
>
> [0]
> https://github.com/apache/lucene/blob/7fe43de9f6678b41d8894258e1f99a9f38e87689/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java#L539-L553
> --
> Campinas Stephane
>


-- 
Adrien
