kaivalnp commented on issue #15379:
URL: https://github.com/apache/lucene/issues/15379#issuecomment-3733582514

   Caching results for `Knn[Byte|Float]VectorQuery` is tricky: the query's 
contract is to find the K highest-scoring hits across the whole index, but 
results are cached per segment.
   
   With document deletes / updates, the segment-level results of a query can 
change. For example, if a document is deleted, every cached segment-level 
result that contains it is invalidated, because the query now needs the next 
highest-scoring doc to take its place.
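
   A toy illustration of the delete case (not Lucene code): segments are 
modeled as plain dicts of precomputed query scores, and `global_top_k` and 
`per_segment` are hypothetical helpers invented for this sketch.

```python
import heapq

def global_top_k(segments, k, deleted=frozenset()):
    """Return the k highest-scoring live (segment, doc) pairs in the index."""
    live = (
        (score, seg, doc)
        for seg, docs in segments.items()
        for doc, score in docs.items()
        if (seg, doc) not in deleted
    )
    return {(seg, doc) for score, seg, doc in heapq.nlargest(k, live)}

def per_segment(hits):
    """Group index-level hits by segment, as a per-segment cache would store them."""
    grouped = {}
    for seg, doc in hits:
        grouped.setdefault(seg, set()).add(doc)
    return grouped

# Hypothetical similarity scores of each doc for one fixed query.
segments = {
    "seg0": {"a": 0.9, "b": 0.8, "c": 0.3},
    "seg1": {"d": 0.7, "e": 0.6},
}

before = per_segment(global_top_k(segments, k=3))
after = per_segment(global_top_k(segments, k=3, deleted={("seg0", "b")}))

print(before)  # seg0 holds a and b, seg1 holds d
print(after)   # seg0 holds only a, seg1 now holds d and e
```

   In this toy model the entry cached for `seg0` is stale because it still 
lists `b`, and the replacement hit `e` comes from a different segment.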
   
   Even without deletes, if a new segment indexes a vector that is closer to the 
query than the current lowest-scoring hit, then the cached segment-level result 
containing that hit is no longer valid.
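
   A small sketch of the new-segment case, under the same kind of toy model 
(dicts of made-up scores, not Lucene's API). `top_k_by_segment` is a 
hypothetical helper for illustration only.

```python
def top_k_by_segment(segments, k):
    """Compute the index-level top k, then report which docs each segment contributes."""
    scored = [
        (score, seg, doc)
        for seg, docs in segments.items()
        for doc, score in docs.items()
    ]
    scored.sort(reverse=True)
    hits = {seg: set() for seg in segments}
    for score, seg, doc in scored[:k]:
        hits[seg].add(doc)
    return hits

# Hypothetical scores for one fixed query, before a new segment is flushed.
old_segments = {"seg0": {"a": 0.9, "c": 0.5}, "seg1": {"d": 0.7}}
cached = top_k_by_segment(old_segments, k=2)

# A new segment arrives whose vector scores higher than the current 2nd-best hit.
new_segments = dict(old_segments, seg2={"f": 0.8})
fresh = top_k_by_segment(new_segments, k=2)

print(cached["seg1"], fresh["seg1"])  # seg1 contributed d before, nothing after
```

   Nothing inside `seg1` changed, yet its cached contribution is wrong once 
`seg2` exists.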
   
   At the very least, we'd need to change how cached results are used, so that 
results can be re-computed when some segment-level result is invalidated.
   
   The underlying problem with KNN queries is that whether a document is a hit 
cannot be determined _independently_ of the other documents in the index.
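
   To make the contrast concrete, here is a hedged toy comparison between a 
per-document predicate (a stand-in for a filter-style query) and top-K 
membership. `filter_hits` and `knn_hits` are hypothetical names, and the scores 
are made up.

```python
def filter_hits(docs, threshold):
    """Per-document predicate: each doc is judged using only its own value."""
    return {doc for doc, score in docs.items() if score >= threshold}

def knn_hits(all_segments, segment, k):
    """Docs of `segment` that are in the index-level top k; needs every segment's scores."""
    ranked = sorted(
        (
            (score, seg, doc)
            for seg, docs in all_segments.items()
            for doc, score in docs.items()
        ),
        reverse=True,
    )[:k]
    return {doc for score, seg, doc in ranked if seg == segment}

seg0 = {"a": 0.9, "b": 0.6}
index_v1 = {"seg0": seg0}
index_v2 = {"seg0": seg0, "seg1": {"c": 0.8}}

# The filter result for seg0 takes only seg0 as input, so it can be cached per segment.
filter_result = filter_hits(seg0, threshold=0.5)

# The KNN result for seg0 changes when an unrelated segment is added.
knn_v1 = knn_hits(index_v1, "seg0", k=2)
knn_v2 = knn_hits(index_v2, "seg0", k=2)
```

   The filter function never sees the rest of the index, while `knn_hits` 
cannot be evaluated without it, which is the dependency that breaks a purely 
segment-level cache.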
   
   > tweaking it for knn queries helps in caching those?
   
   I don't think this will work: KNN queries are marked as "not cacheable" 
for the reasons above.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

