Re: [PR] Add validation to prevent zero vectors in KNN fields [lucene]

via GitHub Wed, 28 Jan 2026 04:02:17 -0800


benwtrent commented on PR #15621:
URL: https://github.com/apache/lucene/pull/15621#issuecomment-3810900302


   > They'll always return 0 values for dot product and Max inner product, and 
search won't really be able to differentiate based on those similarity scores.
   
   Technically, this is the same problem that vectors have in general: two 
different vectors can return the same score. I don't think this is a good 
reason to reject zero vectors.
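
   A minimal sketch of the point above, in plain Java rather than Lucene's own code (the class and method names are made up for illustration): a zero vector and a non-zero vector orthogonal to the query both score 0 under dot product, and two unrelated non-zero vectors can tie as well.

```java
// Illustrative only: a hand-rolled dot product, not Lucene's implementation.
final class DotProductTies {
  // Plain dot product of two equal-length vectors.
  static float dot(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("dimension mismatch");
    }
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] query = {1f, 0f};
    float[] zero = {0f, 0f};        // the zero vector
    float[] orthogonal = {0f, 7f};  // non-zero, but orthogonal to the query
    float[] a = {2f, 5f};
    float[] b = {2f, -3f};

    System.out.println(dot(query, zero));        // 0.0
    System.out.println(dot(query, orthogonal));  // 0.0, same score as the zero vector
    System.out.println(dot(query, a));           // 2.0
    System.out.println(dot(query, b));           // 2.0, a tie between two non-zero vectors
  }
}
```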
   
   Additionally, identical scores are possible even in term-based search.
   
   > It will silently affect scores, graph geometry and result set, which feels 
trappy?
   
   I am not sure it will do so silently. Again, users do all sorts of things to 
test, and it seems plausible to me that zero vectors will show up.
   
   > Are there meaningful scenarios where we want to allow zero vectors?
   
   I don't know. But `zero` vectors feel the same as any "uniform" component 
vector (e.g. all `1` or all `1.23`).
   
   Preventing this only for cosine keeps the system sane, since cosine divides 
by the vector magnitudes and is undefined for a zero vector.
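
   To illustrate why cosine is the special case, here is a small sketch in plain Java (a hand-written cosine with hypothetical names, not Lucene's implementation). With a zero vector the computation becomes 0 / 0 and yields NaN, whereas a "uniform" vector such as all `1` produces an ordinary finite score.

```java
// Illustrative only: a hand-rolled cosine similarity, not Lucene's implementation.
final class CosineZeroVector {
  // Cosine similarity: dot(a, b) / (|a| * |b|).
  static float cosine(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("dimension mismatch");
    }
    float dot = 0f;
    float normA = 0f;
    float normB = 0f;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      normA += a[i] * a[i];
      normB += b[i] * b[i];
    }
    // If either vector is all zeros, this is 0.0 / 0.0, which is NaN in IEEE 754.
    return (float) (dot / Math.sqrt((double) normA * normB));
  }

  public static void main(String[] args) {
    float[] query = {1f, 0f};
    float[] zero = {0f, 0f};
    float[] uniform = {1f, 1f};

    System.out.println(Float.isNaN(cosine(query, zero)));  // true: no meaningful score
    System.out.println(cosine(query, uniform));            // finite, roughly 0.707
  }
}
```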


