Re: [PR] Two bit quantization for dense vectors [lucene]

via GitHub Tue, 13 Jan 2026 10:40:11 -0800


mccullocht commented on PR #15564:
URL: https://github.com/apache/lucene/pull/15564#issuecomment-3745813145


   The exploration reduction is really cool; it makes the seemingly linear 
increase in distance computation cost sublinear in practice.
   
   For increasing query bits you could always go to `float` :). The delta and 
lower vector terms can be factored out of the computation and applied at the 
end as corrections. You do need to unpack the vector to 32-bit integers and 
convert to float. Theoretically `expand()` or `shuffle()` should be sufficient 
for arranging this, but I have no idea if the vector incubator API will 
generate good code for these (in particular, `expand()` has no direct 
intrinsic on ARM). OSQ is also well arranged to perform integer dot products 
between vectors of different bit widths, so we could theoretically quantize 
the query to, say, 16 bits every time.
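
   To make the "factor out and correct at the end" idea concrete, here is a 
minimal sketch. It assumes each quantized vector dequantizes as 
`lower + delta * code[i]` with a single `lower` and `delta` per vector, and it 
ignores centering and any other OSQ corrections. The class and method names 
are made up for illustration and are not Lucene's API; codes are stored one 
per `int` rather than bit-packed.

```java
/** Hypothetical illustration only; not Lucene code. */
final class CorrectedDotSketch {

  /** Reference: expand codes to floats as lower + delta * code. */
  static float[] dequantize(int[] codes, float lower, float delta) {
    float[] out = new float[codes.length];
    for (int i = 0; i < codes.length; i++) {
      out[i] = lower + delta * codes[i];
    }
    return out;
  }

  /** Reference: plain float dot product. */
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /**
   * Float query against quantized doc codes. Since
   * sum(q[i] * (lower + delta * c[i])) = lower * sum(q) + delta * sum(q[i] * c[i]),
   * the inner loop only touches raw codes; lower and delta are applied once.
   */
  static float floatQueryDot(float[] query, int[] docCodes, float docLower, float docDelta) {
    float rawDot = 0f;
    float querySum = 0f; // could be precomputed once per query
    for (int i = 0; i < query.length; i++) {
      rawDot += query[i] * docCodes[i];
      querySum += query[i];
    }
    return docLower * querySum + docDelta * rawDot;
  }

  /**
   * Both sides quantized, possibly at different bit widths (e.g. 16-bit query
   * codes, 2-bit doc codes). The inner loop is a pure integer dot product; the
   * four correction terms come from expanding
   * (qLower + qDelta * q[i]) * (dLower + dDelta * d[i]) and summing over i.
   */
  static float intQueryDot(
      int[] queryCodes, float qLower, float qDelta,
      int[] docCodes, float dLower, float dDelta) {
    long codeDot = 0;
    long qSum = 0; // could be stored per query
    long dSum = 0; // could be stored per doc vector at index time
    for (int i = 0; i < queryCodes.length; i++) {
      codeDot += (long) queryCodes[i] * docCodes[i];
      qSum += queryCodes[i];
      dSum += docCodes[i];
    }
    int n = queryCodes.length;
    double result =
        (double) n * qLower * dLower
            + (double) qLower * dDelta * dSum
            + (double) dLower * qDelta * qSum
            + (double) qDelta * dDelta * codeDot;
    return (float) result;
  }
}
```

   In both variants the per-dimension work is independent of `lower` and 
`delta`, which is why the query side can change precision without changing 
how the document vectors are stored.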
   
   AFAICT the vector incubator API is also missing direct 8-bit dot product 
instructions that would generally be faster for this (UDOT and SDOT on ARM, 
VNNI instructions like vpdpbusds on x86).
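
   For readers unfamiliar with these instructions, the scalar sketch below 
shows roughly what one 32-bit lane of an SDOT-style operation computes: four 
signed 8-bit products summed into a 32-bit accumulator. It only illustrates 
the arithmetic. The names are hypothetical, it is not how the vector API or 
Lucene expresses this, and vpdpbusds differs in detail (one operand is 
treated as unsigned and the accumulation saturates).

```java
/** Hypothetical scalar model of a 4-way 8-bit dot-product-accumulate. */
final class ByteDotSketch {

  /** One lane: acc += a[o]*b[o] + a[o+1]*b[o+1] + a[o+2]*b[o+2] + a[o+3]*b[o+3]. */
  static int dot4Accumulate(int acc, byte[] a, byte[] b, int offset) {
    for (int k = 0; k < 4; k++) {
      // bytes are sign-extended to int before multiplying
      acc += a[offset + k] * b[offset + k];
    }
    return acc;
  }

  /** Full dot product built from the 4-way step; length must be a multiple of 4. */
  static int dot(byte[] a, byte[] b) {
    if (a.length != b.length || a.length % 4 != 0) {
      throw new IllegalArgumentException("lengths must match and be a multiple of 4");
    }
    int acc = 0;
    for (int i = 0; i < a.length; i += 4) {
      acc = dot4Accumulate(acc, a, b, i);
    }
    return acc;
  }
}
```

   A hardware instruction performs this step for every 32-bit lane of a 
register at once, without first widening each byte to 16 or 32 bits in 
separate instructions.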


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


