benwtrent commented on PR #15549:
URL: https://github.com/apache/lucene/pull/15549#issuecomment-3711735284

   @Pulkitg64 the latency is the main concern IMO. We must copy the vectors 
onto the heap (which we know is expensive), transform the bytes to `float32` 
(an additional cost), and then do the `float32` Panama vector operations (which 
are super fast). I would expect this to also impact quantized query time for 
anything that must rescore (though likely with less of an impact, as that would 
be fewer vectors to decode).
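
   A minimal sketch of the decode-then-score path described above (the class and 
method names here are hypothetical, not Lucene's actual code; 
`Float.float16ToFloat` / `Float.floatToFloat16` are real JDK 20+ APIs):

   ```java
   // Hypothetical sketch: decoding little-endian float16 bytes into float32
   // before scoring. This is the "extra cost" step: every stored vector must
   // be widened to float32 before the fast float32 SIMD kernels can run.
   class Float16DecodeSketch {
       // Decode len float16 values (2 bytes each, little-endian) into dst.
       static void decodeFloat16(byte[] src, float[] dst, int len) {
           for (int i = 0; i < len; i++) {
               short bits = (short) ((src[2 * i] & 0xFF) | (src[2 * i + 1] << 8));
               dst[i] = Float.float16ToFloat(bits); // JDK 20+ intrinsic
           }
       }

       // Plain float32 dot product; the real scoring would use Panama vectors.
       static float dot(float[] a, float[] b) {
           float sum = 0f;
           for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
           return sum;
       }

       public static void main(String[] args) {
           float[] query = {1f, 2f, 3f};
           // Encode {0.5, 0.25, 1.0} as float16 bytes for the demo; all three
           // values are exactly representable in float16.
           float[] doc = {0.5f, 0.25f, 1f};
           byte[] encoded = new byte[doc.length * 2];
           for (int i = 0; i < doc.length; i++) {
               short h = Float.floatToFloat16(doc[i]);
               encoded[2 * i] = (byte) h;
               encoded[2 * i + 1] = (byte) (h >>> 8);
           }
           float[] decoded = new float[doc.length];
           decodeFloat16(encoded, decoded, doc.length);
           System.out.println(dot(query, decoded)); // 1*0.5 + 2*0.25 + 3*1 = 4.0
       }
   }
   ```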
   
   I wonder whether all of the cost is spent just decoding the vectors. What 
does a flame graph tell you?
   
   Also, could you indicate your JVM version, etc.?
   
   See this interesting JEP update on the ever-incubating Vector API: 
   
   https://openjdk.org/jeps/508
   
   > Addition, subtraction, division, multiplication, square root, and fused 
multiply/add operations on Float16 values are now auto-vectorized on supporting 
x64 CPUs.
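
   For illustration, a hedged sketch of the kind of scalar `Float16` loop the 
JEP is describing (class and method names are made up; whether a given loop 
actually auto-vectorizes depends on the JVM version, flags, and CPU):

   ```java
   // Hypothetical sketch of a fused multiply/add over float16 values, the
   // pattern JEP 508 says HotSpot can now auto-vectorize on supporting x64
   // CPUs. Float16 values are carried as short bit patterns.
   class Float16FmaSketch {
       // c[i] = a[i] * b[i] + c[i], computed via float32 FMA and narrowed back.
       static void fma(short[] a, short[] b, short[] c) {
           for (int i = 0; i < a.length; i++) {
               float r = Math.fma(Float.float16ToFloat(a[i]),
                                  Float.float16ToFloat(b[i]),
                                  Float.float16ToFloat(c[i]));
               c[i] = Float.floatToFloat16(r);
           }
       }

       public static void main(String[] args) {
           short[] a = {Float.floatToFloat16(2f)};
           short[] b = {Float.floatToFloat16(3f)};
           short[] c = {Float.floatToFloat16(1f)};
           fma(a, b, c);
           System.out.println(Float.float16ToFloat(c[0])); // 2*3 + 1 = 7.0
       }
   }
   ```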


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

