shubhamvishu commented on PR #13572: URL: https://github.com/apache/lucene/pull/13572#issuecomment-3647679954
Hi, here at Amazon (customer-facing product search) we've been testing this native dot product implementation in our production environment (ARM, Graviton 2 and 3). In JMH benchmarks we see **5-14x** faster dot product computations, and we observed semantic search latency improving from **62 msec** to **28 msec** (avg) for 4K embeddings (4.5 MM). Overall we saw a **10-60%** improvement in end-to-end average search latencies across different scenarios (different-sized vectors, vector-focused search vs. search combined with other workloads). We haven't tested other CPU types yet.

I'm working on a draft PR on top of this one with the following changes and plan to raise it soon:

- Removing the heap to off-heap copying overhead by utilizing `Linker.Option.critical`, which eliminates unnecessary copies (see the sketch at the end of this comment)
- Runtime dispatch using `IFUNC` to choose the SVE, NEON, or scalar implementation at runtime based on the available intrinsics
- Build-related changes to generate the native binary

We kept the native code isolated in the misc package rather than pulling it into the core module, since we know adding native code to core is highly discouraged. Additionally, PR #15285 would later help eliminate some code duplication and enable a cleaner implementation similar to `PanamaVectorUtilSupport` - potentially through a `NativeVectorUtilSupport` class?

Our benchmarking suggests substantial optimization potential for ARM-based deployments, and we believe this could benefit the broader Lucene community. Ideally, we hope to make it easy for any Lucene user to opt in to this alternative vector implementation. We're committed to refining this implementation based on community feedback and addressing any concerns during the review process. I'm eager to hear the community's thoughts on this change. Thank you!
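For reference, here is a minimal sketch of what the `Linker.Option.critical` binding could look like on the Java side (Java 22+ FFM API). The library name `dotProduct` and the native symbol `dot_f32` are hypothetical placeholders, not names from this PR; the actual draft may differ.

```java
import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;

public final class NativeDotProduct {
  // Hypothetical native symbol: float dot_f32(const float *a, const float *b, int n);
  private static final MethodHandle DOT;

  static {
    Linker linker = Linker.nativeLinker();
    // Hypothetical library name; in practice this would be the binary produced by the build changes.
    SymbolLookup lookup = SymbolLookup.libraryLookup("dotProduct", Arena.global());
    DOT = linker.downcallHandle(
        lookup.find("dot_f32").orElseThrow(),
        FunctionDescriptor.of(ValueLayout.JAVA_FLOAT,
            ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.JAVA_INT),
        // critical(true) lets heap segments be passed directly to the native call,
        // avoiding the heap to off-heap copy; the call must be short and non-blocking.
        Linker.Option.critical(true));
  }

  static float dotProduct(float[] a, float[] b) {
    try {
      // Heap-backed segments are accepted here because of Linker.Option.critical(true).
      return (float) DOT.invokeExact(
          MemorySegment.ofArray(a), MemorySegment.ofArray(b), a.length);
    } catch (Throwable t) {
      throw new RuntimeException(t);
    }
  }
}
```

On the native side, the `IFUNC` resolver would then pick the SVE, NEON, or scalar implementation of the symbol at load time, so the Java binding stays the same across CPU variants.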
