Pulkitg64 opened a new pull request, #15549: URL: https://github.com/apache/lucene/pull/15549
### Description

This draft PR explores storing float vectors in 2 bytes per component (half-float, FP16) instead of 4 bytes (FP32), which reduces vector disk usage by approximately 50%. Vectors are stored on disk in half-float format and converted back to full-float precision for dot-product computations during search and index merge operations. That conversion adds overhead to every vector read, so both indexing and search are slower. This is an early draft to gather community feedback on the viability and direction of this implementation.

ToDo:

* Support for MemorySegmentVectorScorer with half-float vectors is yet to be implemented.

### Benchmark Results

Figures below are computed from the table:

* No quantization: latency more than doubles (11.392 ms vs 4.942 ms, about a 130% increase).
* 8-bit quantization: no latency regression (4.337 ms vs 4.367 ms).
* 4-bit quantization: about a 14% latency regression (6.069 ms vs 5.343 ms).
* Indexing rate drops by roughly 17-24% across all quantization settings.

| Encoding | recall | latency (ms) | quantized | index (s) | index_docs/s | index_size (MB) | vec_disk (MB) | vec_RAM (MB) |
|----------|--------|--------------|-----------|-----------|--------------|-----------------|---------------|--------------|
| float16  | 0.991  | 11.392       | no        | 34.8      | 2873.81      | 206.22          | 390.625       | 390.625      |
| float16  | 0.981  | 4.337        | 8 bits    | 41.55     | 2406.97      | 305.4           | 294.495       | 99.182       |
| float16  | 0.926  | 6.069        | 4 bits    | 42.07     | 2376.93      | 256.58          | 245.667       | 50.354       |
| float32  | 0.991  | 4.942        | no        | 28.93     | 3456.38      | 401.53          | 390.625       | 390.625      |
| float32  | 0.981  | 4.367        | 8 bits    | 32.04     | 3121.49      | 500.71          | 489.807       | 99.182       |
| float32  | 0.926  | 5.343        | 4 bits    | 32.12     | 3113.33      | 451.91          | 440.979       | 50.354       |
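
For readers who want a concrete picture of the store-as-FP16, score-as-FP32 idea in the description, here is a minimal sketch. It is not the code from this PR: the class `HalfFloatSketch` and its method names are hypothetical, the little-endian byte layout is an assumption, and it relies on the JDK's `Float.floatToFloat16` / `Float.float16ToFloat` (Java 20 or newer) rather than any Lucene API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical illustration only; not taken from the PR. Requires Java 20+. */
final class HalfFloatSketch {

  /** Encodes each component as IEEE 754 binary16: 2 bytes instead of 4. */
  static byte[] encode(float[] vector) {
    ByteBuffer buf =
        ByteBuffer.allocate(vector.length * Short.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    for (float v : vector) {
      // Rounds to the nearest representable half-float, so precision may be lost here.
      buf.putShort(Float.floatToFloat16(v));
    }
    return buf.array();
  }

  /** Widens the stored half-floats back to FP32. This is the extra per-read work. */
  static float[] decode(byte[] stored) {
    ByteBuffer buf = ByteBuffer.wrap(stored).order(ByteOrder.LITTLE_ENDIAN);
    float[] vector = new float[stored.length / Short.BYTES];
    for (int i = 0; i < vector.length; i++) {
      vector[i] = Float.float16ToFloat(buf.getShort());
    }
    return vector;
  }

  /** Scores an FP32 query against an FP16-encoded vector by decoding first. */
  static float dotProduct(float[] query, byte[] stored) {
    float[] decoded = decode(stored);
    float sum = 0f;
    for (int i = 0; i < query.length; i++) {
      sum += query[i] * decoded[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] vector = {0.5f, -1.0f, 0.25f, 2.0f};
    byte[] stored = encode(vector);
    System.out.println("FP32 bytes: " + vector.length * Float.BYTES);
    System.out.println("FP16 bytes: " + stored.length);
    System.out.println("dot product: " + dotProduct(new float[] {1f, 2f, 3f, 4f}, stored));
  }
}
```

The sketch decodes the whole vector before scoring, which is the simplest form of the conversion step and makes the added read cost easy to see. A real scorer would more likely decode and accumulate in one pass, or use a vectorized path, to avoid the intermediate array.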
