benwtrent opened a new pull request, #15564:
URL: https://github.com/apache/lucene/pull/15564
This adds 2-bit quantization as an option for the scalar quantizer. For HNSW,
this gives a nice recall improvement over 1-bit with little latency increase.
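For readers who want a concrete picture of what "2-bit" means, here is a minimal sketch of scalar quantization to four levels, with four codes packed per byte. It is not the code in this PR: the class name, the method names, and the fixed `min`/`max` interval are assumptions made for illustration, and the PR's actual quantizer and on-disk layout may differ.

```java
// Hypothetical illustration only; not taken from the PR or from Lucene.
final class TwoBitSketch {
    /** Maps each component to one of four levels in [min, max] and packs four 2-bit codes per byte. */
    static byte[] quantize(float[] v, float min, float max) {
        byte[] packed = new byte[(v.length + 3) / 4]; // 4 codes per byte, rounded up
        float step = (max - min) / 3f;                // 4 levels span 3 intervals
        for (int i = 0; i < v.length; i++) {
            float clamped = Math.max(min, Math.min(max, v[i]));
            int code = step == 0f ? 0 : Math.round((clamped - min) / step); // 0..3
            packed[i >> 2] |= (byte) (code << ((i & 3) * 2));
        }
        return packed;
    }

    /** Reads back the 2-bit code stored for component i. */
    static int code(byte[] packed, int i) {
        return (packed[i >> 2] >> ((i & 3) * 2)) & 0x3;
    }

    /** Approximate reconstruction of component i from its code. */
    static float dequantize(byte[] packed, int i, float min, float max) {
        return min + code(packed, i) * (max - min) / 3f;
    }
}
```

With this layout a vector of `d` components takes `ceil(d / 4)` bytes of codes, which is where the size difference against the 1-bit and 4-bit options comes from.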
Full disclosure, I used this work as an experiment to try and get reps with
opencode. So, yeah, it's agentic. I did go through and review it, then benchmarked
with Lucene util (a PR to add 2-bit support there will follow soonish).
The agent added way too many comments (so annoying), and didn't correctly
reuse code (I had to move things around, or this would have been many hundreds more
lines of churn). But, I was pleasantly surprised by how well it did. Goes to show
that not only are these things getting pretty good, but also that the format
API was well designed.
Lucene Util benchmarks (no rescoring)
```
recall  latency(ms)  netCPU  avgCpuCount  fanout  quantized  visited  index(s)  index_docs/s  force_merge(s)  index_size(MB)  vec_disk(MB)  vec_RAM(MB)
 0.503        0.194   0.181        0.933       0     1 bits     1151    258.04       3875.32          189.18         3136.53      3034.592      104.904
 0.537        0.201   0.195        0.970       5     1 bits     1446    258.04       3875.32          189.18         3136.53      3034.592      104.904
 0.558        0.239   0.233        0.975      10     1 bits     1728    258.04       3875.32          189.18         3136.53      3034.592      104.904
 0.586        0.318   0.312        0.981      25     1 bits     2494    258.04       3875.32          189.18         3136.53      3034.592      104.904
 0.601        0.444   0.438        0.986      50     1 bits     3701    258.04       3875.32          189.18         3136.53      3034.592      104.904
 0.607        0.697   0.689        0.989     100     1 bits     5929    258.04       3875.32          189.18         3136.53      3034.592      104.904
 0.594        0.190   0.183        0.963       0     2 bits     1056    262.08       3815.67          227.70         3221.32      3126.144      196.457
 0.642        0.229   0.223        0.974       5     2 bits     1328    262.08       3815.67          227.70         3221.32      3126.144      196.457
 0.669        0.265   0.259        0.977      10     2 bits     1582    262.08       3815.67          227.70         3221.32      3126.144      196.457
 0.709        0.367   0.361        0.984      25     2 bits     2306    262.08       3815.67          227.70         3221.32      3126.144      196.457
 0.730        0.519   0.512        0.987      50     2 bits     3424    262.08       3815.67          227.70         3221.32      3126.144      196.457
 0.744        0.801   0.794        0.991     100     2 bits     5481    262.08       3815.67          227.70         3221.32      3126.144      196.457
 0.664        0.223   0.216        0.969       0     4 bits     1000    273.00       3663.06          130.39         3402.28      3311.157      381.470
 0.730        0.261   0.255        0.977       5     4 bits     1257    273.00       3663.06          130.39         3402.28      3311.157      381.470
 0.765        0.304   0.296        0.974      10     4 bits     1494    273.00       3663.06          130.39         3402.28      3311.157      381.470
 0.824        0.426   0.418        0.981      25     4 bits     2165    273.00       3663.06          130.39         3402.28      3311.157      381.470
 0.859        0.608   0.602        0.990      50     4 bits     3228    273.00       3663.06          130.39         3402.28      3311.157      381.470
 0.883        0.966   0.958        0.992     100     4 bits     5193    273.00       3663.06          130.39         3402.28      3311.157      381.470
 0.681        0.322   0.309        0.960       0     7 bits      974    254.66       3926.84          223.75         3765.53      3677.368      747.681
 0.753        0.378   0.372        0.984       5     7 bits     1215    254.66       3926.84          223.75         3765.53      3677.368      747.681
 0.798        0.423   0.417        0.986      10     7 bits     1442    254.66       3926.84          223.75         3765.53      3677.368      747.681
 0.873        0.606   0.600        0.990      25     7 bits     2096    254.66       3926.84          223.75         3765.53      3677.368      747.681
 0.921        0.878   0.870        0.991      50     7 bits     3121    254.66       3926.84          223.75         3765.53      3677.368      747.681
 0.953        1.427   1.419        0.994     100     7 bits     5008    254.66       3926.84          223.75         3765.53      3677.368      747.681
 0.694        0.302   0.296        0.980       0     8 bits      993    298.80       3346.74          220.72         3765.59      3677.368      747.681
 0.763        0.383   0.377        0.984       5     8 bits     1237    298.80       3346.74          220.72         3765.59      3677.368      747.681
 0.808        0.442   0.436        0.986      10     8 bits     1466    298.80       3346.74          220.72         3765.59      3677.368      747.681
 0.877        0.601   0.594        0.988      25     8 bits     2108    298.80       3346.74          220.72         3765.59      3677.368      747.681
 0.927        0.891   0.883        0.991      50     8 bits     3123    298.80       3346.74          220.72         3765.59      3677.368      747.681
 0.962        1.410   1.404        0.996     100     8 bits     5019    298.80       3346.74          220.72         3765.59      3677.368      747.681
 0.691        0.212   0.205        0.967       0         no     1011    360.34       2775.15          258.21         3021.60      2929.688     2929.688
 0.767        0.263   0.257        0.977       5         no     1265    360.34       2775.15          258.21         3021.60      2929.688     2929.688
 0.813        0.312   0.306        0.981      10         no     1503    360.34       2775.15          258.21         3021.60      2929.688     2929.688
 0.889        0.451   0.443        0.982      25         no     2175    360.34       2775.15          258.21         3021.60      2929.688     2929.688
 0.938        0.682   0.675        0.990      50         no     3212    360.34       2775.15          258.21         3021.60      2929.688     2929.688
 0.969        1.112   1.105        0.994     100         no     5177    360.34       2775.15          258.21         3021.60      2929.688     2929.688
```
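To put the `vec_RAM(MB)` column in perspective, the sketch below divides the unquantized figure by each quantized one. The numbers are copied from the table above; the class and method names are hypothetical and only there to make the arithmetic runnable.

```java
// Hypothetical helper; the inputs are the vec_RAM(MB) values from the benchmark table.
final class VecRamRatio {
    /** How many times smaller the quantized vectors are than the raw ones in RAM. */
    static double ratio(double rawMb, double quantizedMb) {
        return rawMb / quantizedMb;
    }

    public static void main(String[] args) {
        double rawMb = 2929.688; // "no" quantization row
        String[] labels = {"1 bits", "2 bits", "4 bits", "7 bits", "8 bits"};
        double[] ramMb = {104.904, 196.457, 381.470, 747.681, 747.681};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%s: %.1fx smaller in RAM%n", labels[i], ratio(rawMb, ramMb[i]));
        }
    }
}
```

By this measure 2 bits works out to roughly 15x smaller than the raw vectors in RAM, between roughly 28x for 1 bit and under 8x for 4 bits.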