benwtrent commented on code in PR #15698:
URL: https://github.com/apache/lucene/pull/15698#discussion_r2795362391


##########
lucene/core/src/java/org/apache/lucene/codecs/lucene104/OffHeapScalarQuantizedVectorValues.java:
##########
@@ -177,19 +177,19 @@ public int getVectorByteLength() {
     return vectorValue.length;
   }
 
-  static void packNibbles(byte[] unpacked, byte[] packed) {
+  public static void packNibbles(byte[] unpacked, byte[] packed) {
     assert unpacked.length == packed.length * 2;
     for (int i = 0; i < packed.length; i++) {
-      int x = unpacked[i] << 4 | unpacked[packed.length + i];
+      int x = unpacked[i * 2] << 4 | unpacked[i * 2 + 1];
       packed[i] = (byte) x;
     }
   }

Review Comment:
   I don't understand how this doesn't change the packed bytes. 
   
   The "new" vector scorers would then assume things are in this layout (the 
scorer would just be loaded because of the format name). Consequently, scoring 
on old segments would completely break.
   
   packing `[0-15]` ends up in completely different bytes.
   
   Old way `byte[8] { 8, 25, 42, 59, 76, 93, 110, 127 }`
   
   New way `byte[8] { 1, 35, 69, 103, -119, -85, -51, -17 }`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to