Re: [PR] Close vector search final-phase gaps [pinot]

via GitHub Fri, 10 Apr 2026 23:25:02 -0700


Copilot commented on code in PR #18168:
URL: https://github.com/apache/pinot/pull/18168#discussion_r3067689740



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfFlatVectorIndexReader.java:
##########
@@ -121,6 +131,29 @@ public IvfFlatVectorIndexReader(String column, File 
indexDir, VectorIndexConfig
           distanceFunctionOrdinal, column, allFunctions.length - 1);
       _distanceFunction = allFunctions[distanceFunctionOrdinal];
 
+      if (_indexFormatVersion == IvfFlatVectorIndexCreator.FORMAT_VERSION) {
+        int quantizerTypeOrdinal = in.readInt();
+        VectorQuantizerType[] allQuantizerTypes = VectorQuantizerType.values();
+        Preconditions.checkState(quantizerTypeOrdinal >= 0 && 
quantizerTypeOrdinal < allQuantizerTypes.length,
+            "Invalid quantizer type ordinal %s in IVF_FLAT index for column: 
%s (valid range: 0-%s)",
+            quantizerTypeOrdinal, column, allQuantizerTypes.length - 1);
+        _quantizerType = allQuantizerTypes[quantizerTypeOrdinal];
+
+        int quantizerParamsLength = in.readInt();
+        Preconditions.checkState(quantizerParamsLength >= 0,
+            "Invalid quantizer params length %s in IVF_FLAT index for column: 
%s",
+            quantizerParamsLength, column);
+        byte[] quantizerParams = new byte[quantizerParamsLength];
+        if (quantizerParamsLength > 0) {
+          in.readFully(quantizerParams);
+        }
+        _quantizer = 
VectorQuantizationUtils.createReadQuantizer(_quantizerType, _dimension, 
quantizerParams);
+      } else {
+        // Legacy v1 files contain raw big-endian float vectors only.
+        _quantizerType = VectorQuantizerType.FLAT;
+        _quantizer = null;

Review Comment:
   _quantizer is declared non-null but is set to null for legacy v1 files. 
Since Pinot treats members as non-null by default, this makes the nullability 
contract unclear and increases NPE risk for future edits. Consider marking 
_quantizer as `@Nullable` (and guarding all uses) or using a non-null sentinel 
quantizer implementation for the legacy path.
   ```suggestion
           // Legacy v1 files contain raw big-endian float vectors only. Use a 
FLAT quantizer
           // instead of null so the field remains non-null across all format 
versions.
           _quantizerType = VectorQuantizerType.FLAT;
           _quantizer = 
VectorQuantizationUtils.createReadQuantizer(_quantizerType, _dimension, new 
byte[0]);
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfOnDiskVectorIndexReader.java:
##########
@@ -264,31 +357,74 @@ private void scanInvertedList(int centroidIdx, float[] 
query, int topK,
       docIds[i] = buf.getInt();
     }
 
-    // Reuse a single float[] for reading each document vector to avoid 
per-doc allocation
-    float[] docVector = new float[_dimension];
+    // Reuse per-doc buffers to avoid allocation churn in the hot loop.
+    float[] docVector = _legacyFloatEncoding ? new float[_dimension] : null;
+    byte[] encodedVector = _legacyFloatEncoding ? null : new 
byte[_encodedBytesPerVector];
 
     // Read and score vectors
     for (int i = 0; i < listSize; i++) {
       int docId = docIds[i];
 
-      // Read vector from buffer into the reused array
-      for (int d = 0; d < _dimension; d++) {
-        docVector[d] = buf.getFloat();
-      }
-
-      // Apply pre-filter if provided
+      // In filter-aware mode, skip decoding and distance computation for 
rejected docs.
       if (preFilterBitmap != null && !preFilterBitmap.contains(docId)) {
+        skipStoredVector(buf);
         continue;
       }
 
-      float dist = computeDistance(query, docVector);
-      if (maxHeap.size() < topK) {
-        maxHeap.offer(new ScoredDoc(docId, dist));
-      } else if (dist < maxHeap.peek()._distance) {
-        maxHeap.poll();
-        maxHeap.offer(new ScoredDoc(docId, dist));
+      float dist = readDistanceForCurrentVector(buf, query, docVector, 
encodedVector);
+      offer(maxHeap, docId, dist, topK);
+    }
+  }
+
+  private void scanInvertedListWithinApproximateRadius(int centroidIdx, 
float[] query, float threshold,
+      int maxCandidates, PriorityQueue<ScoredDoc> maxHeap) {
+    int listSize = _listSizes[centroidIdx];
+    if (listSize == 0) {
+      return;
+    }
+
+    long dataOffset = _listOffsets[centroidIdx] + 4;
+    int listBytes = listSize * (Integer.BYTES + _encodedBytesPerVector);
+    ByteBuffer buf = getOrResizeThreadLocalBuffer(listBytes);
+    try {
+      readFully(_channel, buf, dataOffset);
+    } catch (IOException e) {
+      throw new RuntimeException(
+          "Failed to read inverted list for centroid " + centroidIdx + " in 
column: " + _column, e);
+    }
+    buf.flip();
+
+    int[] docIds = new int[listSize];
+    for (int i = 0; i < listSize; i++) {
+      docIds[i] = buf.getInt();
+    }
+
+    float[] docVector = _legacyFloatEncoding ? new float[_dimension] : null;
+    byte[] encodedVector = _legacyFloatEncoding ? null : new 
byte[_encodedBytesPerVector];
+    for (int i = 0; i < listSize; i++) {
+      int docId = docIds[i];
+      float distance = readDistanceForCurrentVector(buf, query, docVector, 
encodedVector);
+      if (distance <= threshold) {
+        offer(maxHeap, docId, distance, maxCandidates);
+      }
+    }
+  }
+
+  protected float readDistanceForCurrentVector(ByteBuffer buf, float[] query, 
float[] docVector,
+      byte[] encodedVector) {
+    if (_legacyFloatEncoding) {
+      for (int d = 0; d < _dimension; d++) {
+        docVector[d] = buf.getFloat();
       }
+      return computeDistance(query, docVector);
     }
+
+    buf.get(encodedVector);
+    return _quantizer.computeDistance(query, encodedVector, _distanceFunction);
+  }

Review Comment:
   In the v2 path, readDistanceForCurrentVector delegates to 
VectorQuantizer.computeDistance() for every visited document. Current 
quantizers (FlatQuantizer/ScalarQuantizer) decode into a freshly allocated 
float[] per call, which can cause major allocation/GC overhead and latency 
spikes in this hot loop (and will impact all newly-built IVF indexes since the 
creator writes v2). Consider changing quantizer distance computation to be 
allocation-free (e.g., distance directly from encoded bytes, or decode into a 
reusable scratch array).



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfFlatVectorIndexReader.java:
##########
@@ -297,18 +377,14 @@ public void close()
    * Internally uses L2 for EUCLIDEAN/L2, cosine for COSINE, negative dot for 
INNER_PRODUCT/DOT_PRODUCT.
    */
   private float computeDistance(float[] a, float[] b) {
-    switch (_distanceFunction) {
-      case EUCLIDEAN:
-      case L2:
-        return (float) VectorFunctions.euclideanDistance(a, b);
-      case COSINE:
-        return (float) VectorFunctions.cosineDistance(a, b);
-      case INNER_PRODUCT:
-      case DOT_PRODUCT:
-        return (float) -VectorFunctions.dotProduct(a, b);
-      default:
-        throw new IllegalArgumentException("Unsupported distance function: " + 
_distanceFunction);
+    return VectorQuantizationUtils.computeDistance(a, b, _distanceFunction);
+  }
+
+  private float getDistanceFromList(int probeIdx, int listOffset, float[] 
query) {
+    if (_indexFormatVersion == 
IvfFlatVectorIndexCreator.LEGACY_FORMAT_VERSION) {
+      return computeDistance(query, _listVectors[probeIdx][listOffset]);
     }
+    return _quantizer.computeDistance(query, 
_listEncodedVectors[probeIdx][listOffset], _distanceFunction);
   }

Review Comment:
   All v2 IVF_FLAT searches go through VectorQuantizer.computeDistance() per 
candidate. Current quantizers (FlatQuantizer and ScalarQuantizer) allocate a 
new float[] during decode for every computeDistance() call, which can create 
severe GC pressure/latency regressions in vector search hot loops (and affects 
even quantizer=FLAT because the creator always writes v2). Please rework the 
distance path to avoid per-candidate allocations (e.g., compute directly from 
encoded bytes, or decode into a reusable scratch buffer).



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/vector/IvfFlatVectorIndexCreator.java:
##########
@@ -74,6 +76,35 @@
  *   offsetToOffsets:        8 bytes (position of the offsets section)
  * </pre>
  *
+ * <h3>File format (version 2 - quantizer-aware)</h3>
+ * <pre>
+ * [Header]
+ *   magic:                  4 bytes (0x49564646 = "IVFF")
+ *   version:                4 bytes (2)
+ *   dimension:              4 bytes
+ *   numVectors:             4 bytes
+ *   nlist:                  4 bytes
+ *   distanceFunctionOrd:    4 bytes
+ *   quantizerTypeOrd:       4 bytes
+ *   quantizerParamsLen:     4 bytes
+ *   quantizerParams:        quantizerParamsLen bytes
+ *
+ * [Centroids Section]
+ *   nlist x dimension x 4 bytes (float32)
+ *
+ * [Inverted Lists Section]
+ *   For each centroid i (0..nlist-1):
+ *     listSize_i:           4 bytes
+ *     docIds_i:             listSize_i x 4 bytes (int32)
+ *     encodedVectors_i:     listSize_i x encodedBytesPerVector bytes
+ *
+ * [Inverted List Offsets]
+ *   nlist x 8 bytes (long offset to start of each inverted list)
+ *
+ * [Footer]
+ *   offsetToOffsets:        8 bytes (position of the offsets section)
+ * </pre>
+ *
  * <p>All multi-byte values are written in big-endian order (Java {@link 
DataOutputStream} default).</p>

Review Comment:
   The file-format Javadoc says all multi-byte values are big-endian 
(DataOutputStream default), but the v2 "encodedVectors" bytes are produced by 
the quantizer. In particular, FlatQuantizer encodes floats in little-endian, so 
v2 IVF_FLAT files become mixed-endian (header/centroids big-endian, vector 
payload little-endian). This is confusing for anyone parsing the file format 
and makes the spec inaccurate. Consider either switching FlatQuantizer (and any 
quantizers that write multi-byte primitives) to big-endian to match the 
container format, or updating the format spec to explicitly define endianness 
per quantizer payload (and ideally keep FLAT payload big-endian for continuity 
with v1).
   ```suggestion
    * <p>All container fields written directly by this class via {@link 
DataOutputStream} use big-endian order:
    * header integers, centroid float values, inverted-list sizes, doc IDs, 
list offsets, and the footer.</p>
    *
    * <p>For version 1, {@code vectors_i} are also written by this class as 
big-endian float32 values. For version 2,
    * {@code encodedVectors_i} are emitted by the configured {@link 
VectorQuantizer}, so their byte order is defined by
    * the quantizer implementation rather than by the container format. In 
particular,
    * {@link VectorQuantizerType#FLAT} currently writes little-endian float 
payload bytes.</p>
   ```



##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/vector/IvfOnDiskVectorIndexReader.java:
##########
@@ -132,10 +146,40 @@ public IvfOnDiskVectorIndexReader(String column, File 
indexDir, VectorIndexConfi
           "Invalid distance function ordinal: %s", distanceFunctionOrdinal);
       _distanceFunction = allFunctions[distanceFunctionOrdinal];
 
+      long centroidsOffset = 24;
+      if (_indexFormatVersion == IvfFlatVectorIndexCreator.FORMAT_VERSION) {
+        ByteBuffer quantizerHeader = ByteBuffer.allocate(8);
+        readFully(_channel, quantizerHeader, 24);
+        quantizerHeader.flip();
+
+        int quantizerTypeOrdinal = quantizerHeader.getInt();
+        VectorQuantizerType[] allQuantizerTypes = VectorQuantizerType.values();
+        Preconditions.checkState(quantizerTypeOrdinal >= 0 && 
quantizerTypeOrdinal < allQuantizerTypes.length,
+            "Invalid quantizer type ordinal: %s", quantizerTypeOrdinal);
+        _quantizerType = allQuantizerTypes[quantizerTypeOrdinal];
+
+        int quantizerParamsLength = quantizerHeader.getInt();
+        Preconditions.checkState(quantizerParamsLength >= 0,
+            "Invalid quantizer params length: %s", quantizerParamsLength);
+        byte[] quantizerParams = new byte[quantizerParamsLength];
+        if (quantizerParamsLength > 0) {
+          ByteBuffer quantizerParamsBuf = ByteBuffer.wrap(quantizerParams);
+          readFully(_channel, quantizerParamsBuf, 32);
+        }
+        _quantizer = 
VectorQuantizationUtils.createReadQuantizer(_quantizerType, _dimension, 
quantizerParams);
+        _legacyFloatEncoding = false;
+        _encodedBytesPerVector = _quantizer.getEncodedBytesPerVector();
+        centroidsOffset = 32L + quantizerParamsLength;
+      } else {
+        _quantizerType = VectorQuantizerType.FLAT;
+        _quantizer = null;
+        _legacyFloatEncoding = true;
+        _encodedBytesPerVector = _dimension * Float.BYTES;

Review Comment:
   _quantizer is declared non-null but is assigned null when reading legacy v1 
IVF files. Given the project's non-null-by-default convention, this obscures 
the nullability contract and makes accidental future NPEs more likely. Consider 
marking _quantizer as `@Nullable` (and documenting/guarding) or using a 
non-null sentinel quantizer for the legacy path.
   ```suggestion
           // Legacy v1 files store raw floats without serialized quantizer 
metadata.
           // Use a FLAT read quantizer as a non-null sentinel to keep the 
field contract explicit.
           _quantizer = 
VectorQuantizationUtils.createReadQuantizer(_quantizerType, _dimension, new 
byte[0]);
           _legacyFloatEncoding = true;
           _encodedBytesPerVector = _quantizer.getEncodedBytesPerVector();
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Close vector search final-phase gaps [pinot]

Reply via email to