LuciferYang commented on code in PR #55919:
URL: https://github.com/apache/spark/pull/55919#discussion_r3308688178


##########
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedDeltaBinaryPackedReader.java:
##########
@@ -215,6 +217,114 @@ public void skipLongs(int total) {
     skipValues(total);
   }
 
+  // ---- Bulk read helpers for readIntegers / readLongs ----------------------------
+  //
+  // The generic readValues() path dispatches a lambda per value.  For the two most
+  // common callers (readIntegers, readLongs) we can do much better: compute a prefix
+  // sum over the unpacked deltas in-place, then bulk-copy the result into the column
+  // vector with putInts / putLongs (backed by System.arraycopy on-heap).
+
+  /**
+   * Callback for writing a chunk of prefix-summed absolute values from
+   * {@code unpackedValuesBuffer} into a column vector.  Called once per mini-block
+   * (not per value), so lambda overhead is negligible.
+   */
+  @FunctionalInterface
+  private interface BulkWriter {
+    void write(WritableColumnVector c, int rowId, long[] values, int start, int count);
+  }
+
+  /** Narrows long[] -> int[] scratch and bulk-writes via putInts. */
+  private void bulkWriteInts(WritableColumnVector c, int rowId,
+      long[] buf, int start, int count) {
+    if (intScratchBuffer == null) {

Review Comment:
   The scratch buffer is lazily allocated on the first `readIntegers` call. Since
`miniBlockSizeInValues` is known at `initFromPage` time and the buffer is small
(typically 128 ints = 512 bytes), it could instead be allocated eagerly alongside
`unpackedValuesBuffer`. That would remove the null check from `bulkWriteInts` and
make allocation behavior deterministic.
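
A rough sketch of what the eager variant might look like. This is a simplified stand-in, not the actual reader: the class name `EagerScratchSketch`, the one-argument `initFromPage`, and the `narrowToInts` helper are hypothetical and only illustrate sizing both buffers in one place so the narrowing step needs no null check.

```java
/** Illustrative stand-in; NOT the real VectorizedDeltaBinaryPackedReader. */
final class EagerScratchSketch {
  private long[] unpackedValuesBuffer;
  private int[] intScratchBuffer;

  /**
   * Hypothetical simplified signature. The real initFromPage reads the block
   * header from the page stream; here the mini-block size is passed in directly.
   */
  void initFromPage(int miniBlockSizeInValues) {
    unpackedValuesBuffer = new long[miniBlockSizeInValues];
    // Eager allocation: sized once per page, right next to the long[] buffer.
    intScratchBuffer = new int[miniBlockSizeInValues];
  }

  /**
   * Narrows buf[start, start + count) into the scratch buffer and returns it.
   * Assumes count does not exceed the mini-block size passed to initFromPage.
   */
  int[] narrowToInts(long[] buf, int start, int count) {
    for (int i = 0; i < count; i++) {
      intScratchBuffer[i] = (int) buf[start + i];
    }
    return intScratchBuffer;
  }
}
```

With this shape, calling `initFromPage(128)` followed by `narrowToInts(new long[] {10, 11, 13, 16}, 1, 3)` leaves 11, 13, 16 in the first three scratch slots. The cost is one extra small array per page even for callers that only ever use `readLongs`.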



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

