(spark) branch master updated: [SPARK-55962][SQL] Use `getShort` instead of `getInt` casting in `putShortsFromIntsLittleEndian` on Little Endian platforms

yangjie01 Wed, 11 Mar 2026 21:27:58 -0700

This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 4cdaae7b9fe4 [SPARK-55962][SQL] Use `getShort` instead of `getInt` casting in `putShortsFromIntsLittleEndian` on Little Endian platforms
4cdaae7b9fe4 is described below

commit 4cdaae7b9fe4d760d5d680d99a8dc05143df1d65
Author: yangjie01 <[email protected]>
AuthorDate: Thu Mar 12 12:27:31 2026 +0800

    [SPARK-55962][SQL] Use `getShort` instead of `getInt` casting in `putShortsFromIntsLittleEndian` on Little Endian platforms
    
    ### What changes were proposed in this pull request?
    This PR optimizes the `putShortsFromIntsLittleEndian` method in `OnHeapColumnVector` and `OffHeapColumnVector` specifically for Little Endian platforms.
    
    Instead of reading a full 4-byte `int` and casting it to `short` (which discards the upper bytes), the new implementation directly reads the first 2 bytes using `Platform.getShort()` when the system architecture is Little Endian.
    
    ### Why are the changes needed?
    To avoid redundant memory access by reading only the necessary 2 bytes instead of 4 bytes per element on Little Endian platforms.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Pass GitHub Actions
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No
    
    Closes #54758 from LuciferYang/SPARK-55722.
    
    Authored-by: yangjie01 <[email protected]>
    Signed-off-by: yangjie01 <[email protected]>
---
 .../org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java  | 2 +-
 .../org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java
index b56a49d8ee40..86a0365ead0c 100644
--- a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java
+++ b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OffHeapColumnVector.java
@@ -278,7 +278,7 @@ public final class OffHeapColumnVector extends WritableColumnVector {
       }
     } else {
       for (int i = 0; i < count; ++i, srcOffset += 4, dstOffset += 2) {
-        Platform.putShort(null, dstOffset, (short) Platform.getInt(src, srcOffset));
+        Platform.putShort(null, dstOffset, Platform.getShort(src, srcOffset));
       }
     }
   }
diff --git a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java
index a6472955d673..9084eec9ffc2 100644
--- a/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java
+++ b/sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/OnHeapColumnVector.java
@@ -273,7 +273,7 @@ public final class OnHeapColumnVector extends WritableColumnVector {
       }
     } else {
       for (int i = 0; i < count; ++i, srcOffset += 4) {
-        shortData[rowId + i] = (short) Platform.getInt(src, srcOffset);
+        shortData[rowId + i] = Platform.getShort(src, srcOffset);
       }
     }
   }
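
The equivalence the patch relies on can be checked outside Spark. The following is a minimal sketch, not the Spark implementation: it uses `java.nio.ByteBuffer` with an explicit `ByteOrder.LITTLE_ENDIAN` as a stand-in for `Platform.getInt` / `Platform.getShort`, and the class and helper names (`LittleEndianShortRead`, `viaIntCast`, `viaShortRead`, `shortsFromInts`, `littleEndianInts`) are hypothetical. It assumes the source bytes hold 4-byte little-endian ints, as the method name suggests.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianShortRead {
  // Old approach: read the full 4-byte int, then cast, which keeps the low 16 bits.
  public static short viaIntCast(ByteBuffer buf, int offset) {
    return (short) buf.getInt(offset);
  }

  // New approach: read only the first 2 bytes at the same offset.
  // In little-endian layout those bytes are the low 16 bits of the int.
  public static short viaShortRead(ByteBuffer buf, int offset) {
    return buf.getShort(offset);
  }

  // Hypothetical helper mirroring the loop shape in the diff:
  // the source advances 4 bytes per element, one short is produced per element.
  public static short[] shortsFromInts(ByteBuffer src, int count) {
    short[] out = new short[count];
    for (int i = 0, srcOffset = 0; i < count; ++i, srcOffset += 4) {
      out[i] = viaShortRead(src, srcOffset);
    }
    return out;
  }

  // Builds a buffer of 4-byte little-endian ints (illustration only).
  public static ByteBuffer littleEndianInts(int... values) {
    ByteBuffer buf = ByteBuffer.allocate(values.length * 4).order(ByteOrder.LITTLE_ENDIAN);
    for (int v : values) {
      buf.putInt(v);
    }
    return buf;
  }

  public static void main(String[] args) {
    int[] values = {0, 1, -1, 32767, -32768, 0x12345678};
    ByteBuffer buf = littleEndianInts(values);
    short[] direct = shortsFromInts(buf, values.length);
    for (int i = 0; i < values.length; i++) {
      System.out.println(values[i] + ": cast=" + viaIntCast(buf, i * 4) + " direct=" + direct[i]);
    }
  }
}
```

The two reads agree only because the low-order bytes come first in little-endian layout. With a big-endian layout, the first 2 bytes of each element would be the high half of the int, so the same shortcut would not be valid there.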


