Re: [PR] [SPARK-55411][SQL] SPJ may throw ArrayIndexOutOfBoundsException when join keys are less than cluster keys [spark]

via GitHub Sat, 07 Feb 2026 07:14:52 -0800


pan3793 commented on code in PR #54182:
URL: https://github.com/apache/spark/pull/54182#discussion_r2777666104



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/KeyGroupedPartitionedScan.scala:
##########
@@ -52,16 +52,16 @@ trait KeyGroupedPartitionedScan[T] {
           case Some(projectionPositions) =>
             val internalRowComparableWrapperFactory =
               InternalRowComparableWrapper.getInternalRowComparableWrapperFactory(
-                expressions.map(_.dataType))
+                projectedExpressions.map(_.dataType))
             basePartitioning.partitionValues.map { r =>
-            val projectedRow = KeyGroupedPartitioning.project(expressions,
+            val projectedRow = KeyGroupedPartitioning.project(basePartitioning.expressions,

Review Comment:
   > // Do not use `bucket()` in "one side partition" tests as its implementation in
   > // `InMemoryBaseTable` conflicts with `BucketFunction`
   
   Oh, god, @peter-toth, thanks a lot for pointing this out. I wasn't aware of
it and spent a few hours trying to figure out why the SMJ partition key values
mismatched and produced wrong results after I fixed the
`ArrayIndexOutOfBoundsException` ...
   
   Actually, the current code changes are just a draft, and the test cases do
not pass yet. I will try to fix them following your guidance. Thank you again,
@peter-toth!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


