sepuri sai krishna created SPARK-57665:
------------------------------------------
Summary: slice() returns an empty array for a large length due to
int overflow in the interpreted path
Key: SPARK-57665
URL: https://issues.apache.org/jira/browse/SPARK-57665
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 4.0.3
Reporter: sepuri sai krishna
slice(array, start, length) silently drops all elements and returns an empty
array when `length` is large enough that `start_0based + length` overflows a
32-bit int.
How to reproduce (Spark 4.0+, no config needed; verified on released Spark
4.0.3, Scala 2.13.16, in spark-shell):
SELECT slice(array(1,2,3,4,5,6), 2, 2147483647)
=> [] (expected: [2,3,4,5,6])
Root cause: Slice.nullSafeEval computes data.slice(startIndex, startIndex +
lengthInt). For a large length, startIndex + lengthInt overflows to a negative
`until`. Under Scala 2.13 (Spark 4.0+), Seq.slice with a negative `until`
yields an empty result, so the whole tail is dropped. The codegen path uses
ArrayExpressionUtils.sliceLength, which clamps to the remaining element count
and returns the correct tail, so the two execution paths disagree. (Spark 3.5 /
Scala 2.12 is unaffected: 2.12's slice overflows twice and happens to return
the correct elements.)
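
To make the overflow concrete, here is a minimal Java sketch of the index
arithmetic described above. It is an illustration only, not Spark's source:
the class and method names (SliceOverflowDemo, naiveUntil) are hypothetical,
and it only assumes that startIndex and lengthInt are 32-bit ints.

```java
// Illustration only: mirrors the arithmetic "startIndex + lengthInt",
// not the actual code in Slice.nullSafeEval.
class SliceOverflowDemo {
    // Naive exclusive end index; wraps around silently on int overflow.
    static int naiveUntil(int startIndex, int length) {
        return startIndex + length;
    }

    public static void main(String[] args) {
        int startIndex = 1;             // SQL start = 2, converted to 0-based
        int length = Integer.MAX_VALUE; // 2147483647, as in the repro query
        // Prints -2147483648: the sum wraps to a negative end index.
        System.out.println(naiveUntil(startIndex, length));
    }
}
```

With a negative end index, any slice implementation that treats
`until <= from` as an empty range returns no elements, which matches the
empty result reported above.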
For constant arguments the wrong value is produced even with default settings,
because ConstantFolding evaluates the expression via the interpreted eval() at
plan time.
Context: SPARK-57171 extracted the index arithmetic into
ArrayExpressionUtils.sliceLength and routed the codegen path through it, but
the interpreted path (Slice.nullSafeEval) was left computing
data.slice(startIndex, startIndex + lengthInt) directly.
Proposed fix: route the interpreted path through the same sliceLength helper
so both paths agree and the index arithmetic cannot overflow.
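
A rough Java sketch of the clamping idea follows. It is not the real
ArrayExpressionUtils.sliceLength, whose signature and edge-case handling may
differ; clampedLength and slice are hypothetical names, and the sketch assumes
a non-negative 0-based startIndex and a non-negative length.

```java
import java.util.Arrays;

// Hypothetical stand-in for a clamping helper; not Spark's implementation.
class SliceClampSketch {
    // Clamp the requested length to the elements remaining after startIndex.
    // Uses subtraction instead of startIndex + length, so it cannot overflow
    // for non-negative inputs.
    static int clampedLength(int numElements, int startIndex, int length) {
        if (startIndex >= numElements) {
            return 0; // start is past the end: nothing to return
        }
        int remaining = numElements - startIndex;
        return Math.min(length, remaining);
    }

    // Slice using the clamped length, so the end index stays within bounds.
    static int[] slice(int[] data, int startIndex, int length) {
        int n = clampedLength(data.length, startIndex, length);
        if (n == 0) {
            return new int[0];
        }
        return Arrays.copyOfRange(data, startIndex, startIndex + n);
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6};
        // Same inputs as the repro (0-based start 1, length 2147483647).
        // Prints [2, 3, 4, 5, 6]
        System.out.println(Arrays.toString(slice(data, 1, Integer.MAX_VALUE)));
    }
}
```

Because the end index is derived from the clamped length, startIndex + n never
exceeds the array size, regardless of how large the requested length is.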
!image-2026-06-24-20-40-03-883.png!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)