dongjoon-hyun opened a new pull request, #56500:
URL: https://github.com/apache/spark/pull/56500
### What changes were proposed in this pull request?
`Encoders.IntArrays.decode()` and `Encoders.LongArrays.decode()` now
validate the
element-count prefix before allocating the array.
Unlike the 1-byte-element `Encoders.Strings` / `Encoders.ByteArrays` cases
(SPARK-57426, SPARK-57430), where the byte length maps 1:1 to readable
bytes, each
element here occupies a fixed multiple of bytes, so the upper bound is
divided by the
element size:
- `IntArrays`: `Objects.checkFromIndexSize(0, numInts, buf.readableBytes() /
4)` (4 bytes per `int`)
- `LongArrays`: `Objects.checkFromIndexSize(0, numLongs, buf.readableBytes()
/ 8)` (8 bytes per `long`)
This requires `0 <= count <= (remaining bytes / element size)` and throws
`IndexOutOfBoundsException` otherwise. After `readInt()`, the reader index
is past the
count prefix, so `buf.readableBytes() / elementSize` is the maximum number
of elements
actually decodable from the remaining payload.
- Note that we use the Java built-in `Objects.checkFromIndexSize` because we
cannot use
**Netty's `checkReadableBytes`**. However, both methods throw
`IndexOutOfBoundsException`
identically for the error cases.
-
https://netty.io/4.2/api/io/netty/buffer/AbstractByteBuf.html#checkReadableBytes(int)
-
https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/Objects.html#checkFromIndexSize(int,int,int)
### Why are the changes needed?
`decode()` allocated `new int[numInts]` / `new long[numLongs]` from an
untrusted count prefix.
- A negative value throws an opaque `NegativeArraySizeException`.
- An oversized value can trigger `OutOfMemoryError` on a corrupt or hostile
frame.
Validating first fails fast with a clear error, consistent with SPARK-57426
(`Strings`)
and SPARK-57430 (`ByteArrays`).
These two decoders are used by the shuffle wire-protocol messages, e.g.
`FetchShuffleBlocks`, `FetchShuffleBlockChunks`, `MergeStatuses`, and
`LocalDirsForExecutors`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.8)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]