dongjoon-hyun opened a new pull request, #56500:
URL: https://github.com/apache/spark/pull/56500

   ### What changes were proposed in this pull request?
   
   `Encoders.IntArrays.decode()` and `Encoders.LongArrays.decode()` now 
validate the
   element-count prefix before allocating the array.
   
   Unlike the 1-byte-element `Encoders.Strings` / `Encoders.ByteArrays` cases
   (SPARK-57426, SPARK-57430), where the byte length maps 1:1 to readable 
bytes, each
   element here occupies a fixed multiple of bytes, so the upper bound is 
divided by the
   element size:
   
   - `IntArrays`: `Objects.checkFromIndexSize(0, numInts, buf.readableBytes() / 
4)` (4 bytes per `int`)
   - `LongArrays`: `Objects.checkFromIndexSize(0, numLongs, buf.readableBytes() 
/ 8)` (8 bytes per `long`)
   
   This requires `0 <= count <= (remaining bytes / element size)` and throws
   `IndexOutOfBoundsException` otherwise. After `readInt()`, the reader index 
is past the
   count prefix, so `buf.readableBytes() / elementSize` is the maximum number 
of elements
   actually decodable from the remaining payload.
   
   - Note that we use the Java built-in `Objects.checkFromIndexSize` because we 
cannot use
     **Netty's `checkReadableBytes`**. However, both methods throw 
`IndexOutOfBoundsException`
     identically for the error cases.
     - 
https://netty.io/4.2/api/io/netty/buffer/AbstractByteBuf.html#checkReadableBytes(int)
     - 
https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/Objects.html#checkFromIndexSize(int,int,int)
   
   ### Why are the changes needed?
   
   `decode()` allocated `new int[numInts]` / `new long[numLongs]` from an 
untrusted count prefix.
   - A negative value throws an opaque `NegativeArraySizeException`.
   - An oversized value can trigger `OutOfMemoryError` on a corrupt or hostile 
frame.
   
   Validating first fails fast with a clear error, consistent with SPARK-57426 
(`Strings`)
   and SPARK-57430 (`ByteArrays`).
   
   These two decoders are used by the shuffle wire-protocol messages, e.g.
   `FetchShuffleBlocks`, `FetchShuffleBlockChunks`, `MergeStatuses`, and 
`LocalDirsForExecutors`.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Pass the CIs.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Claude Opus 4.8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to