yaooqinn opened a new pull request, #12224:
URL: https://github.com/apache/gluten/pull/12224
## What changes are proposed in this pull request?
Two defensive bound checks at the wire-edges of the cache-stats
serialization path:
1. **cpp JNI `serializeWithStats`**: `GLUTEN_CHECK` that the framed
payload fits in `jsize` (signed int32, ~2 GiB) before
`NewByteArray`. A payload in the (2 GiB, 4 GiB] window (allowed by
the inner `bytesLen <= UINT32_MAX` check) would currently wrap to
a negative `jsize` and surface as `NegativeArraySizeException`
with no actionable diagnostic. The guard instead fails fast with a
`GlutenException` that reports the byte count and the limit.
2. **JVM `deserializeStats`**: when `schema != null`, require
`numCols <= schema.length`. The `>` direction would throw an
index-out-of-bounds error on `schema(col)` during type dispatch
(a real corrupt-frame signal).
The `<` direction is intentionally allowed -- existing fixtures
like `ColumnarCachedBatchIntFamilyMarshalSuite` pass an EXPANDED
5-field-per-source-col schema where `schema.length == numCols * 5`
and only the first `numCols` entries drive dispatch. The new
guard catches corrupt frames at the wire boundary instead of
letting an undersized row propagate into a downstream
`ArrayIndexOutOfBoundsException`. The pre-existing
`0 <= numCols <= Int.MaxValue/5` bound is preserved.
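
For readers skimming the description, the two bounds above can be sketched as small pure functions. This is only an illustration under stated assumptions, not the code in this PR: `checkedJSize`, `numColsAccepted`, and `JSizeLike` are hypothetical names, `std::length_error` stands in for `GLUTEN_CHECK` / `GlutenException`, and the JVM-side rule is transliterated into C++ so that both checks appear in one language (the real check lives in `deserializeStats` on the JVM).

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>

// Stand-in for JNI's jsize, which jni.h defines as a signed 32-bit integer.
using JSizeLike = std::int32_t;

// Hypothetical helper mirroring guard 1: reject any framed length that does
// not fit in jsize, reporting the byte count and the limit. Working on the
// length alone means no multi-GiB buffer is needed to exercise the bound.
inline JSizeLike checkedJSize(std::uint64_t framedBytes) {
  constexpr std::uint64_t kMax =
      static_cast<std::uint64_t>(std::numeric_limits<JSizeLike>::max());
  if (framedBytes > kMax) {
    throw std::length_error(
        "framed payload of " + std::to_string(framedBytes) +
        " bytes exceeds jsize limit of " + std::to_string(kMax) + " bytes");
  }
  return static_cast<JSizeLike>(framedBytes);
}

// Hypothetical rendering of the pre-existing bound, used when schema == null:
// 0 <= numCols <= Int.MaxValue / 5.
inline bool numColsAccepted(std::int64_t numCols) {
  constexpr std::int64_t kMaxCols =
      std::numeric_limits<std::int32_t>::max() / 5;
  return numCols >= 0 && numCols <= kMaxCols;
}

// Hypothetical rendering of guard 2, used when schema != null: additionally
// require numCols <= schema.length. A longer (expanded) schema is accepted.
inline bool numColsAccepted(std::int64_t numCols, std::size_t schemaLength) {
  return numColsAccepted(numCols) &&
         static_cast<std::uint64_t>(numCols) <= schemaLength;
}
```

Under these assumptions, `numColsAccepted(3, 15)` models the expanded 5-field-per-source-col schema and is accepted, `numColsAccepted(4, 3)` models the corrupt-frame case and is rejected, and `checkedJSize` throws for any length above the signed 32-bit maximum.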
Both fixes are defense-in-depth -- no production caller triggers
either path today. Recovery on the JVM side is via the existing
`NonFatal` corrupt-frame catch in `ColumnarCachedBatchSerializer.serialize`
(no change), which falls back to legacy `serialize()` for the batch.
## How was this patch tested?
- `ColumnarCachedBatchFramedBytesSuite`: 3 new JVM tests covering
`numCols > schema.length` rejection, `numCols < schema.length`
loose-schema sentinel, and the `schema=null` V1-wire backcompat
sentinel.
- `ColumnarCachedBatch*Suite` family (9 suites, 69 tests): all
pass against a fresh local `libgluten.so` / `libvelox.so` built
from this branch. Local preflight (cpp + JVM compile + format +
lint) clean.
- No cpp gtest for the `jsize` guard: exercising the bound would
require materializing a >2 GiB framed payload, which risks OOM on
CI runners with 8-16 GiB total. The guard itself is a one-liner
immediately before `NewByteArray`.
## Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]