yaooqinn opened a new pull request, #12196: URL: https://github.com/apache/gluten/pull/12196
## What changes are proposed in this pull request?

Adds byte-for-byte regression tests pinning the cross-language wire contract between `VeloxColumnarBatchSerializer::framedSerializeWithStats` (cpp producer) and `CachedColumnarBatchKryoSerializer.parseFramedBytes` (JVM consumer). Test-only, no production code touched.

Three commits, one file each:

1. **cpp** (`VeloxColumnarBatchSerializerTest.cc`): `framedSerializeWithStatsGolden` byte-equal golden over a fixed 4-col / 100-row input (3198-byte literal, md5-stable across 3 reruns) + `framedSerializeWithStatsAllNullColNoBounds` field-level test for the `emitSupported=0` branch (kept separate because Velox PrestoSerde dumps uninitialized values-buffer bytes for all-null FlatVectors, breaking byte-equal goldens over null cols).
2. **JVM** (`ColumnarCachedBatchFramedBytesSuite.scala`): parser round-trip over the same 3198-byte literal + JVM mirror of the no-bounds branch test. Gated to Spark 4.x (schema-driven `parseFramedBytes` dispatch is 4.x-only).
3. **statsBlob type matrix** (`ColumnarCachedBatchStatsBlobSuite.scala`): 8 byte-for-byte cell tests expanding existing BIGINT coverage to every `emitSupported=1` type: SMALLINT, TINYINT, HUGEINT (int128), REAL, DOUBLE, BOOLEAN, TIMESTAMP, DATE. INTEGER/BIGINT/VARCHAR already in the golden.

After this PR, every type kind in the dispatch table is wire-pinned on both sides.

## Why

The stats-blob wire layout is shared cpp↔JVM and has evolved through several recent PRs with no regression coverage on the byte shape. Silent drift between producer and consumer would be a correctness hazard; this PR makes any unilateral change to either half fire loudly.

## How was this patch tested?

Targeted suites: 17/17 PASSED. Local preflight (compile + clang-format-15 + spotless + scalastyle) clean.

## Was this patch authored or co-authored using generative AI tooling?

Yes.

Generated-by: Claude claude-opus-4.7
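
For readers unfamiliar with byte-equal golden tests, the idea behind the first commit can be sketched roughly as below. This is only an illustration, not the code in this PR: `firstMismatch` is a hypothetical helper, and the short byte vectors in the usage note stand in for the real 3198-byte literal and the serializer output.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of Gluten or Velox: returns the offset of the
// first byte at which `actual` differs from `golden`, or std::nullopt when the
// two buffers are identical. A pure length mismatch is reported at the length
// of the shorter buffer.
std::optional<std::size_t> firstMismatch(
    const std::vector<std::uint8_t>& actual,
    const std::vector<std::uint8_t>& golden) {
  const std::size_t common =
      actual.size() < golden.size() ? actual.size() : golden.size();
  for (std::size_t i = 0; i < common; ++i) {
    if (actual[i] != golden[i]) {
      return i;  // first drifting byte
    }
  }
  if (actual.size() != golden.size()) {
    return common;  // one buffer is a strict prefix of the other
  }
  return std::nullopt;  // byte-for-byte equal
}
```

A test built on this would presumably serialize the fixed input, compare it against the pinned literal, and fail with the returned offset, so that a layout change on either side points at the exact byte that moved. For example, comparing `{0x01, 0x02, 0xFF, 0x04}` against a golden `{0x01, 0x02, 0x03, 0x04}` reports offset 2.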
