schenksj commented on PR #4532: URL: https://github.com/apache/datafusion-comet/pull/4532#issuecomment-4753349260
Thanks for the thorough review, @andygrove! All four points are addressed. (Disclosure: this PR's changes and this response were written with help from AI — Claude Code.) **1. Hardcoded `"UTC"` timezone.** I went with the comment route, because on investigation `"UTC"` turns out to be the *correct* choice here rather than just a defensible one — and threading the session-local timezone would actively introduce the mismatch you were worried about. These constants are materialized alongside the batch's non-constant columns in the same `VectorSchemaRoot`, and Comet's non-constant `TimestampType` columns are Arrow vectors exported from native execution, where Comet always tags them `Timestamp(µs, "UTC")` (see native `serde.rs`). Spark itself stores `TimestampType` as micros in UTC, so the constant's value is already a UTC instant. Tagging the materialized constant `"UTC"` keeps its Arrow timezone metadata consistent with its sibling timestamp columns; threading `SQLConf.sessionLocalTimeZone` would make them diverge. I've expanded the comment in `materializeConstantColumnVector` to spell this out. **2. Test coverage in `UtilsSuite`.** Added two cases beyond `IntegerType`: - A nullable `StructType` constant (with a null nested field and a wholly-null struct) — exercises the `getStruct(rowId)` / `getChild(ordinal)` `ArrowFieldWriter` path. - A `TimestampType` constant — round-trips the micros and pins down the `"UTC"` choice, guarding against anyone later swapping the zone argument. **3. FFI `exportBatch` arm.** Done — added `NativeUtilSuite`, which runs a `ConstantColumnVector`-containing batch (scalar value, null, and a struct constant) through the actual Arrow C Data Interface via `NativeUtil.exportBatchToAddresses` + `importVector` — the same export/import round trip `getNextBatch` performs in production, so it covers `Data.exportVector` + the allocator handoff. I verified it's a real guard: removing the `ConstantColumnVector` arm in `exportBatch` makes it fail with `Comet execution only takes Arrow Arrays`. **4. CI flake.** Agreed — the `Spark 4.1 / sql_core-3` failure was the runner-setup infra flake. I've also merged latest `main` into the branch (it was out of date with base), so a fresh CI run picks all of this up. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
