andygrove opened a new pull request, #4579:
URL: https://github.com/apache/datafusion-comet/pull/4579

   ## Which issue does this PR close?
   
   Closes #4554.
   
   ## Rationale for this change
   
   `try_make_timestamp` rewrites to `MakeTimestamp(failOnError = false)`, which 
Spark's `doGenCode` implements with a `try { ... } catch (DateTimeException e) 
{ ev.isNull = true; }` block. With Comet's codegen dispatcher routing 
`MakeTimestamp` through Spark's own codegen, the kernel was honoring the 
input-null short-circuit but skipping the post-eval `ev.isNull` guard, so 
invalid date/time components (month=13, day=32, hour=25, ...) returned stale 
`ev.value` bytes instead of NULL.
   
   The dispatcher's `defaultBody` conflated `NullIntolerant`'s "null in → null 
out" guarantee with the converse "non-null in → non-null out", which 
`MakeTimestamp(failOnError=false)` does not satisfy. The same path also affects 
non-ANSI `make_timestamp` directly, since that is the same expression with the 
same flag.
   
   ## What changes are included in this PR?
   
   - `CometBatchKernelCodegen.defaultBody`: in the `NullIntolerant && 
allNullIntolerant` branch, when the bound expression is nullable, wrap the 
post-`ev.code` write in `if (ev.isNull) setNull else write`. Non-nullable roots 
keep the direct write (the existing optimization).
   - `CometCodegenSourceSuite`: lock in the fix with a generated-source 
assertion that `MakeTimestamp(failOnError=false)` emits two `setNull` sites — 
input short-circuit + post-eval guard.
   - `make_timestamp.sql`: add column and literal queries with invalid 
components covering month/day/hour/minute out-of-range. These ride the default 
`failOnError=false` path (ANSI off in `CometSqlFileTestSuite`).
   - `try_make_timestamp.sql` (new, Spark 4.0+): direct coverage for 
`try_make_timestamp` with both column and literal invalid inputs, plus a 
sentinel valid query so the dispatcher must run natively.
   
   ## How are these changes tested?
   
   - `CometCodegenSourceSuite` (Spark 3.5): 51/51 pass; the new test fails on 
the unmodified codegen and passes with the fix.
   - `CometSqlFileTestSuite` `make_timestamp` filter (Spark 3.5 and 4.0): all 
three fixtures pass; on 3.5 `try_make_timestamp.sql` is skipped via 
`MinSparkVersion: 4.0`.
   - All datetime SQL fixtures (Spark 3.5): 89/89 pass.
   - `CometCodegenSuite`, `CometCodegenHOFSuite`, 
`CometTemporalExpressionSuite`: pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to