andygrove commented on PR #4660:
URL: 
https://github.com/apache/datafusion-comet/pull/4660#issuecomment-4800443271

   Thanks @marvelshan. The instinct here is good. Separating the genuine 
timestamp-range incompatibility from the format-pattern caveat is the right 
idea, and pulling the format text out of `getIncompatibleReasons()` so that it 
only describes what `allowIncompatible=true` actually changes (the native 
`to_char` path) is a real improvement. The `from_unix_time_enabled.sql` cleanup 
is nice too, since the default pattern really does run natively under 
`allowIncompatible=true` and asserting that with `query` is more accurate.
   
   A few things I would like to work through before this lands.
   
   **1. The code keeps this `Incompatible`, not `Unsupported`.** The title and 
the description say `getSupportLevel` now returns `Unsupported` for non-default 
patterns, but both branches return `Incompatible`:
   
   ```scala
   override def getSupportLevel(expr: FromUnixTime): SupportLevel = {
     if (expr.format != Literal(TimestampFormatter.defaultPattern())) {
       Incompatible(Some("Only the default datetime pattern ... is supported 
natively"))
     } else {
       Incompatible(None)
     }
   }
   ```
   
   I actually think keeping it `Incompatible` is correct. `CometFromUnixTime` 
is a `CodegenDispatchFallback`, so non-default formats are handled by the JVM 
codegen dispatcher rather than being truly unsupported. If this returned 
`Unsupported`, the framework would skip the dispatcher and force a full fall 
back to Spark, which would be a regression. So the code looks right, it is the 
title and the "What changes are included" section that need to be updated to 
match.
   
   **2. `getUnsupportedReasons()` will read as self-contradictory in the 
docs.** `GenerateDocs` calls `getUnsupportedReasons()` unconditionally and 
never consults `getSupportLevel`, so this entry renders under the heading "The 
following cases are not supported by Comet:", while the bullet text itself says 
non-default patterns are handled by the codegen dispatcher. So the generated 
page says a case is not supported and handled in the same sentence, and that 
does not match runtime behavior where non-default formats run in Comet via the 
dispatcher by default. `date_format` has the same 
native-allow-list-plus-dispatcher shape and documents it in 
`getCompatibleNotes()` (`datetime.scala:715`). Could the format caveat move 
there instead? That keeps `getIncompatibleReasons()` as just the timestamp 
range, which is the part of this change I like, and avoids the misleading 
"Unsupported" heading.
   
   **3. The non-default test downgrade looks inconsistent.** In 
`from_unix_time.sql`, `from_unixtime(t, 'yyyy-MM-dd')` moved from `query` to 
`query spark_answer_only`, but the default-pattern `from_unixtime(t)` stayed 
`query`. At `allowIncompatible=false` both take the same codegen-dispatch path, 
so if the default-pattern one passes as native, the non-default one should too. 
Could those stay `query` so they keep asserting the dispatcher holds them in 
Comet? If `query` actually fails for the non-default case, that would be worth 
surfacing, since it would mean the dispatcher is not covering it the way the 
new reason text claims.
   
   **4. Minor.** `getSupportLevel` calls `TimestampFormatter.defaultPattern()` 
with parens while `convert()` just below calls `defaultPattern` without. The 
parens form matches the Spark definition, so it is fine, but the two call sites 
reading differently in the same object is worth tidying. Also a heads up that 
#4693 also edits this serde's `getSupportLevel` and `getIncompatibleReasons` 
for collation, so whichever of the two merges second will need a rebase.
   
   The `getIncompatibleReasons()` split and the enabled-test fixes are solid. 
The main rework is routing the format caveat through `getCompatibleNotes()` 
rather than `getUnsupportedReasons()`, and squaring the title and description 
with the code.
   
   *Disclosure: this review was assisted by an AI tool (Anthropic Claude via 
Claude Code). I read the diff, the linked issues, and the related serde code 
before forming these comments, and I take responsibility for them.*
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to