andygrove opened a new issue, #4467:
URL: https://github.com/apache/datafusion-comet/issues/4467

   ## Describe the bug
   
   `CometCaseConversionBase` (parent of `CometUpper` and `CometLower`) gates 
compatibility inside `convert(...)`:
   
   ```scala
   override def convert(expr: T, inputs: Seq[Attribute], binding: Boolean): 
Option[Expr] = {
     if (!CometConf.COMET_CASE_CONVERSION_ENABLED.get()) {
       withInfo(expr, "Comet is not compatible with Spark for case conversion 
in " +
         s"locale-specific cases. Set 
${CometConf.COMET_CASE_CONVERSION_ENABLED.key}=true " +
         "to enable it anyway.")
       return None
     }
     super.convert(expr, inputs, binding)
   }
   ```
   
   It never overrides `getSupportLevel`, so the class-level support level 
defaults to `Compatible(None)` even though `getIncompatibleReasons()` is 
populated. The Step 5 consistency rule in the `audit-comet-expression` skill 
calls this out specifically as the canonical antipattern to avoid. Surfaced by 
the string-expressions audit in apache/datafusion-comet#4461.
   
   Concrete consequences:
   
   1. The dispatcher's `allowIncompatible` machinery is bypassed. A user who 
flips the standard `spark.comet.expr.allowIncompatible=true` (or the 
per-expression `spark.comet.expression.Upper.allowIncompatible`) still has 
`lower`/`upper` fall back, because the gate is the older bespoke 
`spark.comet.caseConversion.enabled`.
   2. EXPLAIN and the auto-generated compatibility docs disagree. EXPLAIN shows 
the bespoke message above; the compat doc lists the populated 
`getIncompatibleReasons()` text instead.
   3. `initcap` uses the standard `Incompatible(Some(...))` flow; 
`lower`/`upper` use the bespoke flow. The asymmetry is confusing for users who 
have just opted into all string incompatibilities.
   
   ## Expected behavior
   
   Override `getSupportLevel` to return `Incompatible(Some(reason))` from a 
shared `private val`, drop the in-`convert` conf check, and either deprecate 
`spark.comet.caseConversion.enabled` in favor of the standard per-expression 
`allowIncompatible` flag or have the support level read the standard flag.
   
   ## Additional context
   
   - Serde: `CometCaseConversionBase` in 
`spark/src/main/scala/org/apache/comet/serde/strings.scala`
   - Conf: `CometConf.COMET_CASE_CONVERSION_ENABLED` (default `false`)
   - Related: #1052 (initcap divergence, already follows the standard pattern)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to