andygrove opened a new issue, #4467:
URL: https://github.com/apache/datafusion-comet/issues/4467
## Describe the bug
`CometCaseConversionBase` (parent of `CometUpper` and `CometLower`) gates
compatibility inside `convert(...)`:
```scala
override def convert(expr: T, inputs: Seq[Attribute], binding: Boolean):
Option[Expr] = {
if (!CometConf.COMET_CASE_CONVERSION_ENABLED.get()) {
withInfo(expr, "Comet is not compatible with Spark for case conversion
in " +
s"locale-specific cases. Set
${CometConf.COMET_CASE_CONVERSION_ENABLED.key}=true " +
"to enable it anyway.")
return None
}
super.convert(expr, inputs, binding)
}
```
It never overrides `getSupportLevel`, so the class-level support level
defaults to `Compatible(None)` even though `getIncompatibleReasons()` is
populated. The Step 5 consistency rule in the `audit-comet-expression` skill
calls this out specifically as the canonical antipattern to avoid. Surfaced by
the string-expressions audit in apache/datafusion-comet#4461.
Concrete consequences:
1. The dispatcher's `allowIncompatible` machinery is bypassed. A user who
flips the standard `spark.comet.expr.allowIncompatible=true` (or the
per-expression `spark.comet.expression.Upper.allowIncompatible`) still has
`lower`/`upper` fall back, because the gate is the older bespoke
`spark.comet.caseConversion.enabled`.
2. EXPLAIN and the auto-generated compatibility docs disagree. EXPLAIN shows
the bespoke message above; the compat doc lists the populated
`getIncompatibleReasons()` text instead.
3. `initcap` uses the standard `Incompatible(Some(...))` flow;
`lower`/`upper` use the bespoke flow. The asymmetry is confusing for users who
have just opted into all string incompatibilities.
## Expected behavior
Override `getSupportLevel` to return `Incompatible(Some(reason))` from a
shared `private val`, drop the in-`convert` conf check, and either deprecate
`spark.comet.caseConversion.enabled` in favor of the standard per-expression
`allowIncompatible` flag or have the support level read the standard flag.
## Additional context
- Serde: `CometCaseConversionBase` in
`spark/src/main/scala/org/apache/comet/serde/strings.scala`
- Conf: `CometConf.COMET_CASE_CONVERSION_ENABLED` (default `false`)
- Related: #1052 (initcap divergence, already follows the standard pattern)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]