andygrove opened a new pull request, #4629:
URL: https://github.com/apache/datafusion-comet/pull/4629

   ## Which issue does this PR close?
   
   Closes #4310.
   
   ## Rationale for this change
   
   #4310 noted that Comet's model for choosing between a native (incompatible, 
faster) implementation and a Spark-compatible codegen-dispatch implementation 
was confusing, and that the regex family made the confusion visible. The behavior that 
actually landed (via #4239) is general, not regex-specific: any expression that 
has both a native and a codegen-dispatch implementation defaults to codegen 
dispatch (Spark-compatible, runs natively with a per-batch JNI cost), and the 
user opts into the native path per expression with that expression's 
`allowIncompatible` flag. This was documented per-family in the regex and JSON 
guides but never stated as the general model. This PR explains it once, 
centrally.
   
   ## What changes are included in this PR?
   
   * `compatibility/index.md`: adds a "Native and codegen-dispatch 
implementations" section describing the two implementation kinds, why codegen 
dispatch is the default, how 
`spark.comet.expression.<Expr>.allowIncompatible=true` opts into the native 
path, and the fallthrough-to-dispatcher behavior for cases the native path does 
not cover. It also distinguishes this from expressions that have no 
codegen-dispatch path (for example `cast`), where the default is a Spark 
fallback.
   * `expressions.md`: corrects the support-reference intro, which previously 
implied incompatible cases always fall back to Spark by default, and links to 
the new section.
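
The selection model described above can be sketched as a small decision function. This is an illustrative model only, not Comet's actual serde code: `Path`, `choose_path`, `has_codegen_dispatch`, and `native_covers_case` are hypothetical names, and `RLike` in the usage note is a placeholder expression name. It also assumes the expression has an incompatible native implementation, and that an opted-in expression with neither native coverage nor a codegen-dispatch path falls back to Spark. Only the config key pattern comes from this PR.

```python
from enum import Enum


class Path(Enum):
    NATIVE = "native"                      # faster, may differ from Spark
    CODEGEN_DISPATCH = "codegen-dispatch"  # Spark-compatible, per-batch JNI cost
    SPARK_FALLBACK = "spark-fallback"      # expression is left to Spark


def allow_incompatible_key(expr_name: str) -> str:
    # Key pattern taken from the PR description.
    return f"spark.comet.expression.{expr_name}.allowIncompatible"


def choose_path(
    expr_name: str,
    conf: dict,
    has_codegen_dispatch: bool,
    native_covers_case: bool = True,
) -> Path:
    """Hypothetical model of which implementation handles an expression."""
    value = conf.get(allow_incompatible_key(expr_name), "false")
    opted_in = value.strip().lower() == "true"

    # The user opted in and the native path handles this case.
    if opted_in and native_covers_case:
        return Path.NATIVE
    # Default, and also the fallthrough for cases the native path
    # does not cover.
    if has_codegen_dispatch:
        return Path.CODEGEN_DISPATCH
    # No codegen-dispatch path (for example cast): Spark fallback.
    return Path.SPARK_FALLBACK
```

With an empty config, `choose_path("RLike", {}, has_codegen_dispatch=True)` yields `Path.CODEGEN_DISPATCH`, while the same call with `has_codegen_dispatch=False` yields `Path.SPARK_FALLBACK`, which mirrors the distinction drawn for expressions such as `cast`.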
   
   No code changes; the regex (`compatibility/regex.md`) and JSON 
(`compatibility/json.md`) guides already document their per-expression configs 
and specific differences, so this PR only adds the general framing they are 
instances of.
   
   ## How are these changes tested?
   
   Documentation-only change. The described model was verified against the 
serdes in `spark/src/main/scala/org/apache/comet/serde/strings.scala` (regex 
family) and the `spark.comet.exec.scalaUDF.codegen.enabled` config in 
`CometConf.scala`. `compatibility/index.md` was run through prettier 
(unchanged); `expressions.md` is listed in `.prettierignore`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

