andygrove opened a new pull request, #3816: URL: https://github.com/apache/datafusion-comet/pull/3816
## Which issue does this PR close?

Closes #.

## Rationale for this change

When Comet can only convert a small fraction of a query's operators, the overhead from Spark-to-Comet transitions can outweigh the benefit of native execution. This adds a simple cost-based mechanism to skip Comet entirely for such queries.

## What changes are included in this PR?

- Add `spark.comet.exec.coverageThreshold` config (double, 0.0–1.0, default 0.0 = disabled). When set, Comet falls back to the original Spark plan if the percentage of converted operators is below the threshold.
- Extract `CometCoverageStats.fromPlan()` to compute coverage stats from a `SparkPlan` without building the explain string, reusable by both the explain output and the threshold check.
- Add threshold check at the end of `CometExecRule._apply()` that logs a warning and returns the original plan when coverage is insufficient.

## How are these changes tested?

Default behavior is unchanged (threshold = 0.0 disables the check). Manual verification that compile succeeds. Tests to be added.
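The fallback decision described in this PR can be illustrated with a small, language-neutral sketch. This is not the Comet implementation (which lives in Scala inside `CometExecRule`); the names `CoverageStats` and `should_fall_back` are hypothetical, and treating an empty plan as fully covered is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageStats:
    """Hypothetical stand-in for the coverage stats computed from a plan."""
    total_operators: int
    converted_operators: int

    @property
    def coverage(self) -> float:
        # Assumption: an empty plan counts as fully covered, so it never
        # triggers a fallback.
        if self.total_operators == 0:
            return 1.0
        return self.converted_operators / self.total_operators


def should_fall_back(stats: CoverageStats, threshold: float) -> bool:
    """Return True when the original Spark plan should be kept."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0.0, 1.0]")
    # A threshold of 0.0 disables the check, matching the described default.
    if threshold == 0.0:
        return False
    return stats.coverage < threshold
```

Under these assumptions, a plan with 2 of 10 operators converted and a threshold of 0.5 keeps the original Spark plan, while the default threshold of 0.0 never triggers a fallback.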
