andygrove opened a new issue, #4242:
URL: https://github.com/apache/datafusion-comet/issues/4242

   ## What is the problem the feature request solves?
   
   When #2894 was first addressed by #2994, `COUNT` was included in the set of 
aggregates safe to run with mixed Spark partial / Comet final execution 
alongside `MIN`, `MAX`, and the bitwise aggregates. The follow-up branch for 
that work removed `COUNT` from the safe set after two regressions surfaced. As 
a result, the TPC-DS coverage gains in #2994 (which were almost entirely driven 
by `COUNT`) are not realized in the carve-out PR.
   
   This issue tracks investigating and re-enabling `COUNT` for mixed Spark 
partial / Comet final execution.
   
   ## Known blockers
   
   The two specific regressions that caused `COUNT` to be excluded:
   
   1. **AQE `PropagateEmptyRelationAfterAQE`**. The rule matches 
`BaseAggregateExec` only, not `CometHashAggregateExec`. When the partial runs 
in Spark and the final runs in Comet, the rule no longer fires for the final 
stage, which changes results in some queries.
   
   2. **Spark 4.0 count-bug decorrelation**. The decorrelation rewrite for 
correlated `IN` subqueries drops a row in the OR pattern in `in-count-bug.sql` 
when the partial/final aggregate stages are split between Spark and Comet.
   
   ## Suggested approach
   
   - Reproduce both regressions with `COUNT` re-added to 
`supportsMixedPartialFinal` in `aggregates.scala`.
   - For (1), evaluate whether the right fix is upstream (extend 
`PropagateEmptyRelationAfterAQE` to recognize Comet aggregate exec) or in Comet 
(e.g., guard mixed-COUNT when AQE is enabled, or insert a wrapper Spark 
aggregate node).
   - For (2), determine whether the row-drop is specific to Spark 4.0 
decorrelation interacting with mixed aggregate stages, and whether a targeted 
guard is preferable to broad fallback.
   - Once both are addressed, re-add `override def supportsMixedPartialFinal: 
Boolean = true` to `CometCount` and regenerate TPC-DS golden files.
   
   ## Related
   
   - Parent: #2892
   - Original: #2894 (closed by #2994 follow-up)
   - Adjacent: #1389 (AQE materializing unsupported final HashAggregate)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to