mbutrovich commented on code in PR #4374:
URL: https://github.com/apache/datafusion-comet/pull/4374#discussion_r3284477020
##########
spark/src/test/resources/tpcds-plan-stability/approved-plans-v1_4-spark4_0/q83.ansi/extended.txt:
##########
@@ -101,4 +127,4 @@ CometNativeColumnarToRow
+- CometFilter
+- CometNativeScan
parquet spark_catalog.default.date_dim
Review Comment:
Looking at the plan-stability changes:
- `approved-plans-v1_4-spark3_5/q83/extended.txt` is deleted (was 98 ops
with 2 `ReusedSubquery`), so 3.5 falls through to the base file at 124 ops with
0 `ReusedSubquery`.
- `approved-plans-v1_4-spark4_0/q83.ansi/extended.txt` goes 98 ops with 2
`ReusedSubquery` to 124 ops with 0 `ReusedSubquery`.
On 3.5+, `CometReuseSubquery` keys on `sub.plan.canonicalized`
([CometReuseSubquery.scala:59](https://github.com/apache/datafusion-comet/blob/main/spark/src/main/scala/org/apache/comet/rules/CometReuseSubquery.scala#L59)),
which flows through the new `doCanonicalize`. AQE's `stageCache` keys the same
way. The two-`ReusedSubquery` to zero-`ReusedSubquery` shift on q83 looks like
the cache is no longer collapsing the three `date_dim` DPP subqueries that it
used to collapse, which would translate to broadcasting `date_dim` three times
instead of once at runtime. On 3.4 this is moot because
`CometSpark34AqeDppFallbackRule` keeps DPP on the Spark side, but 3.5+ doesn't
have that fallback.
My read is that `originalPlan.canonicalized` is broader than the count-bug
needs. The count+1 vs count-1 difference surfaces in `originalPlan.output`. The
q83 subqueries differ inside `originalPlan` (each scan's DPP gets its own
`AdaptiveSparkPlanExec` wrapping a structurally-equivalent build plan in
[CometPlanAdaptiveDynamicPruningFilters.scala:266](https://github.com/apache/datafusion-comet/blob/main/spark/src/main/scala/org/apache/comet/rules/CometPlanAdaptiveDynamicPruningFilters.scala#L266))
but produce equivalent broadcast outputs. Including only
`originalPlan.canonicalized.output` (or the canonicalized output expressions)
in the canonical key would distinguish count+1 from count-1 while letting q83's
three date_dim subqueries collide again. Worth confirming the regression test
still fails without the broader form before narrowing.
Unrelated, while reading the file I noticed `equals` at line 244-252
compares `originalPlan` and `child` but ignores `mode` and `output`, and
`hashCode` at line 254 only hashes `child`. Pre-existing and not in scope for
this PR, but might be worth a follow-up issue.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]