advancedxy commented on code in PR #255:
URL:
https://github.com/apache/arrow-datafusion-comet/pull/255#discussion_r1570049115
##########
spark/inspections/CometTPCHQueriesList-results.txt:
##########
@@ -0,0 +1,121 @@
+Query: q1 TPCH Snappy. Comet Exec: Enabled (CometHashAggregate, CometProject)
+Query: q1 TPCH Snappy: ExplainInfo:
+Shuffle: unsupported Spark partitioning:
org.apache.spark.sql.catalyst.plans.physical.RangePartitioning
+
+Query: q2 TPCH Snappy. Comet Exec: Enabled (CometSort, CometSortMergeJoin,
CometProject, CometFilter)
+Query: q2 TPCH Snappy: ExplainInfo:
+ObjectHashAggregate is not supported
+might_contain is not supported
+BroadcastHashJoin is not supported
+SortMergeJoin is not supported
Review Comment:
BroadcastHashJoin and SortMergeJoin should also been supported? There should
be some more specific reason why these are not supported?
##########
spark/src/test/scala/org/apache/spark/sql/CometTPCQueryListBase.scala:
##########
@@ -98,6 +105,11 @@ trait CometTPCQueryListBase
} else {
out.println(s"Query: $name$nameSuffix. Comet Exec: Disabled")
}
+ if (supportsExtendedExplainInfo(df.queryExecution)) {
Review Comment:
I think the `if` should be removed? Otherwise, running with Vanilla Spark
3.x, the ExplainInfo is not printed.
(P.S: I ran the TPCH query and found no ExplainInfo with open source Spark
3.4).
##########
spark/inspections/CometTPCHQueriesList-results.txt:
##########
@@ -0,0 +1,121 @@
+Query: q1 TPCH Snappy. Comet Exec: Enabled (CometHashAggregate, CometProject)
+Query: q1 TPCH Snappy: ExplainInfo:
+Shuffle: unsupported Spark partitioning:
org.apache.spark.sql.catalyst.plans.physical.RangePartitioning
+
+Query: q2 TPCH Snappy. Comet Exec: Enabled (CometSort, CometSortMergeJoin,
CometProject, CometFilter)
+Query: q2 TPCH Snappy: ExplainInfo:
+ObjectHashAggregate is not supported
+might_contain is not supported
Review Comment:
hmm,
`might_contain`(org.apache.spark.sql.catalyst.expressions.BloomFilterMightContain)
should already been supported in #179. Its child expression might be a
`XXHash64`, which is not supported yet.
I'm wondering if this output is a false positive? However I cannot
reproduce the `might_contain` case in my local 1GB TPCH runs.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]