andygrove commented on code in PR #577: URL: https://github.com/apache/datafusion-comet/pull/577#discussion_r1643557256
##########
spark/inspections/CometTPCHQueriesList-results.txt:
##########
@@ -1,133 +1,133 @@
-Query: q1 TPCH Snappy. Comet Exec: Enabled (CometHashAggregate, CometProject)
+Query: q1 TPCH Snappy. Comet Exec: Enabled (CometHashAggregate, CometFilter, CometProject)
 Query: q1 TPCH Snappy: ExplainInfo:
 Comet shuffle is not enabled: spark.sql.adaptive.coalescePartitions.enabled is enabled and spark.comet.shuffle.enforceMode.enabled is not enabled
 Query: q2 TPCH Snappy. Comet Exec: Enabled (CometFilter, CometProject)
 Query: q2 TPCH Snappy: ExplainInfo:
-BroadcastExchange is not supported
+BroadcastHashJoin is not enabled because not all child plans are native

Review Comment:
Thanks for the detailed write-up, @parthchandra. I am really looking for query specifics, which I can see is a different use case. The way we did this in Spark RAPIDS was to store the info on the specific node and then view the query tree in a nested format similar to the example you posted above.

If we stored the info on the specific nodes, I think we could still output the same summary we currently have by walking through the plan and extracting the individual reasons (and removing duplicates), and this would also allow us to have a tree view. Do you think that would add overhead for the main use case we are currently targeting?
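
To make the proposal in the review comment concrete, here is a minimal sketch of the idea, not Comet's or Spark RAPIDS' actual implementation. It is written in Python for brevity, and `PlanNode`, `collect_reasons`, and `render_tree` are hypothetical names invented for this illustration. The example plan is made up as well; it only borrows two reason strings from the diff above. Each node stores its own fallback reasons, the flat summary comes from walking the plan and removing duplicates, and the same data gives a nested tree view.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """Hypothetical stand-in for a query plan node (not a Comet or Spark class)."""
    name: str
    children: list[PlanNode] = field(default_factory=list)
    # Fallback reasons recorded on this specific node only.
    reasons: list[str] = field(default_factory=list)


def collect_reasons(root: PlanNode) -> list[str]:
    """Walk the plan depth-first (pre-order) and return the unique reasons
    in the order they are first seen. This is the flat summary."""
    seen: dict[str, None] = {}  # dicts preserve insertion order
    stack = [root]
    while stack:
        node = stack.pop()
        for reason in node.reasons:
            seen.setdefault(reason, None)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))
    return list(seen)


def render_tree(node: PlanNode, depth: int = 0) -> str:
    """Render the plan as an indented tree, with each node's reasons inline."""
    indent = "  " * depth
    suffix = f" [{'; '.join(node.reasons)}]" if node.reasons else ""
    lines = [f"{indent}{node.name}{suffix}"]
    for child in node.children:
        lines.append(render_tree(child, depth + 1))
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up plan shape, for illustration only.
    unsupported = "BroadcastExchange is not supported"
    plan = PlanNode(
        "BroadcastHashJoin",
        children=[
            PlanNode("BroadcastExchange", [PlanNode("Scan")], [unsupported]),
            PlanNode("BroadcastExchange", [PlanNode("Scan")], [unsupported]),
        ],
        reasons=["BroadcastHashJoin is not enabled because not all child plans are native"],
    )
    print("\n".join(collect_reasons(plan)))  # de-duplicated summary
    print(render_tree(plan))                 # per-node tree view
```

In this sketch the summary is derived on demand from the per-node data, so the extra cost is one traversal of the plan plus a small list on each node. Whether that matters for the current use case is exactly the open question in the comment.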
