c21 commented on a change in pull request #32875:
URL: https://github.com/apache/spark/pull/32875#discussion_r770904844



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
##########
@@ -352,3 +350,142 @@ case class BroadcastPartitioning(mode: BroadcastMode) extends Partitioning {
     case _ => false
   }
 }
+
+
+/**
+ * This is used in the scenario where an operator has multiple children (e.g., join) and one or more
+ * of which have their own requirement regarding whether its data can be considered as
+ * co-partitioned from others. This offers APIs for:
+ *
+ *   1. Comparing with specs from other children of the operator and check if they are compatible.
+ *      When two specs are compatible, we can say their data are co-partitioned, and Spark will
+ *      potentially able to eliminate shuffle if necessary.
+ *   1. Creating a partitioning that can be used to re-partition another child, so that to make it

Review comment:
       wow, the syntax is really confusing IMHO... Is it worth adhering to scaladoc's auto-numbering syntax here? Everywhere else in the Spark codebase, it seems we always go with `1. 2. ...`. Given this file will be pretty frequently referenced/read in the future, shall we just stay consistent with the existing codebase?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


