gatorsmile commented on a change in pull request #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#discussion_r240347775
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
##########
@@ -22,13 +22,12 @@ import org.apache.spark.sql.types.{DataType, IntegerType}
/**
* Specifies how tuples that share common expressions will be distributed when a query is executed
- * in parallel on many machines. Distribution can be used to refer to two distinct physical
- * properties:
- * - Inter-node partitioning of data: In this case the distribution describes how tuples are
- * partitioned across physical machines in a cluster. Knowing this property allows some
- * operators (e.g., Aggregate) to perform partition local operations instead of global ones.
- * - Intra-partition ordering of data: In this case the distribution describes guarantees made
- * about how tuples are distributed within a single partition.
+ * in parallel on many machines.
+ *
+ * Distribution here refers to inter-node partitioning of data:
+ * - The distribution describes how tuples are partitioned across physical machines in a cluster.
+ * Knowing this property allows some operators (e.g., Aggregate) to perform partition local
+ * operations instead of global ones.
Review comment:
How about?
> Distribution here refers to inter-node partitioning of data. That is, it describes how tuples
> are partitioned across physical machines in a cluster. Knowing this property allows some
> operators (e.g., Aggregate) to perform partition local operations instead of global ones.
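
To illustrate the point about partition-local operations, here is a minimal sketch using plain Scala collections rather than Spark itself. All names (`PartitionLocalAggregate`, `hashPartition`, `localSum`, and so on) are hypothetical and not part of the Spark API; the nested `Seq` merely stands in for partitions spread over machines. The assumption is that rows are hash-partitioned on the grouping key, so no key spans two partitions and each partition can be aggregated on its own.

```scala
// Hypothetical stand-in for a cluster: each inner Seq is one partition's rows.
// None of these names come from Spark; this only illustrates the idea.
object PartitionLocalAggregate {
  type Row = (String, Int) // (grouping key, value)

  // Hash-partition rows so that all rows with the same key land in the same partition.
  def hashPartition(rows: Seq[Row], numPartitions: Int): Seq[Seq[Row]] = {
    val grouped = rows.groupBy { case (key, _) => Math.floorMod(key.hashCode, numPartitions) }
    (0 until numPartitions).map(i => grouped.getOrElse(i, Seq.empty))
  }

  // Sum values per key within a single collection of rows.
  def localSum(rows: Seq[Row]): Map[String, Int] =
    rows.groupBy(_._1).map { case (key, group) => key -> group.map(_._2).sum }

  // Partition-local aggregate: because no key spans partitions, the per-partition
  // results can be concatenated without a further merge step.
  def partitionLocalSum(partitions: Seq[Seq[Row]]): Map[String, Int] =
    partitions.flatMap(p => localSum(p)).toMap

  // Global aggregate over all rows, used here only as a reference result.
  def globalSum(rows: Seq[Row]): Map[String, Int] = localSum(rows)
}
```

If the rows were instead split arbitrarily, the same key could appear in several partitions and the per-partition sums would need a second, global merge, which is the cost that knowing the distribution lets an operator avoid.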