[ https://issues.apache.org/jira/browse/SPARK-26297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715425#comment-16715425 ]
ASF GitHub Bot commented on SPARK-26297: ---------------------------------------- gatorsmile commented on a change in pull request #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning URL: https://github.com/apache/spark/pull/23249#discussion_r240347775 ########## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala ########## @@ -22,13 +22,12 @@ import org.apache.spark.sql.types.{DataType, IntegerType} /** * Specifies how tuples that share common expressions will be distributed when a query is executed - * in parallel on many machines. Distribution can be used to refer to two distinct physical - * properties: - * - Inter-node partitioning of data: In this case the distribution describes how tuples are - * partitioned across physical machines in a cluster. Knowing this property allows some - * operators (e.g., Aggregate) to perform partition local operations instead of global ones. - * - Intra-partition ordering of data: In this case the distribution describes guarantees made - * about how tuples are distributed within a single partition. + * in parallel on many machines. + * + * Distribution here refers to inter-node partitioning of data: + * - The distribution describes how tuples are partitioned across physical machines in a cluster. + * Knowing this property allows some operators (e.g., Aggregate) to perform partition local + * operations instead of global ones. Review comment: How about? > Distribution here refers to inter-node partitioning of data. That is, it describes how tuples are partitioned across physical machines in a cluster. Knowing this property allows some operators (e.g., Aggregate) to perform partition local operations instead of global ones. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > improve the doc of Distribution/Partitioning > -------------------------------------------- > > Key: SPARK-26297 > URL: https://issues.apache.org/jira/browse/SPARK-26297 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.0.0 > Reporter: Wenchen Fan > Assignee: Wenchen Fan > Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org