[jira] [Commented] (SPARK-26297) improve the doc of Distribution/Partitioning

ASF GitHub Bot (JIRA) Mon, 10 Dec 2018 11:28:19 -0800


    [ https://issues.apache.org/jira/browse/SPARK-26297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715425#comment-16715425 ]


ASF GitHub Bot commented on SPARK-26297:
----------------------------------------

gatorsmile commented on a change in pull request #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#discussion_r240347775
 
 

 ##########
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
 ##########
 @@ -22,13 +22,12 @@ import org.apache.spark.sql.types.{DataType, IntegerType}
 
 /**
  * Specifies how tuples that share common expressions will be distributed when a query is executed
- * in parallel on many machines.  Distribution can be used to refer to two distinct physical
- * properties:
- *  - Inter-node partitioning of data: In this case the distribution describes how tuples are
- *    partitioned across physical machines in a cluster.  Knowing this property allows some
- *    operators (e.g., Aggregate) to perform partition local operations instead of global ones.
- *  - Intra-partition ordering of data: In this case the distribution describes guarantees made
- *    about how tuples are distributed within a single partition.
+ * in parallel on many machines.
+ *
+ * Distribution here refers to inter-node partitioning of data:
+ *   - The distribution describes how tuples are partitioned across physical machines in a cluster.
+ *     Knowing this property allows some operators (e.g., Aggregate) to perform partition local
+ *     operations instead of global ones.
 
 Review comment:
   How about?
   
   > Distribution here refers to inter-node partitioning of data. That is, it describes how tuples are partitioned across physical machines in a cluster. Knowing this property allows some operators (e.g., Aggregate) to perform partition local operations instead of global ones.
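
To make the "partition local operations instead of global ones" point concrete, here is a minimal sketch in plain Scala. It is not Spark's implementation and does not use the Catalyst classes touched by this PR: `PartitionLocalAggregate`, `hashPartition`, `localSum`, and `aggregate` are hypothetical names, and hash partitioning by key is assumed as the distribution. The idea is that once all rows sharing a key live in the same partition, a per-key sum can be computed inside each partition and the partial results simply unioned.

```scala
// Hypothetical illustration only; none of these names exist in Spark.
object PartitionLocalAggregate {
  type Row = (String, Int) // (grouping key, value)

  // Assumed distribution: rows with equal keys always land in the same partition.
  def hashPartition(rows: Seq[Row], numPartitions: Int): Vector[Seq[Row]] = {
    val grouped = rows.groupBy { case (key, _) => Math.floorMod(key.hashCode, numPartitions) }
    Vector.tabulate(numPartitions)(i => grouped.getOrElse(i, Seq.empty))
  }

  // Per-key sum that looks at a single partition and nothing else.
  def localSum(partition: Seq[Row]): Map[String, Int] =
    partition.groupBy(_._1).map { case (key, rs) => key -> rs.map(_._2).sum }

  // No key appears in two partitions, so the union of the local results
  // is already the global result; no cross-partition merge step is needed.
  def aggregate(rows: Seq[Row], numPartitions: Int): Map[String, Int] =
    hashPartition(rows, numPartitions).map(localSum).foldLeft(Map.empty[String, Int])(_ ++ _)

  def main(args: Array[String]): Unit = {
    val rows = Seq("a" -> 1, "b" -> 2, "a" -> 3, "c" -> 4, "b" -> 5)
    println(aggregate(rows, numPartitions = 3))
  }
}
```

If the rows were spread across partitions without regard to key, the `++` union would silently overwrite partial sums for keys that appear in more than one partition, which is why an operator needs to know the distribution before it can skip the global step.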
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> improve the doc of Distribution/Partitioning
> --------------------------------------------
>
>                 Key: SPARK-26297
>                 URL: https://issues.apache.org/jira/browse/SPARK-26297
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Wenchen Fan
>            Assignee: Wenchen Fan
>            Priority: Major
>



