[GitHub] [spark] HeartSaVioR commented on a change in pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

GitBox Mon, 15 Feb 2021 12:10:29 -0800


HeartSaVioR commented on a change in pull request #31355:
URL: https://github.com/apache/spark/pull/31355#discussion_r576402899




##########
File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/RequiresDistributionAndOrdering.java
##########
@@ -42,6 +42,19 @@
    */
   Distribution requiredDistribution();
 
+  /**
+   * Returns the number of partitions required by this write if specific distribution is required.
+   * <p>
+   * Implementations may want to override this if it requires the specific number of partitions
+   * on distribution.
+   * <p>
+   * {@link UnspecifiedDistribution} is not affected by this method, as it doesn't require the
+   * specific distribution.
+   *
+   * @return the required number of partitions, non-positive values mean no requirement.
+   */
+  default int requiredNumPartitionsOnDistribution() { return 0; }

Review comment:
   I'm actually more familiar with the word "parallelism", but it seems to be
   used less in Spark; "partition" is used almost everywhere. I'm OK with
   calling it "parallelism", but let's hear more voices on this.

   The name comes from the fact that the number only takes effect when a
   distribution is specified. The longer name is meant to avoid the
   misunderstanding that it also applies to the ordering requirement, which it
   does not. We could probably discuss the impact first and then revisit the
   naming.
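
The rule described above (the number only matters when a distribution is actually required) can be sketched as follows. This is a minimal illustration under stated assumptions, not Spark's planner code: the `Distribution` types are bare stand-ins for the real classes, the interface is a trimmed copy of the one in the diff, and `effectiveNumPartitions` and `writeOf` are hypothetical helpers introduced only for this example.

```java
import java.util.OptionalInt;

class RequiredPartitionsSketch {

    // Stand-ins for Spark's distribution types (assumption: the real classes carry more detail).
    interface Distribution {}
    static final class UnspecifiedDistribution implements Distribution {}
    static final class ClusteredDistribution implements Distribution {}

    // Trimmed copy of the interface from the diff; ordering is omitted on purpose.
    interface RequiresDistributionAndOrdering {
        Distribution requiredDistribution();

        // Non-positive values mean "no requirement", as stated in the Javadoc.
        default int requiredNumPartitionsOnDistribution() { return 0; }
    }

    // Hypothetical factory so the example can build writes concisely.
    static RequiresDistributionAndOrdering writeOf(Distribution distribution, int numPartitions) {
        return new RequiresDistributionAndOrdering() {
            @Override public Distribution requiredDistribution() { return distribution; }
            @Override public int requiredNumPartitionsOnDistribution() { return numPartitions; }
        };
    }

    // Hypothetical planner-side helper: resolves the partition count that should
    // actually be applied, or empty when the write imposes no such requirement.
    static OptionalInt effectiveNumPartitions(RequiresDistributionAndOrdering write) {
        // With no required distribution, the number is ignored entirely.
        if (write.requiredDistribution() instanceof UnspecifiedDistribution) {
            return OptionalInt.empty();
        }
        int requested = write.requiredNumPartitionsOnDistribution();
        return requested > 0 ? OptionalInt.of(requested) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // A specific distribution with a static partition count.
        System.out.println(effectiveNumPartitions(writeOf(new ClusteredDistribution(), 8)));
        // The same count is ignored when no distribution is required.
        System.out.println(effectiveNumPartitions(writeOf(new UnspecifiedDistribution(), 8)));
        // A specific distribution that keeps the default (0) imposes no count.
        System.out.println(effectiveNumPartitions(() -> new ClusteredDistribution()));
    }
}
```

Note that the helper never looks at the ordering requirement, which is the point of the longer method name: a write that only asks for sorting gets no partition count from this method.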



