zhengruifeng commented on code in PR #42770:
URL: https://github.com/apache/spark/pull/42770#discussion_r1314388163


##########
python/pyspark/sql/dataframe.py:
##########
@@ -1809,18 +1810,27 @@ def repartition(  # type: ignore[misc]
 
         Repartition the data into 10 partitions.
 
-        >>> df.repartition(10).rdd.getNumPartitions()
-        10
+        >>> df.repartition(10).explain()
+        == Physical Plan ==

Review Comment:
   I found that the current changes can reflect explanations like
   
   ```
           Repartition the data into 7 partitions by 'age' and 'name' columns.
   ```
   
   ```
           Repartition the data into 2 partitions by range in 'age' column.
           For example, the first partition can have ``(14, "Tom")``, and the second
           partition would have ``(16, "Bob")`` and ``(23, "Alice")``.
   ```
   
   whereas if the examples only show the number of partitions (whether via `rdd.getNumPartitions` or `spark_partition_id()`), they cannot convey such information.
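   
   For illustration (not part of the PR), a minimal sketch of the difference; the session and DataFrame names below are assumptions just for demonstration:
   
   ```python
   from pyspark.sql import SparkSession
   
   spark = SparkSession.builder.appName("repartition-doc-sketch").getOrCreate()
   df = spark.createDataFrame(
       [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"]
   )
   
   # Only reports a number; the reader cannot tell how rows are routed.
   print(df.repartition(7, "age", "name").rdd.getNumPartitions())  # 7
   
   # Prints the physical plan, which names the partitioning expressions
   # (e.g. hash partitioning on `age` and `name`), so the docstring text about
   # "by 'age' and 'name' columns" is visible in the example output.
   df.repartition(7, "age", "name").explain()
   
   # Range repartitioning: the plan shows range partitioning on `age`,
   # matching the "by range in 'age' column" explanation.
   df.repartitionByRange(2, "age").explain()
   ```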



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
