[GitHub] [spark] cloud-fan commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely

GitBox Wed, 23 Jun 2021 08:18:49 -0700


cloud-fan commented on a change in pull request #32932:
URL: https://github.com/apache/spark/pull/32932#discussion_r657209115




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala
##########
@@ -93,9 +93,16 @@ case object REPARTITION_BY_COL extends ShuffleOrigin
 case object REPARTITION_BY_NUM extends ShuffleOrigin
 
 // Indicates that the shuffle operator was added by the user-specified repartition operator. Spark
-// firstly tries to coalesce partitions, if it cannot be coalesced, then use the local shuffle
-// reader.
-case object REPARTITION_BY_NONE extends ShuffleOrigin
+// will try to rebalance partitions that make per-partition size not too small and not too big,
+// if can not rebalance partitions then use the local shuffle reader.

Review comment:
       >  if can not rebalance partitions then use the local shuffle reader.
   
   This is incorrect: the local shuffle reader can also balance the partitions. Let's say instead:
   ```
   Local shuffle reader will be used if possible to reduce network traffic.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



