cloud-fan commented on a change in pull request #32932:
URL: https://github.com/apache/spark/pull/32932#discussion_r657082306
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
##########
@@ -1351,6 +1351,31 @@ object RepartitionByExpression {
}
}
+/**
+ * This operator used to rebalance the query result output partitions, so that every partition
+ * is of a reasonable size (not too small and not too big). It can take column names as parameters,
+ * and try its best to partition the query result by these columns. If there are skews, Spark will
+ * split the skewed partitions, to make these partitions not too big. This operator is useful when
+ * you need to write the result of this query to a table, to avoid too small/big files.
+ *
+ * Note that, only AQE is enabled does the operator make sense.
Review comment:
Note that, this operator only makes sense when AQE is enabled.
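
For context, a minimal sketch of how the AQE dependency shows up for a user. It assumes a Spark release (3.2 or later) in which this operator is reachable through the `REBALANCE` SQL hint. The object name, the `events` view, and the `key` column are hypothetical and exist only for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical driver object, not part of the pull request.
object RebalanceHintSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("rebalance-hint-sketch")
      // The operator is only meaningful with adaptive query execution enabled.
      .config("spark.sql.adaptive.enabled", "true")
      .getOrCreate()

    // Hypothetical input: 1000 rows spread over 10 key values.
    spark.range(0, 1000)
      .selectExpr("id", "id % 10 AS key")
      .createOrReplaceTempView("events")

    // Ask Spark to rebalance the output partitions, preferably by `key`.
    val rebalanced = spark.sql("SELECT /*+ REBALANCE(key) */ id, key FROM events")

    // Print the plan to see how the hint was handled.
    rebalanced.explain()

    spark.stop()
  }
}
```

Setting `spark.sql.adaptive.enabled` to `false` in the same sketch is a quick way to compare the plans with and without AQE.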