advancedxy commented on code in PR #45267:
URL: https://github.com/apache/spark/pull/45267#discussion_r1521751659
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##########
@@ -635,6 +636,22 @@ trait ShuffleSpec {
*/
def createPartitioning(clustering: Seq[Expression]): Partitioning =
throw SparkUnsupportedOperationException()
+
+ /**
+ * Return a set of [[Reducer]] for the partition expressions of this shuffle spec,
+ * on the partition expressions of another shuffle spec.
+ * <p>
+ * A [[Reducer]] exists for a partition expression function of this shuffle spec if it is
+ * 'reducible' on the corresponding partition expression function of the other shuffle spec.
+ * <p>
+ * If a value is returned, there must be one Option[[Reducer]] per partition expression.
+ * A None value in the set indicates that the particular partition expression is not reducible
+ * on the corresponding expression on the other shuffle spec.
+ * <p>
+ * Returning none also indicates that none of the partition expressions can be reduced on the
+ * corresponding expression on the other shuffle spec.
+ */
+ def reducers(spec: ShuffleSpec): Option[Seq[Option[Reducer[_]]]] = None
Review Comment:
> The Reducer here allows data sources to specify relationships between
> transforms beyond the bucketing case.
Yeah, I can see the potential usage now. However, it's still hard for
developers to correctly understand what `Reducer` actually means (though it's
mathematically clear) and how it works. Maybe we should add some concrete
examples to the JavaDoc of the `Reducer` class. It would also be great to
demonstrate a use case beyond bucketing in the tests, but I think that's
optional.
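For instance, something along these lines could serve as a starting point for
the JavaDoc example. This is only a rough sketch under assumptions:
`SimpleReducer` is a hypothetical stand-in rather than the actual `Reducer`
type from this PR, `bucket(n, x)` is assumed to be a non-negative
`hash(x) mod n`, and the hour transform is assumed to yield hours since the
epoch.

```scala
// Hypothetical stand-in for the Reducer discussed in this PR (not Spark's actual type).
trait SimpleReducer[I, O] {
  def reduce(arg: I): O
}

object ReducerSketch {
  // Assumption: a bucket transform is a non-negative modulo of the value's hash.
  def bucket(numBuckets: Int, hash: Int): Int = Math.floorMod(hash, numBuckets)

  // bucket(4, x) is 'reducible' on bucket(2, x): because 2 divides 4,
  // (h mod 4) mod 2 == h mod 2 for every hash h.
  val bucket4To2: SimpleReducer[Int, Int] = new SimpleReducer[Int, Int] {
    def reduce(arg: Int): Int = Math.floorMod(arg, 2)
  }

  // Beyond bucketing (assumed example): hours since epoch reduce to days since epoch.
  val hoursToDays: SimpleReducer[Long, Long] = new SimpleReducer[Long, Long] {
    def reduce(arg: Long): Long = Math.floorDiv(arg, 24L)
  }

  def main(args: Array[String]): Unit = {
    // Reducing the 4-bucket value must agree with bucketing by 2 directly.
    Seq(-7, 0, 5, 10, 123456).foreach { h =>
      assert(bucket4To2.reduce(bucket(4, h)) == bucket(2, h))
    }
    // Hour 49 since epoch falls on day 2 since epoch.
    assert(hoursToDays.reduce(49L) == 2L)
  }
}
```

The point of the sketch is that a reducer maps an already-computed partition
value of one side onto the coarser partition value of the other side, so both
sides end up with comparable partition keys without re-evaluating the
transform on the raw data.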
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]