sunchao commented on code in PR #42306:
URL: https://github.com/apache/spark/pull/42306#discussion_r1320031556
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala:
##########
@@ -672,9 +708,17 @@ case class HashShuffleSpec(
override def numPartitions: Int = partitioning.numPartitions
}
+/**
+ * [[ShuffleSpec]] created by [[KeyGroupedPartitioning]].
+ * @param partitioning key grouped partitioning
Review Comment:
nit: leave an empty line above the first `@param`
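For illustration, a minimal sketch of how the Scaladoc might read with the blank line added above the first `@param` (only the comment is shown; the class definition beneath it in the PR is unchanged):

```scala
/**
 * [[ShuffleSpec]] created by [[KeyGroupedPartitioning]].
 *
 * @param partitioning key grouped partitioning
 */
```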
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1530,6 +1530,18 @@ object SQLConf {
.booleanConf
.createWithDefault(false)
+  val V2_BUCKETING_ALLOW_JOIN_KEYS_SUBSET_OF_PARTITION_KEYS =
+    buildConf("spark.sql.sources.v2.bucketing.allowJoinKeysSubsetOfPartitionKeys.enabled")
+      .doc("Whether to allow storage-partition join in the case where join keys are" +
+        "a subset of the partition keys of the source tables. At planning time, " +
+        "Spark will group the partitions by only those keys that are in the join keys." +
+        "This is currently enabled only if spark.sql.requireAllClusterKeysForDistribution " +
Review Comment:
nit: replace `spark.sql.requireAllClusterKeysForDistribution` with `${REQUIRE_ALL_CLUSTER_KEYS_FOR_DISTRIBUTION.key}` (and add `s` to the beginning of this line)
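For illustration, a self-contained sketch of the suggested s-interpolation. The nested object here is only a stand-in for the `REQUIRE_ALL_CLUSTER_KEYS_FOR_DISTRIBUTION` entry defined earlier in SQLConf, and the doc string is cut off where the quoted hunk ends:

```scala
object DocInterpolationSketch {
  // Stand-in for the real ConfigEntry in SQLConf; only its key is needed here.
  object REQUIRE_ALL_CLUSTER_KEYS_FOR_DISTRIBUTION {
    val key: String = "spark.sql.requireAllClusterKeysForDistribution"
  }

  // The last segment uses the s-interpolator so the doc text always matches the
  // other config's key instead of hard-coding its name.
  val doc: String =
    "Whether to allow storage-partition join in the case where join keys are " +
      "a subset of the partition keys of the source tables. At planning time, " +
      "Spark will group the partitions by only those keys that are in the join keys. " +
      s"This is currently enabled only if ${REQUIRE_ALL_CLUSTER_KEYS_FOR_DISTRIBUTION.key} "
      // (remainder of the doc string is truncated in the quoted hunk)

  def main(args: Array[String]): Unit = println(doc)
}
```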