namrathamyske commented on code in PR #8042:
URL: https://github.com/apache/iceberg/pull/8042#discussion_r1277927973


##########
spark/v3.3/spark-extensions/src/main/scala/org/apache/spark/sql/execution/datasources/v2/ExtendedDistributionAndOrderingUtils.scala:
##########
@@ -57,10 +61,34 @@ object ExtendedDistributionAndOrderingUtils {
         } else {
           conf.numShufflePartitions
         }
-        // the conversion to catalyst expressions above produces SortOrder expressions
-        // for OrderedDistribution and generic expressions for ClusteredDistribution
-        // this allows RepartitionByExpression to pick either range or hash partitioning
-        RepartitionByExpression(ArraySeq.unsafeWrapArray(distribution), query, finalNumPartitions)
+
+        val tableProperties = table match {
+          case d : RowLevelOperationTable => d.table.properties()
+          case _ : Table => table.properties()
+        }
+
+        val isHashDistributionMode = write.requiredDistribution match {
+          case _ : ClusteredDistribution => true
+          case _ => false
+        }
+
+        val strictDistributionMode = tableProperties
+          .getOrDefault(TableProperties.STRICT_TABLE_DISTRIBUTION_AND_ORDERING,
+            TableProperties.STRICT_TABLE_DISTRIBUTION_AND_ORDERING_DEFAULT)
+
+        if (strictDistributionMode.equals("false") && isHashDistributionMode) {
+          // if strict distribution mode is not enabled, then we fallback to spark AQE
+          // to determine the number of partitions by colaesceing and un-skewing partitions
+          // Also to note, Rebalance is only supported for hash distribution mode till spark 3.3

Review Comment:
   @aokolnychyi In Spark 3.4, the rebalance operator supports both hash and 
range distribution, but in Spark 3.3 only hash is supported. I will change the 
statement to `in spark 3.3`. Do we want to fall back to rebalance even for the 
none distribution mode (aka round-robin partitioning)? I am not sure about this.
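
   To make the question concrete, here is a minimal, self-contained sketch of 
the decision as described in this thread. It is not the PR's code: every name 
below (`DistributionMode`, `ShuffleNode`, `ShuffleChoice`) is hypothetical and 
none of them is a Spark or Iceberg API. The version rule is only the one stated 
above (hash in Spark 3.3, hash and range in Spark 3.4), and the none mode is 
conservatively excluded because that point is still open.

```scala
// Hypothetical model of the choice discussed in this thread.
// None of these types are Spark or Iceberg classes.
sealed trait DistributionMode
case object HashMode extends DistributionMode
case object RangeMode extends DistributionMode
case object NoneMode extends DistributionMode

sealed trait ShuffleNode
case object Rebalance extends ShuffleNode   // AQE may coalesce or split partitions
case object Repartition extends ShuffleNode // fixed number of shuffle partitions

object ShuffleChoice {
  // Assumption taken from this thread, covering only Spark 3.3 and 3.4:
  // 3.3 can rebalance hash distributions only; 3.4 also handles range.
  def rebalanceSupported(sparkMinor: Int, mode: DistributionMode): Boolean =
    mode match {
      case HashMode  => true
      case RangeMode => sparkMinor >= 4
      case NoneMode  => false // open question; excluded in this sketch
    }

  // Fall back to Rebalance only when strict mode is off and the
  // distribution is one the given Spark version can rebalance.
  def choose(strict: Boolean, sparkMinor: Int, mode: DistributionMode): ShuffleNode =
    if (!strict && rebalanceSupported(sparkMinor, mode)) Rebalance else Repartition
}
```

   With this shape, `ShuffleChoice.choose(strict = false, sparkMinor = 3, mode = RangeMode)` 
stays on `Repartition`, while the same call with `sparkMinor = 4` picks 
`Rebalance`. Flipping the `NoneMode` case is the only change needed if we 
decide round-robin writes should also rebalance.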


