kbendick commented on a change in pull request #2898:
URL: https://github.com/apache/iceberg/pull/2898#discussion_r809653028



##########
File path: flink/v1.14/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java
##########
@@ -412,18 +417,38 @@ private String operatorName(String suffix) {
 
       switch (writeMode) {
         case NONE:
+          if (!equalityFieldIds.isEmpty()) {
+            LOG.info("Distribute rows by equality fields in '{}' distribution mode", DistributionMode.NONE.modeName());
+            return input.keyBy(new EqualityFieldKeySelector(equalityFieldIds, iSchema, flinkRowType));
+          }
+
           return input;
 
         case HASH:
           if (partitionSpec.isUnpartitioned()) {
+            if (!equalityFieldIds.isEmpty()) {
+              LOG.info("Distribute rows by equality fields in '{}' distribution mode, because table is unpartitioned",
+                  DistributionMode.HASH.modeName());
+              return input.keyBy(new EqualityFieldKeySelector(equalityFieldIds, iSchema, flinkRowType));
+            }
+
+            LOG.warn("Fallback to use '{}' distribution mode, because table is unpartitioned",
+                DistributionMode.NONE.modeName());

Review comment:
       > This is happening because the table is unpartitioned and there are no 
equality fields set. 
   
   I think what Ryan means is that if we’re logging the reason a change is 
occurring, we should log the full reason. An unpartitioned table with equality 
field ids set wouldn’t fall back to NONE, but the log message tells the user 
only part of the reasoning, so they have incomplete information about how to 
change the behavior.
   
   Users set writeMode themselves, so some of them will likely want to work out 
how to get the desired distribution mode in their job. With the current 
message, people may think they _have_ to partition their table when they could 
just set equality fields and avoid changing their data layout.
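
   Something along these lines might work. This is only a rough sketch, not the actual `FlinkSink` code: the `FallbackReasonSketch` class, the `hashModeMessage` helper, the literal mode names, and the exact wording of the fallback message are all made up for illustration.

```java
import java.util.List;

public class FallbackReasonSketch {

  /**
   * Hypothetical helper: builds the log line for HASH write mode on an
   * unpartitioned table. The mode names 'hash' and 'none' are written as
   * literals here and are an assumption about what modeName() returns.
   */
  static String hashModeMessage(List<Integer> equalityFieldIds) {
    if (!equalityFieldIds.isEmpty()) {
      // Rows are keyed by the equality fields, so there is no fallback to report.
      return "Distribute rows by equality fields in 'hash' distribution mode, "
          + "because table is unpartitioned";
    }

    // Name both conditions, so users can see that setting equality fields is
    // an alternative to partitioning the table.
    return "Fallback to use 'none' distribution mode, because table is unpartitioned "
        + "and no equality fields are set";
  }

  public static void main(String[] args) {
    System.out.println(hashModeMessage(List.of()));
    System.out.println(hashModeMessage(List.of(1, 2)));
  }
}
```

   With both conditions in the message, a user who hits the fallback can tell that either partitioning the table or setting equality fields would change the behavior.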




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


