Reo-LEI commented on a change in pull request #2863:
URL: https://github.com/apache/iceberg/pull/2863#discussion_r697884300



##########
File path: flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java
##########
@@ -321,7 +339,27 @@ private String operatorName(String suffix) {
           equalityFieldIds.add(field.fieldId());
         }
       }
-      IcebergStreamWriter<RowData> streamWriter = createStreamWriter(table, flinkRowType, equalityFieldIds);
+
+      // Fallback to use upsert mode parsed from table properties if don't specify in job level.
+      boolean upsertMode = upsert || PropertyUtil.propertyAsBoolean(table.properties(),
+          UPSERT_MODE_ENABLED, UPSERT_MODE_ENABLED_DEFAULT);
+
+      // Validate the equality fields and partition fields if we enable the upsert mode.
+      if (upsertMode) {
+        Preconditions.checkState(!overwrite,
+            "OVERWRITE mode shouldn't be enable when configuring to use UPSERT data stream.");
+        Preconditions.checkState(!equalityFieldIds.isEmpty(),
+            "Equality field columns shouldn't be empty when configuring to use UPSERT data stream.");
+        if (!table.spec().isUnpartitioned()) {

Review comment:
       
https://github.com/apache/iceberg/blob/4d33f18f5fcd4c20aea6d4118bd03e0d181271d0/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java#L222
   As @openinx commented above, we should require the partition fields to be a
subset of the equality fields, so that we can delete the old data within the
same partition.
   
   > e.g., we can have an equality field (like user_id) and table can be 
partitioned by hour. would that be a valid scenario?
   
   I don't think that is a valid scenario: keeping `user_id` unique across all
the different `hour` partitions makes no sense.
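
   To make the suggestion concrete, here is a minimal sketch of the check,
assuming plain collections stand in for the source column ids of the table's
partition fields and for `equalityFieldIds`. The class name, the method name and
the field ids (1 = `user_id`, 2 = `hour`) are hypothetical and only for
illustration; this is not the code in the PR.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UpsertPartitionCheck {

  // Hypothetical helper: fails if any partition source column is not an equality field.
  static void checkPartitionFieldsInEqualityFields(
      List<Integer> partitionSourceIds, Set<Integer> equalityFieldIds) {
    for (Integer sourceId : partitionSourceIds) {
      if (!equalityFieldIds.contains(sourceId)) {
        throw new IllegalStateException(String.format(
            "In UPSERT mode, partition source field id %d should be included in equality fields %s",
            sourceId, equalityFieldIds));
      }
    }
  }

  public static void main(String[] args) {
    // Assumed field ids: 1 = user_id, 2 = hour.
    Set<Integer> userIdOnly = new HashSet<>(Arrays.asList(1));
    Set<Integer> userIdAndHour = new HashSet<>(Arrays.asList(1, 2));
    List<Integer> partitionedByHour = Arrays.asList(2);

    // Accepted: hour is both a partition source column and an equality field.
    checkPartitionFieldsInEqualityFields(partitionedByHour, userIdAndHour);

    // Rejected: the table is partitioned by hour, but only user_id is an equality field.
    try {
      checkPartitionFieldsInEqualityFields(partitionedByHour, userIdOnly);
    } catch (IllegalStateException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

   With a check like this, the `user_id` + partition-by-`hour` case quoted above
would fail fast when the sink is built instead of leaving stale rows behind in
other partitions.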




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


