Reo-LEI commented on a change in pull request #2863:
URL: https://github.com/apache/iceberg/pull/2863#discussion_r697884300
##########
File path: flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java
##########
@@ -321,7 +339,27 @@ private String operatorName(String suffix) {
equalityFieldIds.add(field.fieldId());
}
}
- IcebergStreamWriter<RowData> streamWriter = createStreamWriter(table, flinkRowType, equalityFieldIds);
+
+ // Fallback to use upsert mode parsed from table properties if don't specify in job level.
+ boolean upsertMode = upsert || PropertyUtil.propertyAsBoolean(table.properties(),
+ UPSERT_MODE_ENABLED, UPSERT_MODE_ENABLED_DEFAULT);
+
+ // Validate the equality fields and partition fields if we enable the upsert mode.
+ if (upsertMode) {
+ Preconditions.checkState(!overwrite,
+ "OVERWRITE mode shouldn't be enable when configuring to use UPSERT data stream.");
+ Preconditions.checkState(!equalityFieldIds.isEmpty(),
+ "Equality field columns shouldn't be empty when configuring to use UPSERT data stream.");
+ if (!table.spec().isUnpartitioned()) {
Review comment:
https://github.com/apache/iceberg/blob/4d33f18f5fcd4c20aea6d4118bd03e0d181271d0/flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java#L222
As @openinx commented above, we should require the partition fields to be a
subset of the equality fields, to ensure we can delete the old data in the same
partition.
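
To make the restriction concrete, here is a minimal, self-contained sketch of the subset check using plain field ids. The class and method names are hypothetical and this is not the code in this PR; in `FlinkSink` the ids would presumably come from `equalityFieldIds` and from the `sourceId()` of each field in `table.spec().fields()`.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only.
public class UpsertPartitionCheck {

  // Returns the partition source field ids that are NOT covered by the equality field ids.
  public static Set<Integer> missingFromEquality(Set<Integer> equalityFieldIds,
                                                 List<Integer> partitionSourceIds) {
    Set<Integer> missing = new LinkedHashSet<>(partitionSourceIds);
    missing.removeAll(equalityFieldIds);
    return missing;
  }

  // Fails if any partition source field is not also an equality field.
  public static void validate(Set<Integer> equalityFieldIds, List<Integer> partitionSourceIds) {
    Set<Integer> missing = missingFromEquality(equalityFieldIds, partitionSourceIds);
    if (!missing.isEmpty()) {
      throw new IllegalStateException(
          "In UPSERT mode, partition source field ids " + missing +
          " should be included in equality field ids " + equalityFieldIds);
    }
  }

  public static void main(String[] args) {
    // Assumed ids: 1 = user_id, 2 = the column the table is partitioned on.
    validate(Set.of(1, 2), List.of(2)); // accepted: partition field is an equality field

    try {
      validate(Set.of(1), List.of(2)); // rejected: partition field is not an equality field
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```
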
> e.g., we can have an equality field (like user_id) and table can be
partitioned by hour. would that be a valid scenario?
I don't think that is a valid scenario: keeping `user_id` unique across all the
different `hour` partitions makes no sense.
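
A toy model of why this breaks, assuming (as described above) that an upsert can only delete the old row inside the partition it writes to. Everything here is hypothetical and uses in-memory maps instead of Iceberg data and delete files.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model, for illustration only.
public class PartitionScopedUpsertDemo {

  // hour partition -> (user_id -> value)
  private final Map<Integer, Map<String, String>> partitions = new HashMap<>();

  // Upsert scoped to a single partition: delete the old row by key, then insert,
  // but only inside the `hour` partition the new row belongs to.
  public void upsert(int hour, String userId, String value) {
    Map<String, String> rows = partitions.computeIfAbsent(hour, h -> new HashMap<>());
    rows.remove(userId);
    rows.put(userId, value);
  }

  // Counts how many rows exist for the given user_id across all partitions.
  public int countRows(String userId) {
    int count = 0;
    for (Map<String, String> rows : partitions.values()) {
      if (rows.containsKey(userId)) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    PartitionScopedUpsertDemo table = new PartitionScopedUpsertDemo();
    table.upsert(10, "u1", "a");
    table.upsert(10, "u1", "b"); // same hour: the old row is replaced
    table.upsert(11, "u1", "c"); // different hour: the row in hour 10 is left behind
    System.out.println("rows for u1: " + table.countRows("u1"));
  }
}
```

With `user_id` as the only equality field and `hour` as the partition field, the third upsert cannot reach the row written in the earlier hour, so the key is no longer unique in the table.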