[GitHub] [iceberg] aokolnychyi commented on a change in pull request #1974: Flink: Add ChangeLog DataStream end-to-end unit tests.

GitBox Wed, 30 Dec 2020 10:01:32 -0800


aokolnychyi commented on a change in pull request #1974:
URL: https://github.com/apache/iceberg/pull/1974#discussion_r550279316




##########
File path: flink/src/main/java/org/apache/iceberg/flink/sink/FlinkSink.java
##########
@@ -169,6 +172,17 @@ public Builder writeParallelism(int newWriteParallelism) {
       return this;
     }
 
+    /**
+     * Configuring the equality field columns for iceberg table that accept CDC or UPSERT events.
+     *
+     * @param columns defines the iceberg table's key.
+     * @return {@link Builder} to connect the iceberg table.
+     */
+    public Builder equalityFieldColumns(List<String> columns) {

Review comment:
       I'd vote for not ensuring uniqueness, as it is really hard at scale. If
we are to ensure this at write time, we have to join the incoming data with the
target table, which makes it really expensive. Doing this at read time would
require sorting the data not only by the sort key but also by the sequence
number.
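
To make the read-side cost concrete, here is a rough sketch in plain Java. It is not Iceberg code: the `Row` class, its fields, and `latestPerKey` are hypothetical names made up for illustration. It assumes each row carries its equality key and the sequence number of the commit that wrote it, and that the row with the highest sequence number wins for a given key.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ReadSideDedup {
  // Hypothetical row shape (an assumption for this sketch, not an Iceberg type):
  // an equality key, the sequence number of the writing commit, and a payload.
  static final class Row {
    final String key;
    final long sequenceNumber;
    final String value;

    Row(String key, long sequenceNumber, String value) {
      this.key = key;
      this.sequenceNumber = sequenceNumber;
      this.value = value;
    }
  }

  // Sorts by key, then by sequence number, and keeps the last row of each key group.
  static List<Row> latestPerKey(List<Row> rows) {
    List<Row> sorted = new ArrayList<>(rows);
    sorted.sort(Comparator.comparing((Row r) -> r.key)
        .thenComparingLong(r -> r.sequenceNumber));

    List<Row> result = new ArrayList<>();
    for (int i = 0; i < sorted.size(); i++) {
      boolean lastOfKey = i + 1 == sorted.size()
          || !sorted.get(i + 1).key.equals(sorted.get(i).key);
      if (lastOfKey) {
        result.add(sorted.get(i));
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Row> rows = List.of(
        new Row("a", 1L, "old"),
        new Row("b", 2L, "only"),
        new Row("a", 3L, "new"));
    for (Row r : latestPerKey(rows)) {
      System.out.println(r.key + " -> " + r.value);
    }
  }
}
```

The extra sort on the sequence number is the part that gets expensive at scale, since every reader would pay for it on every scan.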




