slilichenko commented on code in PR #26975:
URL: https://github.com/apache/beam/pull/26975#discussion_r1214541990


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/RowUpdateInformation.java:
##########
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.gcp.bigquery;
+
+import com.google.auto.value.AutoValue;
+
+/**
+ * This class indicates how to apply a row update to BigQuery. If UPDATE or DELETE is selected as
+ * the update type, then a sequence number must also be supplied to order the updates. Incorrect
+ * sequence numbers will result in unexpected state in the BigQuery table.
+ */
+@AutoValue
+public abstract class RowUpdateInformation {
+  public enum UpdateType {
+    // Insert the row into the table, ignoring the primary key. Inserting the same row twice will
+    // result in two copies
+    // of the row in the table.
+    INSERT,

Review Comment:
   Curious why INSERT is needed in the first place. It looks like you just 
convert INSERT records into regular inserts and leave out the CDC semantics. 
If you forced UPSERT for everything, wouldn't that be safer? Given the 
AT_LEAST_ONCE semantics required here, these INSERTs can result in duplicates 
even when the input feed is "perfect" and there is no end-user error in the 
pipeline design.
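
   To make the concern concrete, here is a minimal, self-contained sketch. It 
is not Beam or BigQuery code: the `Change` class and both `apply*` methods are 
hypothetical stand-ins, and the "table" is an in-memory collection. It only 
illustrates that a record redelivered under at-least-once semantics is 
duplicated when appended as an INSERT, but collapses to a single row when 
applied as an UPSERT keyed on the primary key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UpsertVsInsertSketch {
  /** Hypothetical change record: a primary key plus a payload value. */
  public static final class Change {
    final String key;
    final String value;

    public Change(String key, String value) {
      this.key = key;
      this.value = value;
    }
  }

  /** INSERT-style apply: every delivered record is appended, ignoring the key. */
  public static List<Change> applyInserts(List<Change> delivered) {
    return new ArrayList<>(delivered);
  }

  /** UPSERT-style apply: the latest record delivered for a key replaces earlier ones. */
  public static Map<String, String> applyUpserts(List<Change> delivered) {
    Map<String, String> table = new LinkedHashMap<>();
    for (Change c : delivered) {
      table.put(c.key, c.value);
    }
    return table;
  }

  public static void main(String[] args) {
    Change row = new Change("id-1", "a");
    // Assumption: at-least-once delivery hands the sink the same record twice.
    List<Change> delivered = List.of(row, row);

    System.out.println(applyInserts(delivered).size()); // prints 2 (duplicate row)
    System.out.println(applyUpserts(delivered).size()); // prints 1 (idempotent)
  }
}
```

   The sketch ignores sequence numbers and deletes entirely; it only shows why 
an INSERT path cannot be made idempotent by the sink alone.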



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

