ahmedabu98 commented on code in PR #32529:
URL: https://github.com/apache/beam/pull/32529#discussion_r1799340244


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/providers/BigQueryStorageWriteApiSchemaTransformProvider.java:
##########
@@ -365,6 +394,34 @@ public TableSchema getSchema(String destination) {
       }
     }
 
+    private static class CdcWritesDynamicDestination extends RowDynamicDestinations {

Review Comment:
   For simplicity, I suggest we expand `RowDynamicDestinations` to take an 
optional primary key, i.e. `RowDynamicDestinations(Schema schema, @Nullable 
List<String> primaryKey)`.
   
   We can then move this `getTableConstraints()` implementation into 
`RowDynamicDestinations`: if `primaryKey` is set, return an appropriate 
`TableConstraints` object; otherwise return null.
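
   A minimal sketch of that shape, to make the suggestion concrete. It is self-contained and does not use the Beam or BigQuery classes: `TableConstraintsSketch` and `RowDynamicDestinationsSketch` are hypothetical stand-ins, the `Schema` argument is left out, and the `String destination` parameter is only an assumption for illustration.

```java
import java.util.List;

// Hypothetical stand-in for a TableConstraints object; only carries primary key columns.
final class TableConstraintsSketch {
  final List<String> primaryKeyColumns;

  TableConstraintsSketch(List<String> primaryKeyColumns) {
    this.primaryKeyColumns = primaryKeyColumns;
  }
}

// Hypothetical, simplified stand-in for RowDynamicDestinations.
// The real class would also take the Schema; it is omitted here.
final class RowDynamicDestinationsSketch {
  // Null when no primary key was configured (the @Nullable parameter in the suggestion).
  private final List<String> primaryKey;

  RowDynamicDestinationsSketch(List<String> primaryKey) {
    this.primaryKey = primaryKey;
  }

  // If a primary key exists, return constraints built from it; otherwise return null.
  TableConstraintsSketch getTableConstraints(String destination) {
    if (primaryKey == null) {
      return null;
    }
    return new TableConstraintsSketch(primaryKey);
  }
}
```

   With this, the non-CDC path would simply pass `null` for the primary key and get `null` constraints back, so a separate subclass should not be needed.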



##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/providers/BigQueryStorageWriteApiSchemaTransformProvider.java:
##########
@@ -498,5 +563,52 @@ BigQueryIO.Write<Row> createStorageWriteApiTransform(Schema schema) {
 
       return write;
     }
+
+    BigQueryIO.Write<Row> validateAndIncludeCDCInformation(
+        BigQueryIO.Write<Row> write, Schema schema) {
+      checkArgument(
+          schema.getFieldNames().containsAll(Arrays.asList(ROW_PROPERTY_MUTATION_INFO, "record")),
+          "When writing using CDC functionality, we expect Row Schema with a "
+              + "\""
+              + ROW_PROPERTY_MUTATION_INFO
+              + "\" Row field and a \"record\" Row field.");
+      checkArgument(
+          schema
+              .getField(ROW_PROPERTY_MUTATION_INFO)
+              .getType()
+              .getRowSchema()
+              .equals(ROW_SCHEMA_MUTATION_INFO),
+          "When writing using CDC functionality, we expect a \""
+              + ROW_PROPERTY_MUTATION_INFO
+              + "\" field of Row type with fields \""
+              + ROW_PROPERTY_MUTATION_TYPE
+              + "\" and \""
+              + ROW_PROPERTY_MUTATION_SQN
+              + "\" both of type string.");
+
+      String tableDestination = null;
+
+      if (configuration.getTable().equals(DYNAMIC_DESTINATIONS)) {
+        validateDynamicDestinationsExpectedSchema(schema);
+      } else {
+        tableDestination = configuration.getTable();
+      }
+
+      return write
+          .to(
+              new CdcWritesDynamicDestination(
+                  schema.getField("record").getType().getRowSchema(),
+                  tableDestination,
+                  configuration.getPrimaryKey()))

Review Comment:
   After resolving to just `RowDynamicDestinations`, we can remove these lines. 
The rest of this part looks good.



##########
sdks/python/apache_beam/io/external/xlang_bigqueryio_it_test.py:
##########
@@ -259,11 +259,11 @@ def test_write_with_beam_rows_cdc(self):
 
     rows_with_cdc = [
         beam.Row(
-            cdc_info=beam.Row(
+            row_mutation_info=beam.Row(

Review Comment:
   Let's add an identical test for Python dicts (I remember you had one 
previously with the callable function, but we can leave that part out).



##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/providers/BigQueryStorageWriteApiSchemaTransformProvider.java:
##########
@@ -466,11 +530,11 @@ BigQueryIO.Write<Row> createStorageWriteApiTransform(Schema schema) {
               .withFormatFunction(BigQueryUtils.toTableRow())
               .withWriteDisposition(WriteDisposition.WRITE_APPEND);
 
-      if (configuration.getTable().equals(DYNAMIC_DESTINATIONS)) {
-        checkArgument(
-            schema.getFieldNames().equals(Arrays.asList("destination", "record")),
-            "When writing to dynamic destinations, we expect Row Schema with a "
-                + "\"destination\" string field and a \"record\" Row field.");
+      // in case CDC writes are configured we validate and include them in the configuration
+      if (Optional.ofNullable(configuration.getUseCdcWrites()).orElse(false)) {
+        write = validateAndIncludeCDCInformation(write, schema);

Review Comment:
   After resolving to just `RowDynamicDestinations`, we can bring this check 
down below. i.e. the order should be:
   
   - if DynamicDestinations, apply `to(RowDynamicDestinations)`
   - else, apply `to(table)`
   - ...
   - if CdcWrites, apply CDC information
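
   The ordering above can be sketched as follows. This is only an illustration under stated assumptions: it does not call `BigQueryIO`, the `WriteOrderingSketch` class and its `configure` method are hypothetical, the value of `DYNAMIC_DESTINATIONS` is a placeholder, and each step is recorded as a string so the order is visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: records the order in which write options would be applied.
final class WriteOrderingSketch {
  // Placeholder value; the real constant's value is not shown in this diff.
  static final String DYNAMIC_DESTINATIONS = "DYNAMIC_DESTINATIONS";

  static List<String> configure(String table, Boolean useCdcWrites) {
    List<String> steps = new ArrayList<>();

    // 1. Destination is resolved first, independent of CDC.
    if (DYNAMIC_DESTINATIONS.equals(table)) {
      steps.add("to(RowDynamicDestinations)");
    } else {
      steps.add("to(" + table + ")");
    }

    // 2. ... other write options would be applied here ...

    // 3. CDC information is applied last, and only when configured.
    //    A null flag is treated as false, mirroring the Optional check in the diff.
    if (Optional.ofNullable(useCdcWrites).orElse(false)) {
      steps.add("apply CDC information");
    }
    return steps;
  }
}
```

   Keeping the destination branch first means the CDC step only has to add mutation info and the primary key, rather than choosing the destination again.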


