Polber commented on code in PR #32008:
URL: https://github.com/apache/beam/pull/32008#discussion_r1729467956


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/StructUtils.java:
##########
@@ -52,6 +67,58 @@ public static Row structToBeamRow(Struct struct, Schema schema) {
     return Row.withSchema(schema).withFieldValues(structValues).build();
   }
 
+  public static Schema structTypeToBeamRowSchema(StructType structType, boolean isRead) {
+    Schema.Builder beamSchema = Schema.builder();
+    structType
+        .getFieldsList()
+        .forEach(
+            field -> {
+              Schema.FieldType fieldType;
+              try {
+                fieldType = convertSpannerTypeToBeamFieldType(field.getType());
+              } catch (IllegalArgumentException e) {
+                throw new IllegalArgumentException(
+                    "Error processing struct to row: " + e.getMessage());
+              }
+              // Treat reads from Spanner as Nullable and leave Null handling to Spanner
+              if (isRead) {
+                beamSchema.addNullableField(field.getName(), fieldType);
+              } else {
+                beamSchema.addField(field.getName(), fieldType);
+              }
+            });
+    return beamSchema.build();
+  }
+
+  public static Schema.FieldType convertSpannerTypeToBeamFieldType(
+      com.google.spanner.v1.Type spannerType) {
+    switch (spannerType.getCode()) {
+      case BOOL:
+        return Schema.FieldType.BOOLEAN;
+      case BYTES:
+        return Schema.FieldType.BYTES;
+      case TIMESTAMP:
+      case DATE:
+        return Schema.FieldType.DATETIME;
+      case INT64:
+        return Schema.FieldType.INT64;
+      case FLOAT32:
+        return Schema.FieldType.FLOAT;
+      case FLOAT64:
+        return Schema.FieldType.DOUBLE;
+      case NUMERIC:
+        return Schema.FieldType.DECIMAL;

Review Comment:
   Data types do not map 1:1 across the many sources Beam supports (BigQuery,
for example, has many types that are exclusive to BigQuery and have no Java
equivalent), so we map each source type to the closest Java type.
   
   For example, `FLOAT64` is not a Java type, but `Double` is implemented as a
64-bit float, so the conversion works when the data is deserialized.
   
   As I said in my other comment, all the data types added in this PR were
tested, and all the data conversions worked successfully.
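
   To illustrate the "closest Java type" idea, here is a minimal standalone
sketch. It is not the code in this PR: the `SourceType` enum and the
`closestJavaType` helper are hypothetical stand-ins for the Spanner type codes
and the Beam field types, chosen only to show the shape of the mapping.

```java
import java.math.BigDecimal;
import java.util.EnumMap;
import java.util.Map;

public class ClosestTypeSketch {
  // Hypothetical stand-in for a subset of Spanner type codes
  // (not the real com.google.spanner.v1.TypeCode).
  enum SourceType { BOOL, BYTES, INT64, FLOAT32, FLOAT64, NUMERIC }

  // Each source type is paired with the Java type that represents it most closely.
  private static final Map<SourceType, Class<?>> CLOSEST = new EnumMap<>(SourceType.class);

  static {
    CLOSEST.put(SourceType.BOOL, Boolean.class);
    CLOSEST.put(SourceType.BYTES, byte[].class);
    CLOSEST.put(SourceType.INT64, Long.class);
    CLOSEST.put(SourceType.FLOAT32, Float.class);
    CLOSEST.put(SourceType.FLOAT64, Double.class); // Double is a 64-bit float
    CLOSEST.put(SourceType.NUMERIC, BigDecimal.class);
  }

  // Hypothetical helper: look up the closest Java type, failing loudly on gaps.
  static Class<?> closestJavaType(SourceType type) {
    Class<?> javaType = CLOSEST.get(type);
    if (javaType == null) {
      throw new IllegalArgumentException("Unsupported source type: " + type);
    }
    return javaType;
  }

  public static void main(String[] args) {
    for (SourceType type : SourceType.values()) {
      System.out.println(type + " -> " + closestJavaType(type).getSimpleName());
    }
  }
}
```

   The lookup throws on an unmapped type rather than guessing, which mirrors
the `IllegalArgumentException` handling in the diff above.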



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

