nielm commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r501066733
##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
.withTransaction(getTransaction());
return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
}
+
+ SerializableFunction<Struct, Row> getFormatFn() {
+ return (SerializableFunction<Struct, Row>)
+ input ->
+ Row.withSchema(Schema.builder().addInt64Field("Key").build())
+ .withFieldValue("Key", 3L)
+ .build();
+ }
+ }
+
+ public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+ Read read;
+ Schema schema;
+
+ public ReadRows(Read read, Schema schema) {
+ super("Read rows");
+ this.read = read;
+ this.schema = schema;
Review comment:
I don't see any good solution here...
When reading an entire table, it would be possible to read the table's
schema first and determine the column types, but this does not work for a
query, since the query's output columns may not correspond to table columns.
Adding `LIMIT 1` would only work for simple queries; anything with joins,
`GROUP BY`, or `ORDER BY` would require the majority of the query to be
executed before a single row is returned.
So the only solution I can see is for the caller to specify the row Schema,
as you do here.
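For illustration, a minimal sketch of the call site this implies, assuming the `ReadRows` transform from this diff; the instance, database, query, and schema columns are hypothetical:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.spanner.SpannerIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.schemas.Schema;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.Row;

public class SpannerReadRowsExample {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // The caller declares the output schema up front, since it cannot be
    // inferred from an arbitrary query without executing it.
    // (Hypothetical columns, for illustration only.)
    Schema schema =
        Schema.builder().addInt64Field("Key").addStringField("Value").build();

    SpannerIO.Read read =
        SpannerIO.read()
            .withInstanceId("my-instance")   // hypothetical instance
            .withDatabaseId("my-database")   // hypothetical database
            .withQuery("SELECT Key, Value FROM MyTable");

    // ReadRows (from this PR) wraps the Read and attaches the
    // caller-provided schema to the resulting PCollection<Row>.
    PCollection<Row> rows = p.apply(new SpannerIO.ReadRows(read, schema));

    p.run().waitUntilFinish();
  }
}
```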