piotr-szuberski commented on a change in pull request #12611:
URL: https://github.com/apache/beam/pull/12611#discussion_r482073899



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/SpannerIO.java
##########
@@ -678,6 +703,42 @@ public Read withPartitionOptions(PartitionOptions partitionOptions) {
               .withTransaction(getTransaction());
      return input.apply(Create.of(getReadOperation())).apply("Execute query", readAll);
     }
+
+    SerializableFunction<Struct, Row> getFormatFn() {
+      return (SerializableFunction<Struct, Row>)
+          input ->
+              Row.withSchema(Schema.builder().addInt64Field("Key").build())
+                  .withFieldValue("Key", 3L)
+                  .build();
+    }
+  }
+
+  public static class ReadRows extends PTransform<PBegin, PCollection<Row>> {
+    Read read;
+    Schema schema;
+
+    public ReadRows(Read read, Schema schema) {
+      super("Read rows");
+      this.read = read;
+      this.schema = schema;

Review comment:
       I'd really like to do it in this PR, but the only approach that comes to mind is the one you suggested: perform the read request with the client first and derive the schema from the result. The obvious disadvantage is that the Spanner query will be executed twice. From what I researched, appending a `LIMIT 1` to the end of the query does not improve performance, so this is not a viable thing to do for huge result sets.
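The "execute once to learn the schema, then read" flow discussed above could be sketched roughly as follows. This is an illustrative stand-in only: `SchemaProbe`, `probeSchema`, and `TYPE_MAP` are hypothetical names, and the column metadata that the Spanner client would return is faked as simple (name, type) pairs rather than real `ResultSet` metadata, so the sketch stays self-contained.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of schema inference by probing the query result once.
// In the real implementation the metadata would come from the Spanner client's
// result, and the output would be an org.apache.beam.sdk.schemas.Schema.
public class SchemaProbe {

  // Maps a few Spanner column type names to Beam schema field type names
  // (illustrative subset, not the full type mapping).
  static final Map<String, String> TYPE_MAP =
      Map.of("INT64", "INT64", "STRING", "STRING", "BOOL", "BOOLEAN");

  // Builds a (column name -> field type) schema from the metadata that a
  // single probe execution of the query would yield.
  static Map<String, String> probeSchema(List<String[]> columnMetadata) {
    Map<String, String> schema = new LinkedHashMap<>();
    for (String[] col : columnMetadata) {
      // col[0] = column name, col[1] = Spanner type name.
      schema.put(col[0], TYPE_MAP.getOrDefault(col[1], "UNKNOWN"));
    }
    return schema;
  }

  public static void main(String[] args) {
    // Pretend these pairs came back from running the user's query once.
    Map<String, String> schema =
        probeSchema(
            List.of(new String[] {"Key", "INT64"}, new String[] {"Name", "STRING"}));
    System.out.println(schema); // prints {Key=INT64, Name=STRING}
  }
}
```

In the real code path the probe would execute the query through the Spanner client and read the column types from the result's struct type, which is exactly why the query ends up running twice; the sketch only shows the shape of the metadata-to-schema step.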




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
