[GitHub] [beam] ahmedabu98 commented on pull request #24290: Implemented SchemaTransforms for SingleStoreIO

GitBox Wed, 14 Dec 2022 08:03:21 -0800


ahmedabu98 commented on PR #24290:
URL: https://github.com/apache/beam/pull/24290#issuecomment-1351692071


   Thanks @AdalbertMemSQL, LGTM so far.
   
   I had another design question, is it necessary to create two new read nested 
classes (`ReadRows` and `ReadWithPartitionsRows`) that cater to Row objects? 
Much of the code seems to be boilerplate for `Read` and `ReadWithPartitions`, 
except only to add some parameters to specify Row object output. It works as it 
is now, but it's better to be concise and not duplicate code.
   
   It may help to reduce this by creating a new readRows() function that calls 
on SingleStoreIO.read() and adding the specifications needed to output Rows. 
Same with partitions, a new readWithPartitionsRows() function. See this 
[example](https://github.com/apache/beam/blob/ef5351ad50a817498ea9e34a7c514dd9d60fb143/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L575)
 in BigQueryIO. The same could be applied here, passing in the relevant 
rowMapper and coder. Although this will probably need a check  at the end of 
Read/ReadWithPartitions expand() to see if we are reading Rows so that it can 
set row schema on the output PCollection.
   
   WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [beam] ahmedabu98 commented on pull request #24290: Implemented SchemaTransforms for SingleStoreIO

Reply via email to