brucearctor commented on pull request #16578: URL: https://github.com/apache/beam/pull/16578#issuecomment-1050561219
@chamikaramj -- how's this code look? This is an improvement, that doesn't break any existing tests -- so, I think is OK to merge. Clustering is not created properly for STORAGE_API_AT_LEAST_ONCE or STORAGE_WRITE_API -- so those will be new tickets that I add. Again, since this is an improvement over existing functionality ( and does not break things ), it seems OK to merge. I might revisit to get those working later, but for now the tests are for 'DEFAULT', and 'FILE_LOADS'. I will write some follow up tickets for: a) document this functionality. at least in https://beam.apache.org/documentation/io/built-in/google-bigquery/ , but perhaps anywhere else suggested b) get clustering working with STORAGE_API_AT_LEAST_ONCE c) get clustering working with STORAGE_WRITE_API ...? Anything else top of mind ( what else would we want to see, but needn't be captured by this current improvement? Also, would be curious to learn the implementation differences between STORAGE_API_AT_LEAST_ONCE and STORAGE_WRITE_API ... I wonder if solved for one, if that fixes both. I see some related issues that I'd like to see addressed ahead of implementing any 'fix' or additional integration tests just for clustering, when using the storage write api --> ex: https://issues.apache.org/jira/browse/BEAM-13747 and https://issues.apache.org/jira/browse/BEAM-13753 / https://issues.apache.org/jira/browse/BEAM-13990 If you agree is fine, do squash/merge, or give me the LGTM. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
