reuvenlax commented on PR #31106: URL: https://github.com/apache/beam/pull/31106#issuecomment-2158929213
I'd also like to know whether something motivated this change, i.e. is there a Beam user who currently needs this? In addition to being careful about perf, this PR adds quite a bit of complexity to code that is already fairly complex.

On Mon, Jun 10, 2024 at 9:51 AM Robert Bradshaw wrote:

> In sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/StorageApiLoads.java
> <https://github.com/apache/beam/pull/31106#discussion_r1633552968>:
>
> @@ -52,16 +52,18 @@
>  /** This ***@***.*** PTransform} manages loads into BigQuery using the Storage API. */
>  public class StorageApiLoads<DestinationT, ElementT>
>      extends PTransform<PCollection<KV<DestinationT, ElementT>>, WriteResult> {
> -  final TupleTag<KV<DestinationT, StorageApiWritePayload>> successfulConvertedRowsTag =
> -      new TupleTag<>("successfulRows");
> +  final TupleTag<KV<DestinationT, KV<ElementT, StorageApiWritePayload>>>
>
> As a side comment, this is another motivation to use schema coders more
> ubiquitously--adding another field is update compatible.
>
> On another note, anything that involves shuffling more data in the main
> data path should be looked at carefully from a perf standpoint. We've gone
> to a lot of effort (e.g. with dynamic destinations) to ensure shuffling
> metadata doesn't become a perf impediment.
