Ah yes I'm +1 for that approach too - it would let us leverage all the schema-inference already in the Java SDK for translating configuration objects which would be great. Things on the Python side would be trickier as schemas don't formally support all the types you can use in the PayloadBuilder implementations [1] yet, just NamedTuple. For now we could just make the PayloadBuilder implementations generate Rows without making that translation available for use in PCollections.
Do we need to worry about update compatibility for ExternalConfigurationPayload? Brian [1] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/external.py On Fri, Jul 10, 2020 at 4:23 PM Robert Bradshaw <[email protected]> wrote: > I would be in favor of just using a schema to store the entire > configuration. The reason we went with what we have to day is that we > didn't have cross language schemas yet. > > On Fri, Jul 10, 2020 at 12:24 PM Brian Hulette <[email protected]> > wrote: > > > > Hi everyone, > > I noticed that currently the ExternalConfigurationPayload uses a list of > coder URNs to represent the coder that was used to serialize each > configuration field [1]. This seems acceptable at first blush, but there's > one notable issue: it has no place to store a payload for the coder. Most > standard coders don't use a payload so it's not a problem, but row coder > does use a payload to store it's schema, which means it can't be used in an > ExternalConfigurationPayload today. > > > > Is there a reason not to just use the Coder message [2] in > ExternalConfigurationPayload instead of a list of coder URNs? That would > work with row coder, and it would also make it easier to re-use logic for > translating Pipeline protos. > > > > I'd be happy to make this change, but I wanted to ask on dev@ in case > there's something I'm missing here. > > > > Brian > > > > [1] > https://github.com/apache/beam/blob/c54a0b7f49f2eb4a15df115205e2fa455116ccbe/model/pipeline/src/main/proto/external_transforms.proto#L34-L35 > > [2] > https://github.com/apache/beam/blob/c54a0b7f49f2eb4a15df115205e2fa455116ccbe/model/pipeline/src/main/proto/beam_runner_api.proto#L542-L555 >
