twang126 commented on code in PR #25935:
URL: https://github.com/apache/beam/pull/25935#discussion_r1145237180
##########
sdks/python/apache_beam/yaml/yaml_transform.py:
##########
@@ -377,6 +378,18 @@ def pipeline_as_composite(spec):
   return dict(spec, name=None, type='composite')
+def normalize_source_sink(spec):
Review Comment:
We handled both as lists and did what you suggested (flatten all sources,
write to each sink). I don't think it was too magical; users usually expected
that behavior anyway. We had sources and sinks as separate types, and looking
back it was a headache to maintain all of the different types, but that
divergence is probably inevitable. For example, parsing a user transform is
going to be quite different from reading in a specific Kafka settings config.
So even if sources aren't their own type from the beginning, they might end
up that way eventually anyway.
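To make the list handling concrete, here is a rough sketch in plain Python. It is not our actual code and not the `normalize_source_sink` in this PR; the helper names (`as_list`, `expand_sources_and_sinks`), the `source`/`sink`/`transforms`/`input` keys, and the assumption that transforms form a single linear chain are all made up for illustration.

```python
def as_list(value):
    """Treat a missing value as [], a single spec as [spec], a list as-is."""
    if value is None:
        return []
    return list(value) if isinstance(value, (list, tuple)) else [value]


def expand_sources_and_sinks(spec):
    """Rewrite top-level source/sink entries as ordinary transforms.

    Hypothetical schema: all sources are flattened into one input, the
    transforms run as a linear chain, and every sink reads the chain's end.
    """
    sources = [dict(s, name=s.get('name', f'source{i}'))
               for i, s in enumerate(as_list(spec.get('source')))]
    sinks = [dict(s, name=s.get('name', f'sink{i}'))
             for i, s in enumerate(as_list(spec.get('sink')))]
    transforms = [dict(t) for t in as_list(spec.get('transforms'))]

    result = list(sources)
    upstream = None
    if len(sources) == 1:
        upstream = sources[0]['name']
    elif len(sources) > 1:
        # Several sources: merge them with one (assumed) Flatten step.
        result.append({
            'type': 'Flatten',
            'name': 'flatten_sources',
            'input': [s['name'] for s in sources],
        })
        upstream = 'flatten_sources'

    for i, t in enumerate(transforms):
        t.setdefault('name', f'transform{i}')
        if upstream is not None:
            t.setdefault('input', upstream)
        result.append(t)
        upstream = t['name']

    # Every sink is attached to the last step of the chain.
    for s in sinks:
        if upstream is not None:
            s['input'] = upstream
        result.append(s)

    rest = {k: v for k, v in spec.items()
            if k not in ('source', 'sink', 'transforms')}
    return dict(rest, transforms=result)
```

Note that every sink can only ever see the end of the chain here, which is exactly the kind of implicit wiring that gets in the way later.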
Something to point out: our strategy got restrictive wrt branching (e.g.
applying distinct transform chains to the same source, or sinking
intermediate results), and this is where the "too much magic" came back to
bite us. How were you planning to support branching?
fwiw, my recommendation would be to keep the magic out of it to start. If
users want multiple sources/sinks for now, they can just add them directly
to the transform list.
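As an illustration of why the explicit form handles branching for free: when reads and writes are ordinary entries in the transform list, a branch is just two entries naming the same input. The sketch below assumes a hypothetical schema in which each transform names its `input`; the transform types and the `consumers_by_input` helper are placeholders, not anything from this PR.

```python
from collections import defaultdict

# Hypothetical spec: one read, two branches off it, and a sink on an
# intermediate result as well as on the branch outputs.
pipeline_spec = {
    'type': 'composite',
    'transforms': [
        {'type': 'Read', 'name': 'events'},
        {'type': 'Filter', 'name': 'errors', 'input': 'events'},
        {'type': 'Count', 'name': 'counts', 'input': 'events'},
        {'type': 'Write', 'name': 'raw_dump', 'input': 'events'},
        {'type': 'Write', 'name': 'error_log', 'input': 'errors'},
        {'type': 'Write', 'name': 'count_table', 'input': 'counts'},
    ],
}


def consumers_by_input(spec):
    """Map each transform name to the names of the transforms that read it."""
    consumers = defaultdict(list)
    for transform in spec['transforms']:
        inputs = transform.get('input', [])
        if isinstance(inputs, str):
            inputs = [inputs]
        for name in inputs:
            consumers[name].append(transform['name'])
    return dict(consumers)
```

Here `consumers_by_input(pipeline_spec)['events']` lists three readers, so the same source feeds two transform chains plus a raw dump without any special source/sink handling.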