TheNeuralBit commented on PR #23224: URL: https://github.com/apache/beam/pull/23224#issuecomment-1249394035
> Could you elaborate please? Is the syntax different or the whole working different?

Java's [Select](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/schemas/transforms/Select.html) transform just allows projecting fields - it allows users to select fields by name or ID, possibly with nested fields separated by '.'. In Python, the [Select](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Select) transform does allow this style:

```python
beam.Select("userId", "eventId")
```

But it also allows users to declare new fields with arbitrary expressions using lambdas (the style you've used in your examples):

```python
beam.Select(computedField=lambda row: row.userId + row.eventId)
```

The styles can be mixed and matched too:

```python
beam.Select("userId", computedField=lambda row: row.userId + row.eventId)
```

Perhaps we could have common documentation for both Java and Python that just discusses the field name selection style (and uses it in Python examples, rather than lambdas). Then we could add additional documentation for the lambda style at a later date.

The only gotcha is that I'm pretty sure the nested field syntax is not implemented in Python today. We could file an issue for that and link to it from the Programming Guide though.
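To make the difference between the two styles concrete, here is a plain-Python sketch of the semantics described above. It is a toy model only, not Beam's implementation: the `select` helper, the `Event` and `Location` types, and the sample values are all hypothetical names made up for illustration.

```python
from collections import namedtuple


def select(row, *names, **computed):
    """Toy model of the two selection styles; NOT how Beam implements Select."""
    # Positional arguments project existing fields by name.
    out = {name: getattr(row, name) for name in names}
    # Keyword arguments declare new fields computed from the whole row.
    for name, fn in computed.items():
        out[name] = fn(row)
    return out


# Hypothetical row types, used only for this illustration.
Location = namedtuple("Location", ["city"])
Event = namedtuple("Event", ["userId", "eventId", "location"])

event = Event(userId=7, eventId=35, location=Location(city="Oslo"))

# Field-name style: plain projection.
projected = select(event, "userId", "eventId")

# Mixed style: one projected field plus one computed field.
mixed = select(event, "userId", computedField=lambda row: row.userId + row.eventId)

# In this toy model, a lambda can reach into a nested field, standing in
# for the dotted nested-field syntax that the name style would need.
nested = select(event, city=lambda row: row.location.city)
```

In this model the name style can only copy existing top-level fields, while the lambda style can compute anything from the row, which is why the two would likely need separate documentation.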
