TheNeuralBit commented on PR #23224:
URL: https://github.com/apache/beam/pull/23224#issuecomment-1249394035

   > Could you elaborate please? Is the syntax different or the whole working different?
   
   Java's [Select](https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/schemas/transforms/Select.html) transform only projects fields: users select fields by name or ID, and can reach nested fields with '.'-separated paths.
   
   In Python, the [Select](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.core.html#apache_beam.transforms.core.Select) transform supports selecting fields by name too:
   
   ```python
   beam.Select("userId", "eventId")
   ```
   
   But it also allows users to declare new fields with arbitrary expressions 
using lambdas (the style you've used in your examples):
   ```python
   beam.Select(computedField=lambda row: row.userId + row.eventId)
   ```
   
   The styles can be mixed and matched too:
   ```python
   beam.Select("userId", computedField=lambda row: row.userId + row.eventId)
   ```
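To make the semantics of the styles concrete, here is a rough plain-Python model of what I'd expect them to produce. Note that `select`, `Event`, and the sample data are hypothetical stand-ins for illustration only. This is not Beam code and not how `beam.Select` is implemented; it just mimics the projection over an in-memory list.

```python
from collections import namedtuple


def select(rows, *names, **computed):
    """Hypothetical stand-in for beam.Select over an in-memory list of rows.

    Positional string arguments project existing fields by name; keyword
    arguments declare new fields computed from each input row.
    """
    out_fields = list(names) + list(computed)
    Out = namedtuple("Out", out_fields)
    result = []
    for row in rows:
        # Field-name style: copy the named attributes through unchanged.
        values = [getattr(row, name) for name in names]
        # Lambda style: evaluate each expression against the input row.
        values += [fn(row) for fn in computed.values()]
        result.append(Out(*values))
    return result


# Assumed example schema, not taken from the PR.
Event = namedtuple("Event", ["userId", "eventId", "payload"])
events = [Event(1, 10, "a"), Event(2, 20, "b")]

# Field-name style only.
projected = select(events, "userId", "eventId")

# Mixed style: one projected field plus one computed field.
mixed = select(events, "userId", computedField=lambda row: row.userId + row.eventId)
```

The point for the docs is that the field-name style maps one-to-one onto what Java's Select does, while the keyword/lambda style is the Python-only extension.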
   
   Perhaps we could have common documentation for both Java and Python that 
just discusses the field name selection style (and uses it in Python examples, 
rather than lambdas). Then we could add additional documentation for the lambda 
style at a later date.
   
   The only gotcha is that I'm pretty sure the nested field syntax is not 
implemented in Python today. We could file an issue for that and link to it 
from the Programming Guide though.


