[ 
https://issues.apache.org/jira/browse/BEAM-12955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433888#comment-17433888
 ] 

Beam JIRA Bot commented on BEAM-12955:
--------------------------------------

This issue is assigned but has not received an update in 30 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

> Add support for inferring Beam Schemas from Python protobuf types
> -----------------------------------------------------------------
>
>                 Key: BEAM-12955
>                 URL: https://issues.apache.org/jira/browse/BEAM-12955
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>            Reporter: Brian Hulette
>            Assignee: Svetak Vihaan Sundhar
>            Priority: P2
>              Labels: stale-assigned
>
> Just as we can infer a Beam Schema from a NamedTuple type 
> ([code|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/typehints/schemas.py]),
>  we should have support for inferring a schema from a [protobuf-generated 
> Python 
> type|https://developers.google.com/protocol-buffers/docs/pythontutorial].
> This should integrate well with the rest of the schema infrastructure. For 
> example, it should be possible to use schema-aware transforms like 
> [SqlTransform|https://beam.apache.org/releases/pydoc/2.32.0/apache_beam.transforms.sql.html#apache_beam.transforms.sql.SqlTransform],
>  
> [Select|https://beam.apache.org/releases/pydoc/2.32.0/apache_beam.transforms.core.html#apache_beam.transforms.core.Select],
>  or 
> [beam.dataframe.convert.to_dataframe|https://beam.apache.org/releases/pydoc/2.32.0/apache_beam.dataframe.convert.html#apache_beam.dataframe.convert.to_dataframe]
>  on a PCollection that is annotated with a protobuf type. For example (using 
> the addressbook_pb2 example from the 
> [tutorial|https://developers.google.com/protocol-buffers/docs/pythontutorial#reading-a-message]):
> {code:python}
> import addressbook_pb2
> import apache_beam as beam
> from apache_beam.dataframe.convert import to_dataframe
> from apache_beam.transforms.sql import SqlTransform
>
> # input_pc and create_person are assumed to be defined elsewhere.
> pc = (input_pc
>       | beam.Map(create_person).with_output_types(addressbook_pb2.Person))
> df = to_dataframe(pc)  # deferred dataframe with fields id, name, email, ...
> # OR
> pc | SqlTransform(
>     "SELECT name FROM PCOLLECTION WHERE email = '[email protected]'")
> {code}
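
As a rough illustration of the inference the issue asks for, the sketch below maps a hand-written list of protobuf field descriptions to a NamedTuple type, which is the kind of type Beam can already infer a schema from. This is not Beam's or protobuf's API: `PROTO_TO_PYTHON`, `named_tuple_from_fields`, and `PERSON_FIELDS` are hypothetical names, and a real implementation would presumably read the fields from the generated message's descriptor rather than from a literal list.

```python
import typing

# Hypothetical mapping from protobuf scalar type names to Python types.
# Assumption: a real implementation would derive this from the descriptor.
PROTO_TO_PYTHON = {
    "int32": int,
    "int64": int,
    "string": str,
    "bool": bool,
    "double": float,
    "float": float,
    "bytes": bytes,
}


def named_tuple_from_fields(name, fields):
    """Build a NamedTuple type from (field_name, proto_type, repeated) triples."""
    annotations = []
    for field_name, proto_type, repeated in fields:
        py_type = PROTO_TO_PYTHON[proto_type]
        if repeated:
            # Repeated protobuf fields become sequences of the element type.
            py_type = typing.Sequence[py_type]
        annotations.append((field_name, py_type))
    return typing.NamedTuple(name, annotations)


# Scalar fields of the tutorial's Person message, written out by hand for
# illustration; the nested repeated PhoneNumber field is omitted.
PERSON_FIELDS = [
    ("name", "string", False),
    ("id", "int32", False),
    ("email", "string", False),
]

Person = named_tuple_from_fields("Person", PERSON_FIELDS)
```

Nested messages, enums, and optional-field presence are left out here; they are the parts where a real design decision would be needed.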



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
