[ 
https://issues.apache.org/jira/browse/BEAM-6753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812109#comment-16812109
 ] 

Reuven Lax commented on BEAM-6753:
----------------------------------

Right now I'm only using this proto in the Dataflow runner. Before using
this proto more widely in portability, I think we need to have a broader
conversation on the dev list (or on this JIRA) about what the actual proto
should look like.

I've talked to a few people who are interested in plumbing schemas through the
Python API. The basic idea is to use the Python typehints framework to
detect and enforce schema matching. Once we have that, the basic schema
transforms (filter, group, join, select, etc.) need to be implemented in
Python. Given the popularity of pandas, a dataframe-compatible API would
likely be a good idea in Python. It doesn't have to be the primary API: we
could also implement the separate PTransforms, at which point it would be
trivial to create a dataframe wrapper around them.
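
To make the typehints idea concrete, here is a minimal sketch in plain Python. It is not Beam code and does not reflect any agreed design: the Purchase record type and the infer_schema, check_row, and select helpers are all hypothetical names invented for illustration. It only shows how a schema could be derived from type hints, checked against a row, and used by a basic select transform.

```python
from typing import Any, Dict, List, NamedTuple, Tuple, get_type_hints

# A schema here is just an ordered list of (field name, field type) pairs.
Schema = List[Tuple[str, type]]


class Purchase(NamedTuple):
    # Hypothetical record type, used only for illustration.
    user: str
    item: str
    price: float


def infer_schema(record_type: type) -> Schema:
    """Derive an ordered schema from the type hints on a record class."""
    return list(get_type_hints(record_type).items())


def check_row(row: Any, schema: Schema) -> None:
    """Raise TypeError if any field of the row does not match the schema."""
    for name, field_type in schema:
        value = getattr(row, name)
        if not isinstance(value, field_type):
            raise TypeError(
                f"field {name!r}: expected {field_type.__name__}, "
                f"got {type(value).__name__}"
            )


def select(
    rows: List[Any], schema: Schema, *fields: str
) -> Tuple[List[Dict[str, Any]], Schema]:
    """Project rows onto the named fields; return the new rows and schema."""
    by_name = dict(schema)
    missing = [f for f in fields if f not in by_name]
    if missing:
        raise KeyError(f"unknown fields: {missing}")
    out_schema = [(f, by_name[f]) for f in fields]
    out_rows = [{f: getattr(row, f) for f in fields} for row in rows]
    return out_rows, out_schema
```

For example, select([Purchase("a", "book", 12.5)], infer_schema(Purchase), "user", "price") returns the projected rows together with the narrowed schema. A dataframe-style wrapper could then be a thin layer that forwards to helpers like these.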




> Create proto representation for schemas
> ---------------------------------------
>
>                 Key: BEAM-6753
>                 URL: https://issues.apache.org/jira/browse/BEAM-6753
>             Project: Beam
>          Issue Type: Sub-task
>          Components: beam-model
>            Reporter: Reuven Lax
>            Assignee: Reuven Lax
>            Priority: Major
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
