[
https://issues.apache.org/jira/browse/BEAM-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275480#comment-16275480
]
ASF GitHub Bot commented on BEAM-3157:
--------------------------------------
akedin commented on issue #4204: [BEAM-3157] Generate BeamRecord types from Pojos
URL: https://github.com/apache/beam/pull/4204#issuecomment-348677595
R: @kennknowles @reuvenlax @iemejia
Please take a look
> BeamSql transform should support other PCollection types
> --------------------------------------------------------
>
> Key: BEAM-3157
> URL: https://issues.apache.org/jira/browse/BEAM-3157
> Project: Beam
> Issue Type: Improvement
> Components: dsl-sql
> Reporter: Ismaël Mejía
> Assignee: Anton Kedin
>
> Currently the Beam SQL transform only supports input and output data
> represented as a BeamRecord. This seems to me like a usability limitation
> (even if we can use a ParDo to prepare objects before and after the transform).
> I suppose this constraint comes from the fact that we need to map the
> name/type/value of each object field into Calcite, so it is convenient to have
> a specific data type (BeamRecord) for this. However, we can accomplish the
> same with a PCollection of JavaBeans (where we get the same information
> from the field names/types/values) or with Avro records, where we also have
> the Schema information. For the output PCollection we can map the object via
> a reference (e.g. a JavaBean to be filled with the names of an Avro object).
> Note: I am assuming simple mappings for now, since the SQL does not yet
> support composite types.
> A simple API idea would be something like this:
> A simple filter:
>   PCollection<MyPojo> col = BeamSql.query("SELECT * FROM .... WHERE ...").from(MyPojo.class);
> A projection:
>   PCollection<MyNewPojo> newCol = BeamSql.query("SELECT id, name").from(MyPojo.class).as(MyNewPojo.class);
> A first approach could be to just add the extra ParDos + transform DoFns;
> however, I suppose that for memory-use reasons mapping directly into
> Calcite may make more sense.
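
To make the JavaBean idea in the description concrete, here is a minimal sketch of how the field names/types/values could be read from a bean using only the JDK's java.beans introspection. It is not Beam code and not what the pull request implements: the class PojoIntrospectionSketch, the nested MyPojo bean and the helpers fieldTypes/fieldValues are hypothetical names chosen for illustration, and turning the resulting maps into a BeamRecord type is left out.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; none of these names exist in Beam. */
public class PojoIntrospectionSketch {

  /** Hypothetical example bean with two readable properties. */
  public static class MyPojo {
    private final int id;
    private final String name;

    public MyPojo(int id, String name) {
      this.id = id;
      this.name = name;
    }

    public int getId() {
      return id;
    }

    public String getName() {
      return name;
    }
  }

  /** Returns property name -> declared type for every readable bean property. */
  public static Map<String, Class<?>> fieldTypes(Class<?> beanClass)
      throws IntrospectionException {
    // Stopping at Object.class excludes the "class" property from getClass().
    BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
    Map<String, Class<?>> types = new LinkedHashMap<>();
    for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
      if (pd.getReadMethod() != null) {
        types.put(pd.getName(), pd.getPropertyType());
      }
    }
    return types;
  }

  /** Returns property name -> current value by invoking each getter on the bean. */
  public static Map<String, Object> fieldValues(Object bean) throws Exception {
    BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
    Map<String, Object> values = new LinkedHashMap<>();
    for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
      Method getter = pd.getReadMethod();
      if (getter != null) {
        values.put(pd.getName(), getter.invoke(bean));
      }
    }
    return values;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(fieldTypes(MyPojo.class));
    System.out.println(fieldValues(new MyPojo(1, "beam")));
  }
}
```

The two maps carry the same name/type/value information that the description says a BeamRecord provides, which is why a generic conversion step before and after the SQL transform looks feasible for flat beans.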