[jira] [Work logged] (BEAM-3157) BeamSql transform should support other PCollection types

ASF GitHub Bot (JIRA) Tue, 24 Apr 2018 13:43:31 -0700

     [ https://issues.apache.org/jira/browse/BEAM-3157?focusedWorklogId=94789&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-94789 ]


ASF GitHub Bot logged work on BEAM-3157:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/Apr/18 20:42
            Start Date: 24/Apr/18 20:42
    Worklog Time Spent: 10m 
      Work Description: akedin commented on a change in pull request #5215: [BEAM-3157][SQL] Add primitive java types support to Row generation logic, add example
URL: https://github.com/apache/beam/pull/5215#discussion_r183870562
 
 

 ##########
 File path: sdks/java/extensions/sql/build.gradle
 ##########
 @@ -141,3 +141,19 @@ idea {
     generatedSourceDirs += file(generatedJavaccSourceDir)
   }
 }
+
+// Run basic SQL example
+task runBasicExample(type: JavaExec) {
 
 Review comment:
   There's only a direct-runner profile in `sql/pom.xml`, which adds a 
dependency on the direct runner. There are no dependencies or profiles for 
other runners, so I don't think other runners work.
   
   I have updated the comments to mention that these examples currently only 
work on the direct runner.
   
   Do we want to support/document other runners in SQL examples at all? It 
feels out of scope at the moment, as users are forced to use the Java SDK and 
familiarize themselves with other aspects of Beam anyway. 
   
   Going forward, it does make sense to create something similar to WordCount 
which would work on other runners 
([BEAM-4168](https://issues.apache.org/jira/browse/BEAM-4168)).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 94789)
    Time Spent: 6h  (was: 5h 50m)

> BeamSql transform should support other PCollection types
> --------------------------------------------------------
>
>                 Key: BEAM-3157
>                 URL: https://issues.apache.org/jira/browse/BEAM-3157
>             Project: Beam
>          Issue Type: Improvement
>          Components: dsl-sql
>            Reporter: Ismaël Mejía
>            Assignee: Anton Kedin
>            Priority: Major
>             Fix For: Not applicable
>
>          Time Spent: 6h
>  Remaining Estimate: 0h
>
> Currently the Beam SQL transform only supports input and output data 
> represented as a BeamRecord. This seems to me like a usability limitation 
> (even if we can do a ParDo to prepare objects before and after the transform).
> I suppose this constraint comes from the fact that we need to map 
> name/type/value from an object field into Calcite so it is convenient to have 
> a specific data type (BeamRecord) for this. However we can accomplish the 
> same by using a PCollection of JavaBean (where we know the same information 
> via the field names/types/values) or by using Avro records where we also have 
> the Schema information. For the output PCollection we can map the object via 
> a Reference (e.g. a JavaBean to be filled with the names of an Avro object).
> Note: I am assuming simple mappings for now, since the SQL does not yet 
> support composite types.
> A simple API idea would be something like this:
> A simple filter:
> PCollection<MyPojo> col = BeamSql.query("SELECT * FROM .... WHERE ...").from(MyPojo.class);
> A projection:
> PCollection<MyNewPojo> newCol = BeamSql.query("SELECT id, name").from(MyPojo.class).as(MyNewPojo.class);
> A first approach could be to just add the extra ParDos + transform DoFns; 
> however, I suppose that for memory-use reasons mapping directly into 
> Calcite would make more sense.
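
The description's point that a JavaBean already carries field names, types, and values can be illustrated without any Beam or Calcite dependency. The following is a minimal sketch under stated assumptions, not the Beam implementation: it relies only on the JDK's java.beans.Introspector, and the class BeanFieldSketch, the fields of the MyPojo bean, and the helpers fieldTypes and fieldValues are hypothetical names introduced here for illustration.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

public class BeanFieldSketch {

  // Hypothetical bean; the issue only names MyPojo, the fields are assumed.
  public static class MyPojo {
    private final int id;
    private final String name;

    public MyPojo(int id, String name) {
      this.id = id;
      this.name = name;
    }

    public int getId() {
      return id;
    }

    public String getName() {
      return name;
    }
  }

  /** Returns property name -> declared type for every readable bean property. */
  public static Map<String, Class<?>> fieldTypes(Class<?> beanClass)
      throws IntrospectionException {
    Map<String, Class<?>> types = new TreeMap<>();
    // Stopping at Object.class excludes the built-in "class" property.
    BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
    for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
      if (pd.getReadMethod() != null) {
        types.put(pd.getName(), pd.getPropertyType());
      }
    }
    return types;
  }

  /** Returns property name -> current value by invoking each getter. */
  public static Map<String, Object> fieldValues(Object bean)
      throws IntrospectionException, ReflectiveOperationException {
    Map<String, Object> values = new TreeMap<>();
    BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
    for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
      if (pd.getReadMethod() != null) {
        values.put(pd.getName(), pd.getReadMethod().invoke(bean));
      }
    }
    return values;
  }

  public static void main(String[] args) throws Exception {
    MyPojo pojo = new MyPojo(1, "beam");
    System.out.println(fieldTypes(MyPojo.class));
    System.out.println(fieldValues(pojo));
  }
}
```

The name/type map is the kind of information a SQL layer would need to derive a row schema, and the name/value map is what would populate a single row. How Beam SQL actually performs this mapping is not specified in this message.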



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
