[ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941122#comment-16941122 ]
Kirill Kozlov commented on BEAM-7609:
-------------------------------------
Also, plans generated by the sqlline shell:
{code:java}
BeamEnumerableConverter
  BeamAggregationRel(group=[{0, 1}])
    BeamCoGBKJoinRel(condition=[=($0, $1)], joinType=[inner])
      BeamIOSourceRel(table=[[beam, s1]])
      BeamIOSourceRel(table=[[beam, s2]])
{code}
and by the pipeline:
{code:java}
BeamCoGBKJoinRel(condition=[=($0, $1)], joinType=[inner])
  BeamAggregationRel(group=[{0}])
    BeamIOSourceRel(table=[[beam, i1]])
  BeamAggregationRel(group=[{0}])
    BeamIOSourceRel(table=[[beam, i2]])
{code}
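do not match: the shell plan applies a single BeamAggregationRel on top of the join, while the pipeline plan aggregates each input before the join.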
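To make the difference easy to check mechanically, here is a minimal sketch that compares where the aggregation sits relative to the join in each plan. It is plain Java and does not call Beam or Calcite. The class name PlanDepths and its methods are hypothetical, and it assumes the plan text is indented two spaces per tree level, with the plan strings copied from the two listings above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Beam: inspects the text of an explained plan.
class PlanDepths {
    // Plan text as listed in this comment (assumed indentation: 2 spaces per level).
    static final String SHELL_PLAN =
        "BeamEnumerableConverter\n"
        + "  BeamAggregationRel(group=[{0, 1}])\n"
        + "    BeamCoGBKJoinRel(condition=[=($0, $1)], joinType=[inner])\n"
        + "      BeamIOSourceRel(table=[[beam, s1]])\n"
        + "      BeamIOSourceRel(table=[[beam, s2]])\n";

    static final String PIPELINE_PLAN =
        "BeamCoGBKJoinRel(condition=[=($0, $1)], joinType=[inner])\n"
        + "  BeamAggregationRel(group=[{0}])\n"
        + "    BeamIOSourceRel(table=[[beam, i1]])\n"
        + "  BeamAggregationRel(group=[{0}])\n"
        + "    BeamIOSourceRel(table=[[beam, i2]])\n";

    /** Maps each rel node name to the shallowest depth at which it appears. */
    static Map<String, Integer> depths(String plan) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String line : plan.split("\n")) {
            if (line.trim().isEmpty()) {
                continue;
            }
            // Count leading spaces to recover the tree depth.
            int indent = 0;
            while (indent < line.length() && line.charAt(indent) == ' ') {
                indent++;
            }
            // The node name is everything before the first '(' (or the whole line).
            String trimmed = line.trim();
            int paren = trimmed.indexOf('(');
            String name = paren < 0 ? trimmed : trimmed.substring(0, paren);
            result.merge(name, indent / 2, Integer::min);
        }
        return result;
    }

    /** True when the aggregation is closer to the root than the join, i.e. it runs after the join. */
    static boolean aggregationAboveJoin(String plan) {
        Map<String, Integer> d = depths(plan);
        Integer agg = d.get("BeamAggregationRel");
        Integer join = d.get("BeamCoGBKJoinRel");
        return agg != null && join != null && agg < join;
    }

    public static void main(String[] args) {
        System.out.println("shell: " + aggregationAboveJoin(SHELL_PLAN));       // prints "shell: true"
        System.out.println("pipeline: " + aggregationAboveJoin(PIPELINE_PLAN)); // prints "pipeline: false"
    }
}
```

This only inspects the printed plan text, so it says nothing about why the planner chose a different shape in the pipeline case.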
> SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
> ---------------------------------------------------------------------------
>
> Key: BEAM-7609
> URL: https://issues.apache.org/jira/browse/BEAM-7609
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.13.0
> Reporter: Gleb Kanterov
> Assignee: Kirill Kozlov
> Priority: Major
>
> Works in sqlline shell:
> {code}
> Welcome to Beam SQL 2.14.0-SNAPSHOT (based on sqlline version 1.4.0)
> 0: BeamSQL> CREATE EXTERNAL TABLE s1 (id BIGINT) TYPE 'test';
> No rows affected (0.507 seconds)
> 0: BeamSQL> CREATE EXTERNAL TABLE s2 (id BIGINT) TYPE 'test';
> No rows affected (0.004 seconds)
> 0: BeamSQL> SELECT DISTINCT s1.id as lhs, s2.id as rhs FROM s1 JOIN s2 USING (id);
> +---------------------+---------------------+
> | lhs | rhs |
> +---------------------+---------------------+
> +---------------------+---------------------+
> No rows selected (2.568 seconds)
> {code}
> But doesn't work in the test:
> {code}
> Schema inputSchema = Schema.of(
>     Schema.Field.of("id", Schema.FieldType.INT32));
> PCollection<Row> i1 = p.apply(Create.of(ImmutableList.<Row>of())
>     .withCoder(SchemaCoder.of(inputSchema)));
> PCollection<Row> i2 = p.apply(Create.of(ImmutableList.<Row>of())
>     .withCoder(SchemaCoder.of(inputSchema)));
> Schema outputSchema = PCollectionTuple
>     .of("i1", i1)
>     .and("i2", i2)
>     .apply(SqlTransform.query(
>         "SELECT DISTINCT i1.id as lhs, i2.id as rhs FROM i1 JOIN i2 USING (id)"))
>     .getSchema();
> assertEquals(ImmutableList.of("lhs", "rhs"),
>     outputSchema.getFieldNames());
> {code}