[jira] [Commented] (BEAM-7609) SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
    [ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17548000#comment-17548000 ]

Danny McCormick commented on BEAM-7609:
---------------------------------------

This issue has been migrated to https://github.com/apache/beam/issues/19656

> SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-7609
>                 URL: https://issues.apache.org/jira/browse/BEAM-7609
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.13.0
>            Reporter: Gleb Kanterov
>            Priority: P3
>
> Works in sqlline shell:
> {code}
> Welcome to Beam SQL 2.14.0-SNAPSHOT (based on sqlline version 1.4.0)
> 0: BeamSQL> CREATE EXTERNAL TABLE s1 (id BIGINT) TYPE 'test';
> No rows affected (0.507 seconds)
> 0: BeamSQL> CREATE EXTERNAL TABLE s2 (id BIGINT) TYPE 'test';
> No rows affected (0.004 seconds)
> 0: BeamSQL> SELECT DISTINCT s1.id as lhs, s2.id as rhs FROM s1 JOIN s2 USING (id);
> +-----+-----+
> | lhs | rhs |
> +-----+-----+
> +-----+-----+
> No rows selected (2.568 seconds)
> {code}
> But doesn't work in the test:
> {code}
> Schema inputSchema = Schema.of(
>     Schema.Field.of("id", Schema.FieldType.INT32));
> PCollection i1 = p.apply(Create.of(ImmutableList.of())
>     .withCoder(SchemaCoder.of(inputSchema)));
> PCollection i2 = p.apply(Create.of(ImmutableList.of())
>     .withCoder(SchemaCoder.of(inputSchema)));
> Schema outputSchema = PCollectionTuple
>     .of("i1", i1)
>     .and("i2", i2)
>     .apply(SqlTransform.query("SELECT DISTINCT s1.id as lhs, s2.id as rhs FROM i1 JOIN i2 USING (id)"))
>     .getSchema();
> assertEquals(ImmutableList.of("lhs", "rhs"), outputSchema.getFieldNames());
> {code}

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Commented] (BEAM-7609) SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
    [ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17137241#comment-17137241 ]

Beam JIRA Bot commented on BEAM-7609:
-------------------------------------

This issue was marked "stale-P2" and has not received a public comment in 14 days. It is now automatically moved to P3. If you are still affected by it, you can comment and move it back to P2.
[jira] [Commented] (BEAM-7609) SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
    [ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17122772#comment-17122772 ]

Beam JIRA Bot commented on BEAM-7609:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.

> SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-7609
>                 URL: https://issues.apache.org/jira/browse/BEAM-7609
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.13.0
>            Reporter: Gleb Kanterov
>            Priority: P2
>              Labels: stale-P2
[jira] [Commented] (BEAM-7609) SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
    [ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066946#comment-17066946 ]

Kirill Kozlov commented on BEAM-7609:
-------------------------------------

Not sure if this issue is still reproducible. Not actively working on this, will move to Unassigned.

> SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-7609
>                 URL: https://issues.apache.org/jira/browse/BEAM-7609
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.13.0
>            Reporter: Gleb Kanterov
>            Priority: Major
[jira] [Commented] (BEAM-7609) SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
    [ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066876#comment-17066876 ]

Kenneth Knowles commented on BEAM-7609:
---------------------------------------

Still an issue? Working on this?

> SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-7609
>                 URL: https://issues.apache.org/jira/browse/BEAM-7609
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.13.0
>            Reporter: Gleb Kanterov
>            Assignee: Kirill Kozlov
>            Priority: Major
[jira] [Commented] (BEAM-7609) SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
    [ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941122#comment-16941122 ]

Kirill Kozlov commented on BEAM-7609:
-------------------------------------

Also, plans generated by the sqlline shell:
{code:java}
BeamEnumerableConverter
  BeamAggregationRel(group=[{0, 1}])
    BeamCoGBKJoinRel(condition=[=($0, $1)], joinType=[inner])
      BeamIOSourceRel(table=[[beam, s1]])
      BeamIOSourceRel(table=[[beam, s2]])
{code}
and pipeline:
{code:java}
BeamCoGBKJoinRel(condition=[=($0, $1)], joinType=[inner])
  BeamAggregationRel(group=[{0}])
    BeamIOSourceRel(table=[[beam, i1]])
  BeamAggregationRel(group=[{0}])
    BeamIOSourceRel(table=[[beam, i2]])
{code}
do not match.

> SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-7609
>                 URL: https://issues.apache.org/jira/browse/BEAM-7609
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.13.0
>            Reporter: Gleb Kanterov
>            Assignee: Kirill Kozlov
>            Priority: Major
[jira] [Commented] (BEAM-7609) SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
    [ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16939803#comment-16939803 ]

Kirill Kozlov commented on BEAM-7609:
-------------------------------------

It looks like AggregateProjectMergeRule ignores rowType field names.

Updating AggregateProjectMergeRule#apply (in Calcite) with:
{code:java}
//...
if (!newKeys.equals(newGroupSet.asList())) {
  //...
  relBuilder.projectNamed(relBuilder.fields(posList), project.getRowType().getFieldNames(), true); // <- update this
} else {
  relBuilder.projectNamed(relBuilder.fields(), project.getRowType().getFieldNames(), true); // <- add this
}
//...
{code}
seems to fix this particular issue, but it needs more testing to make sure it does not break other scenarios.

> SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-7609
>                 URL: https://issues.apache.org/jira/browse/BEAM-7609
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.13.0
>            Reporter: Gleb Kanterov
>            Assignee: Kirill Kozlov
>            Priority: Major
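
The observation that AggregateProjectMergeRule ignores the Project's field names can be illustrated with a toy model. This is only a sketch and not Calcite code: the class `AliasLossSketch`, its methods, and the join output names `id`/`id0` are assumptions made for illustration, since the thread does not state which names the schema actually ends up with. It models a grouping-only Aggregate whose output fields are named after its input fields, and shows the aliases disappearing once the Project is merged away unless a renaming step re-applies them.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not Calcite code: merging a Project into an Aggregate can drop aliases. */
public class AliasLossSketch {

    /** A grouping-only aggregate emits one field per group key, named after its input field. */
    static List<String> aggregateOutputNames(List<String> inputNames, List<Integer> groupKeys) {
        List<String> names = new ArrayList<>();
        for (int key : groupKeys) {
            names.add(inputNames.get(key));
        }
        return names;
    }

    /** Merge without renaming: group keys are remapped onto the Project's input columns. */
    static List<String> mergedNames(
            List<String> joinNames, List<Integer> projectedColumns, List<Integer> groupKeys) {
        List<Integer> remapped = new ArrayList<>();
        for (int key : groupKeys) {
            remapped.add(projectedColumns.get(key));
        }
        // The Project is gone, so names now come from the join output.
        return aggregateOutputNames(joinNames, remapped);
    }

    /** Merge followed by a renaming step that re-applies the Project's field names. */
    static List<String> mergedNamesWithRename(
            List<String> merged, List<String> projectNames, List<Integer> groupKeys) {
        List<String> renamed = new ArrayList<>();
        for (int i = 0; i < merged.size(); i++) {
            renamed.add(projectNames.get(groupKeys.get(i)));
        }
        return renamed;
    }

    public static void main(String[] args) {
        List<String> joinNames = List.of("id", "id0");     // hypothetical join output names
        List<Integer> projectedColumns = List.of(0, 1);    // Project reads join columns 0 and 1
        List<String> projectNames = List.of("lhs", "rhs"); // aliases from the query
        List<Integer> groupKeys = List.of(0, 1);           // DISTINCT groups on both columns

        // Aggregate on top of the Project: aliases survive.
        System.out.println(aggregateOutputNames(projectNames, groupKeys)); // [lhs, rhs]

        // Project merged into the Aggregate, no rename: aliases are lost.
        List<String> merged = mergedNames(joinNames, projectedColumns, groupKeys);
        System.out.println(merged); // [id, id0]

        // Merge plus a renaming step: aliases restored.
        System.out.println(mergedNamesWithRename(merged, projectNames, groupKeys)); // [lhs, rhs]
    }
}
```

The last step plays the role that the `relBuilder.projectNamed(...)` calls have in the snippet above: the merged plan is kept, and only the output names are put back.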
[jira] [Commented] (BEAM-7609) SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
    [ https://issues.apache.org/jira/browse/BEAM-7609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932958#comment-16932958 ]

Kirill Kozlov commented on BEAM-7609:
-------------------------------------

Disabling the rule "AggregateProjectMergeRule" seems to "fix" this problem. Further investigation is required to locate the source of the issue.

> SqlTransform#getSchema for "SELECT DISTINCT + JOIN" has invalid field names
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-7609
>                 URL: https://issues.apache.org/jira/browse/BEAM-7609
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>    Affects Versions: 2.13.0
>            Reporter: Gleb Kanterov
>            Assignee: Kirill Kozlov
>            Priority: Major
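
Disabling a single rule usually amounts to leaving it out of the list handed to the planner. The sketch below shows that filtering step with stand-in types only: `PlannerRule`, `rule`, `withoutRule`, and `names` are hypothetical helpers, not Beam or Calcite API, and the rule names other than AggregateProjectMergeRule are just sample entries. In a real setup the list would hold Calcite rule instances instead.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Stand-in sketch: drop one optimizer rule from a rule list by name. */
public class DisableRuleSketch {

    /** Hypothetical stand-in for a planner rule; it only carries a name. */
    interface PlannerRule {
        String name();
    }

    /** Creates a stand-in rule with the given name. */
    static PlannerRule rule(String name) {
        return () -> name;
    }

    /** Returns the rules whose name differs from disabledName, preserving order. */
    static List<PlannerRule> withoutRule(List<PlannerRule> rules, String disabledName) {
        return rules.stream()
            .filter(r -> !r.name().equals(disabledName))
            .collect(Collectors.toList());
    }

    /** Collects rule names, which makes the result easy to print or compare. */
    static List<String> names(List<PlannerRule> rules) {
        return rules.stream().map(PlannerRule::name).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<PlannerRule> rules = List.of(
            rule("AggregateProjectMergeRule"),
            rule("ProjectMergeRule"),
            rule("FilterJoinRule"));

        // Everything except the rule under suspicion stays enabled.
        System.out.println(names(withoutRule(rules, "AggregateProjectMergeRule")));
        // prints [ProjectMergeRule, FilterJoinRule]
    }
}
```

Filtering like this only confirms which rule is involved. It also gives up whatever plan improvement the rule provides, which is consistent with "fix" being in quotes in the comment above.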