[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=324785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324785 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 08/Oct/19 00:33
Start Date: 08/Oct/19 00:33
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r332293675

## File path: sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslAggregationTest.java ##
@@ -701,7 +700,6 @@ public void testSupportsAggregationWithoutProjection() throws Exception {
   }

   @Test
-  @Ignore("https://issues.apache.org/jira/browse/BEAM-8317")
   public void testSupportsAggregationWithFilterWithoutProjection() throws Exception {

Review comment:
I tried adding `pipeline.getOptions().as(BeamSqlPipelineOptions.class).setPlannerName("org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl");` to the test to see if it will work using `ZetaSqlPlanner`, but I get `Class not found exception`. I assume it is because ZetaSQL is not in the build file. After attempting to add a dependency there is the following error: `Circular dependency`, probably because ZetaSQL depends on BeamSQL?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 324785)
Time Spent: 4h 50m (was: 4h 40m)

> SQL aggregation with where clause fails to plan
> ---
>
> Key: BEAM-6995
> URL: https://issues.apache.org/jira/browse/BEAM-6995
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.11.0
> Reporter: David McIntosh
> Assignee: Kirill Kozlov
> Priority: Minor
> Time Spent: 4h 50m
> Remaining Estimate: 0h
>
> I'm finding that this code fails with a CannotPlanException listed below.
> {code:java}
> Schema schema = Schema.builder()
>     .addInt32Field("id")
>     .addInt32Field("val")
>     .build();
> Row row = Row.withSchema(schema).addValues(1, 2).build();
> PCollection<Row> inputData = p.apply("row input",
>     Create.of(row).withRowSchema(schema));
> inputData.apply("sql",
>     SqlTransform.query(
>         "SELECT id, SUM(val) "
>             + "FROM PCOLLECTION "
>             + "WHERE val > 0 "
>             + "GROUP BY id"));{code}
> If the WHERE clause is removed the code runs successfully.
> This may be similar to BEAM-5384 since I was able to work around this by
> adding an extra column to the input that isn't referenced in the SQL.
> {code:java}
> Schema schema = Schema.builder()
>     .addInt32Field("id")
>     .addInt32Field("val")
>     .addInt32Field("extra")
>     .build();{code}
>
> {code:java}
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node [rel#100:Subset#2.BEAM_LOGICAL] could not be implemented; planner state:
> Root: rel#100:Subset#2.BEAM_LOGICAL
> Original rel:
> LogicalAggregate(subset=[rel#100:Subset#2.BEAM_LOGICAL], group=[{0}], EXPR$1=[SUM($1)]): rowcount = 5.0, cumulative cost = {5.687500238418579 rows, 0.0 cpu, 0.0 io}, id = 98
>   LogicalFilter(subset=[rel#97:Subset#1.NONE], condition=[>($1, 0)]): rowcount = 50.0, cumulative cost = {50.0 rows, 100.0 cpu, 0.0 io}, id = 96
>     BeamIOSourceRel(subset=[rel#95:Subset#0.BEAM_LOGICAL], table=[[beam, PCOLLECTION]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 92
> Sets:
> Set#0, type: RecordType(INTEGER id, INTEGER val)
>   rel#95:Subset#0.BEAM_LOGICAL, best=rel#92, importance=0.7291
>     rel#92:BeamIOSourceRel.BEAM_LOGICAL(table=[beam, PCOLLECTION]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
>   rel#110:Subset#0.ENUMERABLE, best=rel#109, importance=0.36455
>     rel#109:BeamEnumerableConverter.ENUMERABLE(input=rel#95:Subset#0.BEAM_LOGICAL), rowcount=100.0, cumulative cost={1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
> Set#1, type: RecordType(INTEGER id, INTEGER val)
>   rel#97:Subset#1.NONE, best=null, importance=0.81
>     rel#96:LogicalFilter.NONE(input=rel#95:Subset#0.BEAM_LOGICAL,condition=>($1, 0)), rowcount=50.0, cumulative cost={inf}
>
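A `Circular dependency` error like the one mentioned in the review comment of the message above is what a build tool typically reports when two modules end up depending on each other, directly or transitively. The following is a minimal sketch of that situation, assuming (as the comment only guesses) that the ZetaSQL module already depends on the SQL extension module. The module names `sql` and `zetasql` and the `ModuleGraph` class are hypothetical illustrations, not part of Beam or of any build tool's API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical module dependency graph with a depth-first cycle check. */
class ModuleGraph {
  // Maps each module to the modules it depends on.
  private final Map<String, List<String>> deps = new HashMap<>();

  /** Records that module {@code from} depends on module {@code to}. */
  void addDependency(String from, String to) {
    deps.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    // Register the target too, so every module has an entry.
    deps.computeIfAbsent(to, k -> new ArrayList<>());
  }

  /** Returns true if some module transitively depends on itself. */
  boolean hasCycle() {
    Set<String> done = new HashSet<>();
    Set<String> onPath = new HashSet<>();
    for (String module : deps.keySet()) {
      if (visit(module, done, onPath)) {
        return true;
      }
    }
    return false;
  }

  private boolean visit(String module, Set<String> done, Set<String> onPath) {
    if (onPath.contains(module)) {
      return true; // Reached a module that is already on the current path.
    }
    if (done.contains(module)) {
      return false; // Already fully explored without finding a cycle.
    }
    onPath.add(module);
    for (String next : deps.get(module)) {
      if (visit(next, done, onPath)) {
        return true;
      }
    }
    onPath.remove(module);
    done.add(module);
    return false;
  }
}

class ModuleGraphDemo {
  public static void main(String[] args) {
    ModuleGraph graph = new ModuleGraph();
    // Assumed existing edge: the ZetaSQL module depends on the SQL extension.
    graph.addDependency("zetasql", "sql");
    System.out.println(graph.hasCycle()); // prints "false"
    // Edge added for the test: the SQL extension now depends on ZetaSQL.
    graph.addDependency("sql", "zetasql");
    System.out.println(graph.hasCycle()); // prints "true"
  }
}
```

Under that assumption, a test that needs the ZetaSQL planner would have to live in a module that can depend on both, which is consistent with the suggestion elsewhere in this thread to use a separate test file for the ZetaSQL dialect.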
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=324659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324659 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 07/Oct/19 22:15
Start Date: 07/Oct/19 22:15
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r332261344

## File path: sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslAggregationTest.java ##
@@ -701,7 +700,6 @@ public void testSupportsAggregationWithoutProjection() throws Exception {
   }

   @Test
-  @Ignore("https://issues.apache.org/jira/browse/BEAM-8317")
   public void testSupportsAggregationWithFilterWithoutProjection() throws Exception {

Review comment:
@apilloud I don't agree though. At least a duplicate test can be created but run for ZetaSQL only and then we can have a migration. It could be a new test file associated with ZetaSQL dialect.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 324659)
Time Spent: 4h 40m (was: 4.5h)

> SQL aggregation with where clause fails to plan
> ---
>
> Key: BEAM-6995
> URL: https://issues.apache.org/jira/browse/BEAM-6995
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.11.0
> Reporter: David McIntosh
> Assignee: Kirill Kozlov
> Priority: Minor
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> I'm finding that this code fails with a CannotPlanException listed below.
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=324658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324658 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 07/Oct/19 22:14
Start Date: 07/Oct/19 22:14
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r332261344

## File path: sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslAggregationTest.java ##
@@ -701,7 +700,6 @@ public void testSupportsAggregationWithoutProjection() throws Exception {
   }

   @Test
-  @Ignore("https://issues.apache.org/jira/browse/BEAM-8317")
   public void testSupportsAggregationWithFilterWithoutProjection() throws Exception {

Review comment:
@apilloud I don't agree though. At least a duplicate test can be created but run for ZetaSQL only and then we can have a migration.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 324658)
Time Spent: 4.5h (was: 4h 20m)

> SQL aggregation with where clause fails to plan
> ---
>
> Key: BEAM-6995
> URL: https://issues.apache.org/jira/browse/BEAM-6995
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.11.0
> Reporter: David McIntosh
> Assignee: Kirill Kozlov
> Priority: Minor
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> I'm finding that this code fails with a CannotPlanException listed below.
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=324632&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324632 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 07/Oct/19 21:40
Start Date: 07/Oct/19 21:40
Worklog Time Spent: 10m
Work Description: apilloud commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 324632)
Time Spent: 4h 20m (was: 4h 10m)

> SQL aggregation with where clause fails to plan
> ---
>
> Key: BEAM-6995
> URL: https://issues.apache.org/jira/browse/BEAM-6995
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.11.0
> Reporter: David McIntosh
> Assignee: Kirill Kozlov
> Priority: Minor
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> I'm finding that this code fails with a CannotPlanException listed below.
> {code:java}
> Schema schema = Schema.builder()
>     .addInt32Field("id")
>     .addInt32Field("val")
>     .build();
> Row row = Row.withSchema(schema).addValues(1, 2).build();
> PCollection<Row> inputData = p.apply("row input",
>     Create.of(row).withRowSchema(schema));
> inputData.apply("sql",
>     SqlTransform.query(
>         "SELECT id, SUM(val) "
>             + "FROM PCOLLECTION "
>             + "WHERE val > 0 "
>             + "GROUP BY id"));{code}
> If the WHERE clause is removed the code runs successfully.
> This may be similar to BEAM-5384 since I was able to work around this by
> adding an extra column to the input that isn't referenced in the SQL.
> {code:java}
> Schema schema = Schema.builder()
>     .addInt32Field("id")
>     .addInt32Field("val")
>     .addInt32Field("extra")
>     .build();{code}
>
> {code:java}
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node [rel#100:Subset#2.BEAM_LOGICAL] could not be implemented; planner state:
> Root: rel#100:Subset#2.BEAM_LOGICAL
> Original rel:
> LogicalAggregate(subset=[rel#100:Subset#2.BEAM_LOGICAL], group=[{0}], EXPR$1=[SUM($1)]): rowcount = 5.0, cumulative cost = {5.687500238418579 rows, 0.0 cpu, 0.0 io}, id = 98
>   LogicalFilter(subset=[rel#97:Subset#1.NONE], condition=[>($1, 0)]): rowcount = 50.0, cumulative cost = {50.0 rows, 100.0 cpu, 0.0 io}, id = 96
>     BeamIOSourceRel(subset=[rel#95:Subset#0.BEAM_LOGICAL], table=[[beam, PCOLLECTION]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 92
> Sets:
> Set#0, type: RecordType(INTEGER id, INTEGER val)
>   rel#95:Subset#0.BEAM_LOGICAL, best=rel#92, importance=0.7291
>     rel#92:BeamIOSourceRel.BEAM_LOGICAL(table=[beam, PCOLLECTION]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
>   rel#110:Subset#0.ENUMERABLE, best=rel#109, importance=0.36455
>     rel#109:BeamEnumerableConverter.ENUMERABLE(input=rel#95:Subset#0.BEAM_LOGICAL), rowcount=100.0, cumulative cost={1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
> Set#1, type: RecordType(INTEGER id, INTEGER val)
>   rel#97:Subset#1.NONE, best=null, importance=0.81
>     rel#96:LogicalFilter.NONE(input=rel#95:Subset#0.BEAM_LOGICAL,condition=>($1, 0)), rowcount=50.0, cumulative cost={inf}
>     rel#102:LogicalCalc.NONE(input=rel#95:Subset#0.BEAM_LOGICAL,expr#0..1={inputs},expr#2=0,expr#3=>($t1, $t2),id=$t0,val=$t1,$condition=$t3), rowcount=50.0, cumulative cost={inf}
>   rel#104:Subset#1.BEAM_LOGICAL, best=rel#103, importance=0.405
>     rel#103:BeamCalcRel.BEAM_LOGICAL(input=rel#95:Subset#0.BEAM_LOGICAL,expr#0..1={inputs},expr#2=0,expr#3=>($t1, $t2),id=$t0,val=$t1,$condition=$t3), rowcount=50.0, cumulative cost={150.0 rows, 801.0 cpu, 0.0 io}
>   rel#106:Subset#1.ENUMERABLE, best=rel#105, importance=0.405
>     rel#105:BeamEnumerableConverter.ENUMERABLE(input=rel#104:Subset#1.BEAM_LOGICAL), rowcount=50.0, cumulative cost={1.7976931348623157E308 rows, 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
> Set#2, type: RecordType(INTEGER id, INTEGER EXPR$1)
>   rel#99:Subset#2.NONE, best=null, importance=0.9
>
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=324631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324631 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 07/Oct/19 21:40
Start Date: 07/Oct/19 21:40
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#issuecomment-539216266

LGTM

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 324631)
Time Spent: 4h 10m (was: 4h)

> SQL aggregation with where clause fails to plan
> ---
>
> Key: BEAM-6995
> URL: https://issues.apache.org/jira/browse/BEAM-6995
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.11.0
> Reporter: David McIntosh
> Assignee: Kirill Kozlov
> Priority: Minor
> Time Spent: 4h 10m
> Remaining Estimate: 0h
>
> I'm finding that this code fails with a CannotPlanException listed below.
> {code:java}
> Schema schema = Schema.builder()
>     .addInt32Field("id")
>     .addInt32Field("val")
>     .build();
> Row row = Row.withSchema(schema).addValues(1, 2).build();
> PCollection<Row> inputData = p.apply("row input",
>     Create.of(row).withRowSchema(schema));
> inputData.apply("sql",
>     SqlTransform.query(
>         "SELECT id, SUM(val) "
>             + "FROM PCOLLECTION "
>             + "WHERE val > 0 "
>             + "GROUP BY id"));{code}
> If the WHERE clause is removed the code runs successfully.
> This may be similar to BEAM-5384 since I was able to work around this by
> adding an extra column to the input that isn't referenced in the SQL.
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323803&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323803 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 04/Oct/19 23:57
Start Date: 04/Oct/19 23:57
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r331720249

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java ##
@@ -84,6 +85,31 @@ public BeamAggregationRel(
     this.windowFieldIndex = windowFieldIndex;
   }

+  public BeamAggregationRel(
+      RelOptCluster cluster,
+      RelTraitSet traits,
+      RelNode child,
+      RelDataType rowType,
+      boolean indicator,
+      ImmutableBitSet groupSet,
+      List<ImmutableBitSet> groupSets,
+      List<AggregateCall> aggCalls,
+      @Nullable WindowFn<Row, IntervalWindow> windowFn,
+      int windowFieldIndex) {
+    this(
+        cluster,
+        traits,
+        child,
+        indicator,
+        groupSet,
+        groupSets,
+        aggCalls,
+        windowFn,
+        windowFieldIndex);
+
+    this.rowType = rowType;

Review comment:
I think `deriveRowType()` can be called when it's needed?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 323803)
Time Spent: 3.5h (was: 3h 20m)

> SQL aggregation with where clause fails to plan
> ---
>
> Key: BEAM-6995
> URL: https://issues.apache.org/jira/browse/BEAM-6995
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.11.0
> Reporter: David McIntosh
> Assignee: Kirill Kozlov
> Priority: Minor
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> I'm finding that this code fails with a CannotPlanException listed below.
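The review comment in the message above suggests that `deriveRowType()` can be called when it is needed, instead of passing a `rowType` into the constructor. Calcite's `AbstractRelNode.getRowType()` follows that lazy pattern: it derives the row type on first access and caches the result. The following is a minimal, self-contained sketch of the pattern only. `SketchRelNode`, `SketchAggregate` and the `deriveCalls` counter are hypothetical stand-ins, not the real Calcite or Beam classes, and the row type is simplified to a list of strings.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a relational node; not the real Calcite class. */
abstract class SketchRelNode {
  // Cached row type; stays null until it is first requested.
  private List<String> rowType;
  // Counts derivations, only to make the caching visible in this sketch.
  int deriveCalls = 0;

  /** Subclasses compute their row type from their own fields or inputs. */
  protected abstract List<String> deriveRowType();

  /** Derives the row type on first access and returns the cached value afterwards. */
  public final List<String> getRowType() {
    if (rowType == null) {
      deriveCalls++;
      rowType = deriveRowType();
    }
    return rowType;
  }
}

/** Hypothetical aggregate node: its output is the group key plus one aggregate column. */
class SketchAggregate extends SketchRelNode {
  private final String groupField;
  private final String aggName;

  SketchAggregate(String groupField, String aggName) {
    // No rowType parameter and no eager deriveRowType() call in the constructor.
    this.groupField = groupField;
    this.aggName = aggName;
  }

  @Override
  protected List<String> deriveRowType() {
    return Arrays.asList("INTEGER " + groupField, "INTEGER " + aggName);
  }
}

class LazyRowTypeDemo {
  public static void main(String[] args) {
    SketchAggregate agg = new SketchAggregate("id", "EXPR$1");
    System.out.println(agg.getRowType()); // first call derives the type
    System.out.println(agg.getRowType()); // second call reuses the cached value
    System.out.println("derive calls: " + agg.deriveCalls); // prints "derive calls: 1"
  }
}
```

With this shape, an extra constructor that stores a caller-supplied `rowType` is not needed, which is the direction the discussion in this thread takes.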
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323727 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 04/Oct/19 20:48
Start Date: 04/Oct/19 20:48
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r331659215

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java ##
@@ -84,6 +85,31 @@ public BeamAggregationRel(
     this.windowFieldIndex = windowFieldIndex;
   }

+  public BeamAggregationRel(
+      RelOptCluster cluster,
+      RelTraitSet traits,
+      RelNode child,
+      RelDataType rowType,
+      boolean indicator,
+      ImmutableBitSet groupSet,
+      List<ImmutableBitSet> groupSets,
+      List<AggregateCall> aggCalls,
+      @Nullable WindowFn<Row, IntervalWindow> windowFn,
+      int windowFieldIndex) {
+    this(
+        cluster,
+        traits,
+        child,
+        indicator,
+        groupSet,
+        groupSets,
+        aggCalls,
+        windowFn,
+        windowFieldIndex);
+
+    this.rowType = rowType;

Review comment:
> RelNode type is already inferred from input nodes. Usually when you need to use it, you can use getRowType() function to get it rather than save it as a class member.

I see, in that case I will remove this constructor.
Do you think adding `deriveRowType();` to the original constructor makes sense or it would be redundant?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 323727)
Time Spent: 3h 20m (was: 3h 10m)

> SQL aggregation with where clause fails to plan
> ---
>
> Key: BEAM-6995
> URL: https://issues.apache.org/jira/browse/BEAM-6995
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.11.0
> Reporter: David McIntosh
> Assignee: Kirill Kozlov
> Priority: Minor
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> I'm finding that this code fails with a CannotPlanException listed below.
> {code:java}
> Schema schema = Schema.builder()
>     .addInt32Field("id")
>     .addInt32Field("val")
>     .build();
> Row row = Row.withSchema(schema).addValues(1, 2).build();
> PCollection<Row> inputData = p.apply("row input",
>     Create.of(row).withRowSchema(schema));
> inputData.apply("sql",
>     SqlTransform.query(
>         "SELECT id, SUM(val) "
>             + "FROM PCOLLECTION "
>             + "WHERE val > 0 "
>             + "GROUP BY id"));{code}
> If the WHERE clause is removed the code runs successfully.
> This may be similar to BEAM-5384 since I was able to work around this by
> adding an extra column to the input that isn't referenced in the SQL.
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323672=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323672 ] ASF GitHub Bot logged work on BEAM-6995: Author: ASF GitHub Bot Created on: 04/Oct/19 19:46 Start Date: 04/Oct/19 19:46 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed URL: https://github.com/apache/beam/pull/9703#discussion_r331659215 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java ## @@ -84,6 +85,31 @@ public BeamAggregationRel( this.windowFieldIndex = windowFieldIndex; } + public BeamAggregationRel( + RelOptCluster cluster, + RelTraitSet traits, + RelNode child, + RelDataType rowType, + boolean indicator, + ImmutableBitSet groupSet, + List groupSets, + List aggCalls, + @Nullable WindowFn windowFn, + int windowFieldIndex) { +this( +cluster, +traits, +child, +indicator, +groupSet, +groupSets, +aggCalls, +windowFn, +windowFieldIndex); + +this.rowType = rowType; Review comment: I see, in that case I will remove this constructor. Do you think adding `deriveRowType();` to the original constructor makes sense or it would be redundant? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 323672)
Time Spent: 3h 10m (was: 3h)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323647&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323647 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 04/Oct/19 18:24
Start Date: 04/Oct/19 18:24
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r331296410

## File path: sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslAggregationTest.java ##
@@ -701,7 +700,6 @@ public void testSupportsAggregationWithoutProjection() throws Exception {
   }
 
   @Test
-  @Ignore("https://issues.apache.org/jira/browse/BEAM-8317")
   public void testSupportsAggregationWithFilterWithoutProjection() throws Exception {

Review comment:
Found a useful reference link with examples: https://github.com/Pragmatists/JUnitParams/blob/master/src/test/java/junitparams/usage/SamplesOfUsageTest.java
And this one: https://github.com/junit-team/junit4/wiki/parameterized-tests

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 323647)
Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323642&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323642 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 04/Oct/19 18:15
Start Date: 04/Oct/19 18:15
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r331626110

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java ##
@@ -84,6 +85,31 @@ public BeamAggregationRel(
     this.windowFieldIndex = windowFieldIndex;
   }
 
+  public BeamAggregationRel(
+      RelOptCluster cluster,
+      RelTraitSet traits,
+      RelNode child,
+      RelDataType rowType,
+      boolean indicator,
+      ImmutableBitSet groupSet,
+      List groupSets,
+      List aggCalls,
+      @Nullable WindowFn windowFn,
+      int windowFieldIndex) {
+    this(
+        cluster,
+        traits,
+        child,
+        indicator,
+        groupSet,
+        groupSets,
+        aggCalls,
+        windowFn,
+        windowFieldIndex);
+
+    this.rowType = rowType;

Review comment:
There is a JIRA issue, https://jira.apache.org/jira/browse/BEAM-7609: when running queries with "SELECT DISTINCT + JOIN", the resulting fields are not assigned proper names. Even though this change does not solve that particular issue, preserving the rowType should not hurt (previously it would just be ignored and set to null).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 323642)
Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323640&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323640 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 04/Oct/19 18:12
Start Date: 04/Oct/19 18:12
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r331625043

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java ##
@@ -84,6 +85,31 @@ public BeamAggregationRel(
     this.windowFieldIndex = windowFieldIndex;
   }
 
+  public BeamAggregationRel(
+      RelOptCluster cluster,
+      RelTraitSet traits,
+      RelNode child,
+      RelDataType rowType,
+      boolean indicator,
+      ImmutableBitSet groupSet,
+      List groupSets,
+      List aggCalls,
+      @Nullable WindowFn windowFn,
+      int windowFieldIndex) {
+    this(
+        cluster,
+        traits,
+        child,
+        indicator,
+        groupSet,
+        groupSets,
+        aggCalls,
+        windowFn,
+        windowFieldIndex);
+
+    this.rowType = rowType;

Review comment:
The RelNode type is already inferred from the input nodes. Usually, when you need it, you can call the getRowType() function to get it rather than saving it as a class member.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 323640)
Time Spent: 2h 40m (was: 2.5h)
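The review comment in this message says the row type is inferred from the inputs and read through getRowType() rather than stored by the caller. A minimal sketch of that derive-on-first-use pattern follows. It is a simplified stand-in and not Calcite's actual classes: `SketchRelNode`, `SketchAggregate`, and the `deriveCalls` counter are hypothetical names, and a row type is modelled as a plain list of field names.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for a relational node; names are hypothetical.
abstract class SketchRelNode {
    private List<String> rowType; // cached after the first getRowType() call
    int deriveCalls = 0;          // only here to make the caching observable

    /** Subclasses compute their output fields from their inputs. */
    protected abstract List<String> deriveRowType();

    /** Derives the row type on first use and returns the cached value afterwards. */
    final List<String> getRowType() {
        if (rowType == null) {
            deriveCalls++;
            rowType = deriveRowType();
        }
        return rowType;
    }
}

// An aggregate that outputs one grouping field plus one aggregate column.
final class SketchAggregate extends SketchRelNode {
    private final List<String> inputFields;
    private final int groupIndex;

    SketchAggregate(List<String> inputFields, int groupIndex) {
        this.inputFields = inputFields;
        this.groupIndex = groupIndex;
    }

    @Override
    protected List<String> deriveRowType() {
        return Arrays.asList(inputFields.get(groupIndex), "EXPR$1");
    }
}
```

Under this pattern, calling a derive method eagerly in a constructor adds nothing, because the first getRowType() call performs the derivation anyway. Whether the same holds for BeamAggregationRel depends on the Calcite version in use, so treat this as an illustration only.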
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323631 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 04/Oct/19 17:59
Start Date: 04/Oct/19 17:59
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r331620109

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamAggregationRel.java ##
@@ -84,6 +85,31 @@ public BeamAggregationRel(
     this.windowFieldIndex = windowFieldIndex;
   }
 
+  public BeamAggregationRel(
+      RelOptCluster cluster,
+      RelTraitSet traits,
+      RelNode child,
+      RelDataType rowType,
+      boolean indicator,
+      ImmutableBitSet groupSet,
+      List groupSets,
+      List aggCalls,
+      @Nullable WindowFn windowFn,
+      int windowFieldIndex) {
+    this(
+        cluster,
+        traits,
+        child,
+        indicator,
+        groupSet,
+        groupSets,
+        aggCalls,
+        windowFn,
+        windowFieldIndex);
+
+    this.rowType = rowType;

Review comment:
Where is this rowType used?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 323631)
Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323071 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 03/Oct/19 23:43
Start Date: 03/Oct/19 23:43
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r331296410

## File path: sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslAggregationTest.java ##
@@ -701,7 +700,6 @@ public void testSupportsAggregationWithoutProjection() throws Exception {
   }
 
   @Test
-  @Ignore("https://issues.apache.org/jira/browse/BEAM-8317")
   public void testSupportsAggregationWithFilterWithoutProjection() throws Exception {

Review comment:
Found a useful reference link with examples: https://github.com/Pragmatists/JUnitParams/blob/master/src/test/java/junitparams/usage/SamplesOfUsageTest.java

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 323071)
Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323016&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323016 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 03/Oct/19 22:55
Start Date: 03/Oct/19 22:55
Worklog Time Spent: 10m
Work Description: amaliujia commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#discussion_r331286370

## File path: sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/BeamSqlDslAggregationTest.java ##
@@ -701,7 +700,6 @@ public void testSupportsAggregationWithoutProjection() throws Exception {
   }
 
   @Test
-  @Ignore("https://issues.apache.org/jira/browse/BEAM-8317")
   public void testSupportsAggregationWithFilterWithoutProjection() throws Exception {

Review comment:
@11moon11 @apilloud What I really want to propose is that, when we add new test cases with SQL queries, we run the test for both dialects unless there is a query syntax mismatch. Which planner is used is controlled by https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamSqlPipelineOptions.java#L28.

I am looking for a way to use `@RunWith(Parameterized.class)` so that it is easy to run tests for both dialects with an annotation.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 323016)
Time Spent: 2h 10m (was: 2h)
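The proposal in this message, running each SQL test once per planner, can be sketched without any test framework. The block below is a dependency-free illustration of the idea and not JUnit's `Parameterized` runner itself. `PlannerMatrix` and `runForEachPlanner` are hypothetical names. The ZetaSQL class name is the one quoted elsewhere in this thread; the Calcite class name is an assumption about the default planner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: runs one test body against every planner name.
final class PlannerMatrix {

    // The first entry is an assumed default; the second is quoted from this thread.
    static final List<String> PLANNERS = Collections.unmodifiableList(Arrays.asList(
        "org.apache.beam.sdk.extensions.sql.impl.CalciteQueryPlanner",
        "org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl"));

    /** Runs testBody once per planner and returns the planners for which it threw. */
    static List<String> runForEachPlanner(Consumer<String> testBody) {
        List<String> failures = new ArrayList<>();
        for (String planner : PLANNERS) {
            try {
                testBody.accept(planner);
            } catch (RuntimeException | AssertionError e) {
                failures.add(planner); // record the failure and keep going
            }
        }
        return failures;
    }
}
```

In a real Beam test the body would presumably set the planner with `pipeline.getOptions().as(BeamSqlPipelineOptions.class).setPlannerName(planner)` before building the query. With JUnit 4, the loop would be replaced by `@RunWith(Parameterized.class)` and a parameters method returning the same names.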
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=323007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-323007 ]

ASF GitHub Bot logged work on BEAM-6995:

Author: ASF GitHub Bot
Created on: 03/Oct/19 22:34
Start Date: 03/Oct/19 22:34
Worklog Time Spent: 10m
Work Description: amaliujia commented on issue #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
URL: https://github.com/apache/beam/pull/9703#issuecomment-538154919

cc: @amaliujia

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 323007)
Time Spent: 2h (was: 1h 50m)

> SQL aggregation with where clause fails to plan
> ---
>
> Key: BEAM-6995
> URL: https://issues.apache.org/jira/browse/BEAM-6995
> Project: Beam
> Issue Type: Bug
> Components: dsl-sql
> Affects Versions: 2.11.0
> Reporter: David McIntosh
> Assignee: Kirill Kozlov
> Priority: Minor
> Time Spent: 2h
> Remaining Estimate: 0h
>
> I'm finding that this code fails with a CannotPlanException listed below.
> {code:java}
> Schema schema = Schema.builder()
>     .addInt32Field("id")
>     .addInt32Field("val")
>     .build();
> Row row = Row.withSchema(schema).addValues(1, 2).build();
> PCollection inputData = p.apply("row input",
>     Create.of(row).withRowSchema(schema));
> inputData.apply("sql",
>     SqlTransform.query(
>         "SELECT id, SUM(val) "
>         + "FROM PCOLLECTION "
>         + "WHERE val > 0 "
>         + "GROUP BY id"));{code}
> If the WHERE clause is removed the code runs successfully.
> This may be similar to BEAM-5384 since I was able to work around this by
> adding an extra column to the input that isn't referenced in the SQL.
> {code:java}
> Schema schema = Schema.builder()
>     .addInt32Field("id")
>     .addInt32Field("val")
>     .addInt32Field("extra")
>     .build();{code}
>
> {code:java}
> org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.RelOptPlanner$CannotPlanException:
> Node [rel#100:Subset#2.BEAM_LOGICAL] could not be implemented; planner state:
> Root: rel#100:Subset#2.BEAM_LOGICAL
> Original rel:
> LogicalAggregate(subset=[rel#100:Subset#2.BEAM_LOGICAL], group=[{0}],
> EXPR$1=[SUM($1)]): rowcount = 5.0, cumulative cost = {5.687500238418579 rows,
> 0.0 cpu, 0.0 io}, id = 98
> LogicalFilter(subset=[rel#97:Subset#1.NONE], condition=[>($1, 0)]):
> rowcount = 50.0, cumulative cost = {50.0 rows, 100.0 cpu, 0.0 io}, id = 96
> BeamIOSourceRel(subset=[rel#95:Subset#0.BEAM_LOGICAL], table=[[beam,
> PCOLLECTION]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu,
> 0.0 io}, id = 92
> Sets:
> Set#0, type: RecordType(INTEGER id, INTEGER val)
> rel#95:Subset#0.BEAM_LOGICAL, best=rel#92,
> importance=0.7291
> rel#92:BeamIOSourceRel.BEAM_LOGICAL(table=[beam,
> PCOLLECTION]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io}
> rel#110:Subset#0.ENUMERABLE, best=rel#109,
> importance=0.36455
>
> rel#109:BeamEnumerableConverter.ENUMERABLE(input=rel#95:Subset#0.BEAM_LOGICAL),
> rowcount=100.0, cumulative cost={1.7976931348623157E308 rows,
> 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
> Set#1, type: RecordType(INTEGER id, INTEGER val)
> rel#97:Subset#1.NONE, best=null, importance=0.81
>
> rel#96:LogicalFilter.NONE(input=rel#95:Subset#0.BEAM_LOGICAL,condition=>($1,
> 0)), rowcount=50.0, cumulative cost={inf}
>
> rel#102:LogicalCalc.NONE(input=rel#95:Subset#0.BEAM_LOGICAL,expr#0..1={inputs},expr#2=0,expr#3=>($t1,
> $t2),id=$t0,val=$t1,$condition=$t3), rowcount=50.0, cumulative cost={inf}
> rel#104:Subset#1.BEAM_LOGICAL, best=rel#103, importance=0.405
>
> rel#103:BeamCalcRel.BEAM_LOGICAL(input=rel#95:Subset#0.BEAM_LOGICAL,expr#0..1={inputs},expr#2=0,expr#3=>($t1,
> $t2),id=$t0,val=$t1,$condition=$t3), rowcount=50.0, cumulative cost={150.0
> rows, 801.0 cpu, 0.0 io}
> rel#106:Subset#1.ENUMERABLE, best=rel#105, importance=0.405
>
> rel#105:BeamEnumerableConverter.ENUMERABLE(input=rel#104:Subset#1.BEAM_LOGICAL),
> rowcount=50.0, cumulative cost={1.7976931348623157E308 rows,
> 1.7976931348623157E308 cpu, 1.7976931348623157E308 io}
> Set#2, type: RecordType(INTEGER id, INTEGER EXPR$1)
> rel#99:Subset#2.NONE, best=null, importance=0.9
>
>
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=322987&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-322987 ]

ASF GitHub Bot logged work on BEAM-6995:

    Author: ASF GitHub Bot
    Created on: 03/Oct/19 22:15
    Start Date: 03/Oct/19 22:15
    Worklog Time Spent: 10m
    Work Description: apilloud commented on issue #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
    URL: https://github.com/apache/beam/pull/9703#issuecomment-538150076

can you do a `git pull origin && git rebase origin/master` on this?

Issue Time Tracking
---

    Worklog Id: (was: 322987)
    Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=32&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-32 ]

ASF GitHub Bot logged work on BEAM-6995:

    Author: ASF GitHub Bot
    Created on: 02/Oct/19 20:29
    Start Date: 02/Oct/19 20:29
    Worklog Time Spent: 10m
    Work Description: 11moon11 commented on issue #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
    URL: https://github.com/apache/beam/pull/9703#issuecomment-537666830

Run Direct Runner Nexmark Tests

Issue Time Tracking
---

    Worklog Id: (was: 32)
    Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321558 ]

ASF GitHub Bot logged work on BEAM-6995:

    Author: ASF GitHub Bot
    Created on: 01/Oct/19 21:39
    Start Date: 01/Oct/19 21:39
    Worklog Time Spent: 10m
    Work Description: apilloud commented on issue #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
    URL: https://github.com/apache/beam/pull/9703#issuecomment-537242868

Run Direct Runner Nexmark Tests

Issue Time Tracking
---

    Worklog Id: (was: 321558)
    Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321537&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321537 ]

ASF GitHub Bot logged work on BEAM-6995:

    Author: ASF GitHub Bot
    Created on: 01/Oct/19 21:20
    Start Date: 01/Oct/19 21:20
    Worklog Time Spent: 10m
    Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
    URL: https://github.com/apache/beam/pull/9703#discussion_r330281262

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamBasicAggregationRule.java ##
@@ -40,15 +50,21 @@ public BeamBasicAggregationRule(
       Class aggregateClass, RelBuilderFactory relBuilderFactory) {
-    super(operand(aggregateClass, operand(TableScan.class, any())), relBuilderFactory, null);
+    super(operand(aggregateClass, operand(AbstractRelNode.class, any())), relBuilderFactory, null);

Review comment:
You are correct, updated match condition to use RelNode instead

Issue Time Tracking
---

    Worklog Id: (was: 321537)
    Time Spent: 1h (was: 50m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321540 ]

ASF GitHub Bot logged work on BEAM-6995:

    Author: ASF GitHub Bot
    Created on: 01/Oct/19 21:20
    Start Date: 01/Oct/19 21:20
    Worklog Time Spent: 10m
    Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
    URL: https://github.com/apache/beam/pull/9703#discussion_r330281262

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamBasicAggregationRule.java ##
@@ -40,15 +50,21 @@ public BeamBasicAggregationRule(
       Class aggregateClass, RelBuilderFactory relBuilderFactory) {
-    super(operand(aggregateClass, operand(TableScan.class, any())), relBuilderFactory, null);
+    super(operand(aggregateClass, operand(AbstractRelNode.class, any())), relBuilderFactory, null);

Review comment:
You are correct, updated match condition to use RelNode instead.

Issue Time Tracking
---

    Worklog Id: (was: 321540)
    Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321539 ]

ASF GitHub Bot logged work on BEAM-6995:

    Author: ASF GitHub Bot
    Created on: 01/Oct/19 21:20
    Start Date: 01/Oct/19 21:20
    Worklog Time Spent: 10m
    Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
    URL: https://github.com/apache/beam/pull/9703#discussion_r330281513

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamBasicAggregationRule.java ##
@@ -40,15 +50,21 @@ public BeamBasicAggregationRule(
       Class aggregateClass, RelBuilderFactory relBuilderFactory) {
-    super(operand(aggregateClass, operand(TableScan.class, any())), relBuilderFactory, null);
+    super(operand(aggregateClass, operand(AbstractRelNode.class, any())), relBuilderFactory, null);
   }

   @Override
   public void onMatch(RelOptRuleCall call) {
     Aggregate aggregate = call.rel(0);
-    TableScan tableScan = call.rel(1);
+    AbstractRelNode relNode = call.rel(1);

-    RelNode newTableScan = tableScan.copy(tableScan.getTraitSet(), tableScan.getInputs());
+    if (relNode instanceof Project || relNode instanceof Calc || relNode instanceof Filter) {
+      if (isWindowed(relNode) || hasWindowedParents(relNode)) {
+        return;

Review comment:
Added a comment.

Issue Time Tracking
---

    Worklog Id: (was: 321539)
    Time Spent: 1h 10m (was: 1h)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321527 ]

ASF GitHub Bot logged work on BEAM-6995:

    Author: ASF GitHub Bot
    Created on: 01/Oct/19 21:02
    Start Date: 01/Oct/19 21:02
    Worklog Time Spent: 10m
    Work Description: apilloud commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed
    URL: https://github.com/apache/beam/pull/9703#discussion_r330273314

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamBasicAggregationRule.java ##
@@ -40,15 +50,21 @@ public BeamBasicAggregationRule(
       Class aggregateClass, RelBuilderFactory relBuilderFactory) {
-    super(operand(aggregateClass, operand(TableScan.class, any())), relBuilderFactory, null);
+    super(operand(aggregateClass, operand(AbstractRelNode.class, any())), relBuilderFactory, null);
   }

   @Override
   public void onMatch(RelOptRuleCall call) {
     Aggregate aggregate = call.rel(0);
-    TableScan tableScan = call.rel(1);
+    AbstractRelNode relNode = call.rel(1);

-    RelNode newTableScan = tableScan.copy(tableScan.getTraitSet(), tableScan.getInputs());
+    if (relNode instanceof Project || relNode instanceof Calc || relNode instanceof Filter) {
+      if (isWindowed(relNode) || hasWindowedParents(relNode)) {
+        return;

Review comment:
Probably worth adding a comment here that this case is expected to be handled by `BeamAggregationRule`.

Issue Time Tracking
---

    Worklog Id: (was: 321527)
    Time Spent: 40m (was: 0.5h)
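Editor's note: the match-condition change discussed in this review can be illustrated with a toy model. The sketch below is not Calcite or Beam code. `Node`, `oldRuleFires`, and `newRuleFires` are hypothetical names, and the `windowed` flag merely stands in for the result of `isWindowed(relNode) || hasWindowedParents(relNode)`, whose implementations are not shown in the diff. It only aims to show why an aggregate over a filter was skipped by the old operand pattern and how the new guard leaves windowed inputs to another rule.

```java
/** Toy model of the rule change under review; these are not Calcite or Beam classes. */
public class AggregationRuleSketch {

  /** Hypothetical stand-in for the plan node that feeds the aggregate. */
  public static class Node {
    final String kind;       // e.g. "TableScan", "Filter", "Calc", "Project"
    final boolean windowed;  // assumed result of isWindowed(..) || hasWindowedParents(..)

    public Node(String kind, boolean windowed) {
      this.kind = kind;
      this.windowed = windowed;
    }
  }

  /** Old pattern: the rule only matched an aggregate sitting directly on a table scan. */
  public static boolean oldRuleFires(Node aggregateInput) {
    return aggregateInput.kind.equals("TableScan");
  }

  /** New pattern: match any input, but return early for windowed Project/Calc/Filter. */
  public static boolean newRuleFires(Node aggregateInput) {
    boolean projectCalcOrFilter =
        aggregateInput.kind.equals("Project")
            || aggregateInput.kind.equals("Calc")
            || aggregateInput.kind.equals("Filter");
    if (projectCalcOrFilter && aggregateInput.windowed) {
      // Per the review comment, this case is expected to be handled by BeamAggregationRule.
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    Node filter = new Node("Filter", false);     // models WHERE val > 0 over the input
    Node windowedCalc = new Node("Calc", true);  // models a windowed projection

    System.out.println(oldRuleFires(filter));        // false: aggregate over filter not matched
    System.out.println(newRuleFires(filter));        // true
    System.out.println(newRuleFires(windowedCalc));  // false: left to the windowed rule
  }
}
```

In this model the query from the issue (an aggregate whose input is a filter rather than a scan) is the `filter` case: the old condition never fires for it, which is consistent with the planner reporting that the aggregate node could not be implemented.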
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321528=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321528 ] ASF GitHub Bot logged work on BEAM-6995: Author: ASF GitHub Bot Created on: 01/Oct/19 21:02 Start Date: 01/Oct/19 21:02 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed URL: https://github.com/apache/beam/pull/9703#discussion_r330272966 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamBasicAggregationRule.java ## @@ -40,15 +50,21 @@ public BeamBasicAggregationRule( Class aggregateClass, RelBuilderFactory relBuilderFactory) { -super(operand(aggregateClass, operand(TableScan.class, any())), relBuilderFactory, null); +super(operand(aggregateClass, operand(AbstractRelNode.class, any())), relBuilderFactory, null); Review comment: Looking at examples of this in Calcite, I think `RelNode` is preferable to `AbstractRelNode`. Issue Time Tracking --- Worklog Id: (was: 321528) Time Spent: 50m (was: 40m)
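The diff in the review comment on pull request #9703 widens the rule's child operand from `TableScan` to a general node type. The following is a minimal sketch of why that matters for an aggregate sitting on top of a filter, as in the failing query. It is a toy model, not Calcite code: `Node`, `AbstractNode`, `Scan`, `Filter`, `Aggregate`, and `matches` are hypothetical names, and the sketch assumes that class-based operand matching behaves like `Class.isInstance`.

```java
import java.util.Arrays;
import java.util.List;

public class OperandMatchSketch {
    // Toy stand-ins for planner nodes; these are NOT Calcite classes.
    interface Node {
        List<Node> inputs();
    }

    abstract static class AbstractNode implements Node {
        private final List<Node> inputs;

        AbstractNode(Node... inputs) {
            this.inputs = Arrays.asList(inputs);
        }

        @Override
        public List<Node> inputs() {
            return inputs;
        }
    }

    static final class Scan extends AbstractNode {
        Scan() {
            super();
        }
    }

    static final class Filter extends AbstractNode {
        Filter(Node input) {
            super(input);
        }
    }

    static final class Aggregate extends AbstractNode {
        Aggregate(Node input) {
            super(input);
        }
    }

    // Hypothetical matcher: the node must be a parentType and its first
    // input must be a childType (assumed to mirror Class.isInstance matching).
    static boolean matches(
            Node node, Class<? extends Node> parentType, Class<? extends Node> childType) {
        return parentType.isInstance(node)
                && !node.inputs().isEmpty()
                && childType.isInstance(node.inputs().get(0));
    }

    public static void main(String[] args) {
        Node overScan = new Aggregate(new Scan());
        Node overFilter = new Aggregate(new Filter(new Scan()));

        // A child operand restricted to Scan only fires when the aggregate reads a scan directly.
        System.out.println(matches(overScan, Aggregate.class, Scan.class));   // true
        System.out.println(matches(overFilter, Aggregate.class, Scan.class)); // false

        // Matching on the interface accepts any input node, including the filter.
        System.out.println(matches(overFilter, Aggregate.class, Node.class)); // true
        // Matching on the abstract base class also works here, but excludes any
        // Node implementation that does not extend AbstractNode.
        System.out.println(matches(overFilter, Aggregate.class, AbstractNode.class)); // true
    }
}
```

In this toy model the interface is the widest possible operand type, which is one plausible reading of the reviewer's preference for `RelNode`; the comment itself only cites existing Calcite examples as the reason.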
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321522=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321522 ] ASF GitHub Bot logged work on BEAM-6995: Author: ASF GitHub Bot Created on: 01/Oct/19 20:54 Start Date: 01/Oct/19 20:54 Worklog Time Spent: 10m Work Description: apilloud commented on issue #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed URL: https://github.com/apache/beam/pull/9703#issuecomment-537226095 Run Direct Runner Nexmark Tests Issue Time Tracking --- Worklog Id: (was: 321522) Time Spent: 0.5h (was: 20m) > SQL aggregation with where clause fails to plan > --- > > Key: BEAM-6995 > URL: https://issues.apache.org/jira/browse/BEAM-6995 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.11.0 >Reporter: David McIntosh >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > I'm finding that this code fails with a CannotPlanException listed below. > {code:java} > Schema schema = Schema.builder() > .addInt32Field("id") > .addInt32Field("val") > .build(); > Row row = Row.withSchema(schema).addValues(1, 2).build(); > PCollection inputData = p.apply("row input", > Create.of(row).withRowSchema(schema)); > inputData.apply("sql", > SqlTransform.query( > "SELECT id, SUM(val) " > + "FROM PCOLLECTION " > + "WHERE val > 0 " > + "GROUP BY id"));{code} > If the WHERE clause is removed the code runs successfully. > This may be similar to BEAM-5384 since I was able to work around this by > adding an extra column to the input that isn't reference in the sql. 
> {code:java} > Schema schema = Schema.builder() > .addInt32Field("id") > .addInt32Field("val") > .addInt32Field("extra") > .build();{code} > > {code:java} > org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.plan.RelOptPlanner$CannotPlanException: > Node [rel#100:Subset#2.BEAM_LOGICAL] could not be implemented; planner state: > Root: rel#100:Subset#2.BEAM_LOGICAL > Original rel: > LogicalAggregate(subset=[rel#100:Subset#2.BEAM_LOGICAL], group=[{0}], > EXPR$1=[SUM($1)]): rowcount = 5.0, cumulative cost = {5.687500238418579 rows, > 0.0 cpu, 0.0 io}, id = 98 > LogicalFilter(subset=[rel#97:Subset#1.NONE], condition=[>($1, 0)]): > rowcount = 50.0, cumulative cost = {50.0 rows, 100.0 cpu, 0.0 io}, id = 96 > BeamIOSourceRel(subset=[rel#95:Subset#0.BEAM_LOGICAL], table=[[beam, > PCOLLECTION]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, > 0.0 io}, id = 92 > Sets: > Set#0, type: RecordType(INTEGER id, INTEGER val) > rel#95:Subset#0.BEAM_LOGICAL, best=rel#92, > importance=0.7291 > rel#92:BeamIOSourceRel.BEAM_LOGICAL(table=[beam, > PCOLLECTION]), rowcount=100.0, cumulative cost={100.0 rows, 101.0 cpu, 0.0 io} > rel#110:Subset#0.ENUMERABLE, best=rel#109, > importance=0.36455 > > rel#109:BeamEnumerableConverter.ENUMERABLE(input=rel#95:Subset#0.BEAM_LOGICAL), > rowcount=100.0, cumulative cost={1.7976931348623157E308 rows, > 1.7976931348623157E308 cpu, 1.7976931348623157E308 io} > Set#1, type: RecordType(INTEGER id, INTEGER val) > rel#97:Subset#1.NONE, best=null, importance=0.81 > > rel#96:LogicalFilter.NONE(input=rel#95:Subset#0.BEAM_LOGICAL,condition=>($1, > 0)), rowcount=50.0, cumulative cost={inf} > > rel#102:LogicalCalc.NONE(input=rel#95:Subset#0.BEAM_LOGICAL,expr#0..1={inputs},expr#2=0,expr#3=>($t1, > $t2),id=$t0,val=$t1,$condition=$t3), rowcount=50.0, cumulative cost={inf} > rel#104:Subset#1.BEAM_LOGICAL, best=rel#103, importance=0.405 > > 
rel#103:BeamCalcRel.BEAM_LOGICAL(input=rel#95:Subset#0.BEAM_LOGICAL,expr#0..1={inputs},expr#2=0,expr#3=>($t1, > $t2),id=$t0,val=$t1,$condition=$t3), rowcount=50.0, cumulative cost={150.0 > rows, 801.0 cpu, 0.0 io} > rel#106:Subset#1.ENUMERABLE, best=rel#105, importance=0.405 > > rel#105:BeamEnumerableConverter.ENUMERABLE(input=rel#104:Subset#1.BEAM_LOGICAL), > rowcount=50.0, cumulative cost={1.7976931348623157E308 rows, > 1.7976931348623157E308 cpu, 1.7976931348623157E308 io} > Set#2, type: RecordType(INTEGER id, INTEGER EXPR$1) > rel#99:Subset#2.NONE, best=null, importance=0.9 >
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321440=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321440 ] ASF GitHub Bot logged work on BEAM-6995: Author: ASF GitHub Bot Created on: 01/Oct/19 18:43 Start Date: 01/Oct/19 18:43 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed URL: https://github.com/apache/beam/pull/9703#issuecomment-537174145 R: @apilloud Issue Time Tracking --- Worklog Id: (was: 321440) Time Spent: 20m (was: 10m)
[jira] [Work logged] (BEAM-6995) SQL aggregation with where clause fails to plan
[ https://issues.apache.org/jira/browse/BEAM-6995?focusedWorklogId=321439=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321439 ]

ASF GitHub Bot logged work on BEAM-6995: Author: ASF GitHub Bot Created on: 01/Oct/19 18:41 Start Date: 01/Oct/19 18:41 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9703: [BEAM-6995] Beam basic aggregation rule only when not windowed URL: https://github.com/apache/beam/pull/9703

The Beam basic aggregation rule should not be applied to Calc, Project, or Filter nodes when they or their parents use windowed functions.

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

 - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
 - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
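The pull request description says the basic aggregation rule should be skipped when windowed functions are involved. A minimal sketch of such a guard follows. It is not the code from #9703: `WindowGuardSketch`, `usesWindowFunction`, and `basicAggregationApplies` are hypothetical names, expressions are modelled as plain strings, and the sketch assumes that windowing is signalled by a call to one of the Beam SQL functions `TUMBLE`, `HOP`, or `SESSION`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class WindowGuardSketch {
    // Assumption of this sketch: a call to one of these functions marks an expression as windowed.
    private static final List<String> WINDOW_FUNCTIONS = Arrays.asList("TUMBLE", "HOP", "SESSION");

    // Crude textual check; a real planner rule would inspect the expression tree instead.
    static boolean usesWindowFunction(List<String> expressions) {
        for (String expression : expressions) {
            String upper = expression.toUpperCase(Locale.ROOT);
            for (String function : WINDOW_FUNCTIONS) {
                if (upper.contains(function + "(")) {
                    return true;
                }
            }
        }
        return false;
    }

    // Hypothetical guard: the basic (non-windowed) aggregation path applies only
    // if neither the aggregate's grouping nor its input uses a window function.
    static boolean basicAggregationApplies(
            List<String> groupByExpressions, List<String> inputExpressions) {
        return !usesWindowFunction(groupByExpressions) && !usesWindowFunction(inputExpressions);
    }

    public static void main(String[] args) {
        // The query from the issue: GROUP BY id over a filtered input, no windowing.
        System.out.println(
                basicAggregationApplies(Arrays.asList("id"), Arrays.asList("val > 0")));

        // A grouping that includes a tumbling window should take the windowed path instead.
        System.out.println(
                basicAggregationApplies(
                        Arrays.asList("id", "TUMBLE(ts, INTERVAL '1' HOUR)"),
                        Arrays.asList("ts")));
    }
}
```

The first call prints `true` and the second prints `false`. The string matching is deliberately simplistic and would misfire on identifiers that merely end in one of the function names.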
Post-Commit Tests Status (on master branch)

Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
--- | --- | --- | --- | --- | --- | --- | ---
Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)
Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)[![Build