[jira] [Commented] (CALCITE-3366) RelDecorrelator supports Union
[ https://issues.apache.org/jira/browse/CALCITE-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351626#comment-17351626 ] Jin Xing commented on CALCITE-3366: --- I means it's good if some other people could take this jira and continue the work, since the issue is valid. pr-2370 resolved conflicts of pr-1468. But pr-2370 has also been out dated for quite a while. It's great if @xzh_dz could continue the work. > RelDecorrelator supports Union > -- > > Key: CALCITE-3366 > URL: https://issues.apache.org/jira/browse/CALCITE-3366 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 4h 10m > Remaining Estimate: 0h > > This issue proposes to support decorrelation for below sql > {code:java} > SELECT deptno FROM dept where exists > (SELECT 1 FROM emp where sal < 100 and emp.deptno=dept.deptno > union all > SELECT 1 FROM emp where sal > 200 and emp.deptno=dept.deptno){code} > This issue was found when I resolve CALCITE-3363 in > https://github.com/apache/calcite/pull/1466 > I failed to construct an semi-join operator from SQL string. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3366) RelDecorrelator supports Union
[ https://issues.apache.org/jira/browse/CALCITE-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351545#comment-17351545 ] Jin Xing commented on CALCITE-3366: --- I closed pr-1468 for now. It's good if someone could take over. > RelDecorrelator supports Union > -- > > Key: CALCITE-3366 > URL: https://issues.apache.org/jira/browse/CALCITE-3366 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 4h 10m > Remaining Estimate: 0h > > This issue proposes to support decorrelation for below sql > {code:java} > SELECT deptno FROM dept where exists > (SELECT 1 FROM emp where sal < 100 and emp.deptno=dept.deptno > union all > SELECT 1 FROM emp where sal > 200 and emp.deptno=dept.deptno){code} > This issue was found when I resolve CALCITE-3363 in > https://github.com/apache/calcite/pull/1466 > I failed to construct an semi-join operator from SQL string. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3935) Materialization-Failed, when querying with LeftJoinWithFilter
[ https://issues.apache.org/jira/browse/CALCITE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134120#comment-17134120 ] Jin Xing edited comment on CALCITE-3935 at 6/12/20, 10:46 AM: -- Please refine the Jira title and give some basic analysis in the description. was (Author: jinxing6...@126.com): Refine the Jira title and give some basic analysis in the description. > Materialization-Failed, when querying with LeftJoinWithFilter > - > > Key: CALCITE-3935 > URL: https://issues.apache.org/jira/browse/CALCITE-3935 > Project: Calcite > Issue Type: Bug >Reporter: Xurenhe >Priority: Major > Attachments: Jietu20200417-200532.png > > Time Spent: 20m > Remaining Estimate: 0h > > {code:java} > @Test public void testJoinOnLeftProjectWithFilterToJoin() { > String mv = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 10"; > String query = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 40"; > sql(mv, query).withOnlyBySubstitution(true).ok(); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3935) Materialization-Failed, when querying with LeftJoinWithFilter
[ https://issues.apache.org/jira/browse/CALCITE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134120#comment-17134120 ] Jin Xing commented on CALCITE-3935: --- Refine the Jira title and give some basic analysis in the description. > Materialization-Failed, when querying with LeftJoinWithFilter > - > > Key: CALCITE-3935 > URL: https://issues.apache.org/jira/browse/CALCITE-3935 > Project: Calcite > Issue Type: Bug >Reporter: Xurenhe >Priority: Major > Attachments: Jietu20200417-200532.png > > Time Spent: 20m > Remaining Estimate: 0h > > {code:java} > @Test public void testJoinOnLeftProjectWithFilterToJoin() { > String mv = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 10"; > String query = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 40"; > sql(mv, query).withOnlyBySubstitution(true).ok(); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-4040) Materialized View matching failed with NPE if aggregate function doesn't support roll up
[ https://issues.apache.org/jira/browse/CALCITE-4040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-4040: -- Summary: Materialized View matching failed with NPE if aggregate function doesn't support roll up (was: When an aggregate function does not support roll up, materialization recognition should fail and return the original relnode instead of throwing an exception.) > Materialized View matching failed with NPE if aggregate function doesn't > support roll up > > > Key: CALCITE-4040 > URL: https://issues.apache.org/jira/browse/CALCITE-4040 > Project: Calcite > Issue Type: Wish >Reporter: xzh_dz >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > When i try to rollup some SqlAggFunctions in my project,I find something > wrong. > A case can be reproduced as below: > MaterializationTest: > {code:java} > @Test public void testSqlAggFunctionRollup() { > checkNoMaterialize( > "select \"empid\", stddev_pop(\"deptno\") from \"emps\" group by > \"empid\", \"deptno\"", > "select \"empid\", stddev_pop(\"deptno\") from \"emps\" group by > \"empid\"", > HR_FKUK_MODEL); > } > {code} > When an aggregate function does not support roll up, materialization > recognition should fail and return the original relnode instead of throwing > an exception. > Exception: > {code:java} > java.sql.SQLException: Error while executing SQL "explain plan for select > "empid", stddev_pop("deptno") from "emps" group by "empid"": null > at org.apache.calcite.avatica.Helper.createException(Helper.java:56) > at org.apache.calcite.avatica.Helper.createException(Helper.java:41) > at > org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:163) > at > org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:227) > at > org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:528) > Caused by: java.lang.NullPointerException > at java.util.Objects.requireNonNull(Objects.java:203) > at > org.apache.calcite.rel.core.AggregateCall.(AggregateCall.java:98) > at > org.apache.calcite.rel.core.AggregateCall.create(AggregateCall.java:198) > at > org.apache.calcite.plan.SubstitutionVisitor.unifyAggregates(SubstitutionVisitor.java:1854) > at > org.apache.calcite.plan.SubstitutionVisitor$AggregateToAggregateUnifyRule.apply(SubstitutionVisitor.java:1545) > at > org.apache.calcite.plan.SubstitutionVisitor.go(SubstitutionVisitor.java:544) > at > org.apache.calcite.plan.SubstitutionVisitor.go(SubstitutionVisitor.java:478) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3786) Add Digest interface to enable efficient hashCode(equals) for RexNode and RelNode
[ https://issues.apache.org/jira/browse/CALCITE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17132926#comment-17132926 ] Jin Xing commented on CALCITE-3786: --- In my environment, some users use program to generate big queries (about 10k lines) and suffers memory issue a lot. The reuse of instance is the pain point and I agree with [~hyuan] that a thorough fix is to have the ability of deduplication inside planner. Really hope we can move forward on this direction. The approach proposed in the patch ease the memory issue and provide a lighter way to do shallow comparison by cached hashcode. It's very good and I can feel the benefit but I have the same concern – – does the cached hashcode has to be bound with Digest ? > Add Digest interface to enable efficient hashCode(equals) for RexNode and > RelNode > - > > Key: CALCITE-3786 > URL: https://issues.apache.org/jira/browse/CALCITE-3786 > Project: Calcite > Issue Type: New Feature > Components: core >Affects Versions: 1.21.0 >Reporter: Vladimir Sitnikov >Assignee: Danny Chen >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > Current digests for RexNode, RelNode, RelType, and similar cases use String > concatenation. > It is easy to implement, however, it has drawbacks: > 1) String objects cannot be reused. For instance, RexCall has operands, > however, the digest is duplicated. It causes extra memory use and extra CPU > for string copying > 2) There's no way to have multiple #toString() methods. RelType might need > multiple digests: "including field names", "excluding field names". > A suggested resolution might be behind the lines of > {code:java} > class Digest { // immutable > final int hashCode; // speedup hashCode and equals > final Object[] contents; // The values are either other Digest objects or > Strings > String toString(); // e.g. for debugging purposes > int compareTo(Digest); // e.g. for debugging purposes. > } > {code} > Note how fields in Kotlin are aligned much better, and it makes it easier to > read: > {code:java} > class Digest { // immutable > val hashCode: Int // speedup hashCode and equals > val contents: Array // The values are either other Digest objects or > Strings > fun toString(): String // e.g. for debugging purposes > fun compareTo(other: Digest): Int // e.g. for debugging purposes. > } > {code} > Then the digest for RexCall could be the bits relevant to RexCall itself + > digests of the operands (which can be reused as is) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-4022) After SqlInsert is traversed by SqlShuttle, SqlBasicCall throws UnsupportedOperationException
[ https://issues.apache.org/jira/browse/CALCITE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17116348#comment-17116348 ] Jin Xing commented on CALCITE-4022: --- Can you please give the Sql and the stacktrace ? Or some description if possible ? > After SqlInsert is traversed by SqlShuttle, SqlBasicCall throws > UnsupportedOperationException > - > > Key: CALCITE-4022 > URL: https://issues.apache.org/jira/browse/CALCITE-4022 > Project: Calcite > Issue Type: Wish >Reporter: xzh_dz >Priority: Major > Attachments: image-2020-05-25-17-38-36-009.png > > > {code:java} > SqlNode a = sqlNode.accept(new SqlShuttle() { > @Override public SqlNode visit(SqlIdentifier identifier) { > return new SqlIdentifier(identifier.names, > identifier.getParserPosition()); > } > }); > {code} > !image-2020-05-25-17-38-36-009.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-2997) Avoid pushing down join condition in SqlToRelConverter
[ https://issues.apache.org/jira/browse/CALCITE-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111699#comment-17111699 ] Jin Xing edited comment on CALCITE-2997 at 5/20/20, 2:36 AM: - Thanks [~swtalbot] and [~julianhyde] to bring this up ~ Yes, In some scenarios we need SqlToRelConverter to return a logical plan which corresponds to original SQL String exactly. Mixing optimizations into the process of building logical plan sometimes make the user confused. I would +1 on Julian's idea – – move the rewrite/optimize code into RelBuilder and add a config to enable/disable, but several concerns as below: # Should we disable the rewrite/optimize in SqlToRelConverter by default ? Calcite have clear phases for the lifecycle of a SQL. SqlToRelConverter can return a purely reflection of the original SQL string, and let the optimization/rewrite done in the following optimizing phase; # In addition to pushDownJoinConditions, there are other rewrites like pushDownNotForIn, shall we disable these optimizing work by default or even move out of SqlToRelConverter to the optimizaing phase ? was (Author: jinxing6...@126.com): Thanks [~swtalbot] and [~julianhyde] to bring this up ~ Yes, In some scenarios we need SqlToRelConverter to return a logical plan which corresponds to original SQL String exactly. Mixing optimizations into the process of building logical plan sometimes make the user confused. I would +1 on Julian's idea – – move the rewrite/optimize code into RelBuilder and add a config to enable/disable, but several concerns as below: # Should we disable the rewrite/optimize in SqlToRelConverter by default ? Calcite have clear phases for the lifecycle of a SQL. SqlToRelConverter can return a purely reflection of the original SQL string, and let the optimization/rewrite done in the following optimizing phase; # In addition to pushDownJoinConditions, there are other rewrites like pushDownNotForIn, shall we disable these optimizing work or even move out of SqlToRelConverter to the optimizaing phase ? > Avoid pushing down join condition in SqlToRelConverter > -- > > Key: CALCITE-2997 > URL: https://issues.apache.org/jira/browse/CALCITE-2997 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Assignee: Julian Hyde >Priority: Major > > In current code, *SqlToRelConverter:createJoin* is calling > *RelOptUtil.pushDownJoinConditions* for optimization. And we can find below > conversion from *SqlNode* to *RelNode*: > {code:java} > SqlNode: > select * from A join B on A.x = B.x * 2 > RelNode (Logical-Plan): > Join (condition:col0=col1) > |-Project(x as col0) > | |-Scan(A) > |-Project(x * 2 as col1) > |-Scan(B){code} > As we can see the logical plan(*RelNode*) posted above is not the pure > reflection of the original SQL String(*SqlNode*). The optimization is mixed > into the phase on which AST is converted to Logical-Plan. Actually optimizing > rule of JoinPushExpressionsRule is doing exactly the same kind of thing. > Shall we just keep the optimization inside Optimized-Logical-Plan ? I mean > shall we avoid calling *RelOptUtil.pushDownJoinConditions* in > *SqlToRelConverter:createJoin* > I raised this issue because that we are doing something based on the > Logical-Plan. And it makes us really confused that the Logical-Plan doesn't > corresponds to SqlNode. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-2997) Avoid pushing down join condition in SqlToRelConverter
[ https://issues.apache.org/jira/browse/CALCITE-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111699#comment-17111699 ] Jin Xing commented on CALCITE-2997: --- Thanks [~swtalbot] and [~julianhyde] to bring this up ~ Yes, In some scenarios we need SqlToRelConverter to return a logical plan which corresponds to original SQL String exactly. Mixing optimizations into the process of building logical plan sometimes make the user confused. I would +1 on Julian's idea – – move the rewrite/optimize code into RelBuilder and add a config to enable/disable, but several concerns as below: # Should we disable the rewrite/optimize in SqlToRelConverter by default ? Calcite have clear phases for the lifecycle of a SQL. SqlToRelConverter can return a purely reflection of the original SQL string, and let the optimization/rewrite done in the following optimizing phase; # In addition to pushDownJoinConditions, there are other rewrites like pushDownNotForIn, shall we disable these optimizing work or even move out of SqlToRelConverter to the optimizaing phase ? > Avoid pushing down join condition in SqlToRelConverter > -- > > Key: CALCITE-2997 > URL: https://issues.apache.org/jira/browse/CALCITE-2997 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Assignee: Julian Hyde >Priority: Major > > In current code, *SqlToRelConverter:createJoin* is calling > *RelOptUtil.pushDownJoinConditions* for optimization. And we can find below > conversion from *SqlNode* to *RelNode*: > {code:java} > SqlNode: > select * from A join B on A.x = B.x * 2 > RelNode (Logical-Plan): > Join (condition:col0=col1) > |-Project(x as col0) > | |-Scan(A) > |-Project(x * 2 as col1) > |-Scan(B){code} > As we can see the logical plan(*RelNode*) posted above is not the pure > reflection of the original SQL String(*SqlNode*). The optimization is mixed > into the phase on which AST is converted to Logical-Plan. Actually optimizing > rule of JoinPushExpressionsRule is doing exactly the same kind of thing. > Shall we just keep the optimization inside Optimized-Logical-Plan ? I mean > shall we avoid calling *RelOptUtil.pushDownJoinConditions* in > *SqlToRelConverter:createJoin* > I raised this issue because that we are doing something based on the > Logical-Plan. And it makes us really confused that the Logical-Plan doesn't > corresponds to SqlNode. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-4003) In MaterializationTest, FilterProjectTransposeRule matches with logical and physical convention
[ https://issues.apache.org/jira/browse/CALCITE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109081#comment-17109081 ] Jin Xing commented on CALCITE-4003: --- Hi, [~hyuan] , thanks for ping me ~ In my project, all optimizations are done during logical stage, including view matching optimization and common RelOptRule matching. In addition, RelNode transformations are totally independent between conventions. Before coming to Calcite world ~ I worked on Spark Sql for some time. LogicalPlan transformations and PhysicalPlan transformations are also seperated. "In case of multiple conventions, there will be a converter between different conventions.", I'm with you by this point. In our data lake scenario, we do federated query between several physical engines. We have a primary engine for distributed computation (in our case Spark). And converters will be transformed to data source, which ingest data into Spark from other engines. Common optimizing rules don't work across the converter. Cross convention rules are only needed for some special cases. I agree to keep the ability for cross convention rules for special cases, but for built-in transformation rules, I think we will gain a lot if we limit them in single convention, thus to save tons of unnecessary matching. > In MaterializationTest, FilterProjectTransposeRule matches with logical and > physical convention > --- > > Key: CALCITE-4003 > URL: https://issues.apache.org/jira/browse/CALCITE-4003 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Haisheng Yuan >Priority: Major > > In MaterializationTest.testMaterializationSubstitution2, > FilterProjectTransposeRule matches with logical and physical convention at > the same time, that means, LogicalFilter and EnumerableProject. We should > check and prevent this from happening. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3366) RelDecorrelator supports Union
[ https://issues.apache.org/jira/browse/CALCITE-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103065#comment-17103065 ] Jin Xing commented on CALCITE-3366: --- Thanks [~Juhwan] There's a PR there. You may take a look or comment. We can discuss if you have some idea on it :D > RelDecorrelator supports Union > -- > > Key: CALCITE-3366 > URL: https://issues.apache.org/jira/browse/CALCITE-3366 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > > This issue proposes to support decorrelation for below sql > {code:java} > SELECT deptno FROM dept where exists > (SELECT 1 FROM emp where sal < 100 and emp.deptno=dept.deptno > union all > SELECT 1 FROM emp where sal > 200 and emp.deptno=dept.deptno){code} > This issue was found when I resolve CALCITE-3363 in > https://github.com/apache/calcite/pull/1466 > I failed to construct an semi-join operator from SQL string. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3958) revise description of gradle.properties and SubstitutionVisitor
[ https://issues.apache.org/jira/browse/CALCITE-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092551#comment-17092551 ] Jin Xing commented on CALCITE-3958: --- # Seems artifact.name and description in gradle.properties are wrong. # Regarding the SubstituionVisitor, do you mean the list of supported operators [1] ? [1] [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java#L116] > revise description of gradle.properties and SubstitutionVisitor > --- > > Key: CALCITE-3958 > URL: https://issues.apache.org/jira/browse/CALCITE-3958 > Project: Calcite > Issue Type: Wish >Reporter: xzh_dz >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > 1. revise cassandra and piglet gradle.properties > 2. revise doc of SubstitutionVisitor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3959) Implement INSTR function
[ https://issues.apache.org/jira/browse/CALCITE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17092544#comment-17092544 ] Jin Xing commented on CALCITE-3959: --- Do we also need to update reference.md ? BTW, please link the PR into this Jira. > Implement INSTR function > > > Key: CALCITE-3959 > URL: https://issues.apache.org/jira/browse/CALCITE-3959 > Project: Calcite > Issue Type: Wish >Reporter: xzh_dz >Priority: Major > > Implement INSTR function -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (CALCITE-3950) Doc of SqlGroupingFunction contradicts its behavior
[ https://issues.apache.org/jira/browse/CALCITE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing reassigned CALCITE-3950: - Assignee: Jin Xing > Doc of SqlGroupingFunction contradicts its behavior > --- > > Key: CALCITE-3950 > URL: https://issues.apache.org/jira/browse/CALCITE-3950 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently doc of SqlGroupingFunctions says: > {code:java} > /** > * The {@code GROUPING} function. > * > * Accepts 1 or more arguments. > * Example: {@code GROUPING(deptno, gender)} returns > * 3 if both deptno and gender are being grouped, > * 2 if only deptno is being grouped, > * 1 if only gender is being groped, > * 0 if neither deptno nor gender are being grouped.{code} > But its behavior in agg.iq is as below: > {code:java} > # GROUPING in SELECT clause of CUBE query > select deptno, job, count(*) as c, grouping(deptno) as d, > grouping(job) j, grouping(deptno, job) as x > from "scott".emp > group by cube(deptno, job); > ++---++---+---+---+ > | DEPTNO | JOB | C | D | J | X | > ++---++---+---+---+ > | 10 | CLERK | 1 | 0 | 0 | 0 | > | 10 | MANAGER | 1 | 0 | 0 | 0 | > | 10 | PRESIDENT | 1 | 0 | 0 | 0 | > | 10 | | 3 | 0 | 1 | 1 | > | 20 | ANALYST | 2 | 0 | 0 | 0 | > | 20 | CLERK | 2 | 0 | 0 | 0 | > | 20 | MANAGER | 1 | 0 | 0 | 0 | > | 20 | | 5 | 0 | 1 | 1 | > | 30 | CLERK | 1 | 0 | 0 | 0 | > | 30 | MANAGER | 1 | 0 | 0 | 0 | > | 30 | SALESMAN | 4 | 0 | 0 | 0 | > | 30 | | 6 | 0 | 1 | 1 | > || ANALYST | 2 | 1 | 0 | 2 | > || CLERK | 4 | 1 | 0 | 2 | > || MANAGER | 3 | 1 | 0 | 2 | > || PRESIDENT | 1 | 1 | 0 | 2 | > || SALESMAN | 4 | 1 | 0 | 2 | > || | 14 | 1 | 1 | 3 | > ++---++---+---+---+ > (18 rows) > {code} > > The doc needs to be rectified thus to be consistent with query result and the > behavior of Hive[1] and PostgreSQL[2] > [1] > [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction] > [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3950) Doc of SqlGroupingFunction contradicts its behavior
[ https://issues.apache.org/jira/browse/CALCITE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3950: -- Summary: Doc of SqlGroupingFunction contradicts its behavior (was: Doc of SqlGroupingFunction contradicts with its behavior) > Doc of SqlGroupingFunction contradicts its behavior > --- > > Key: CALCITE-3950 > URL: https://issues.apache.org/jira/browse/CALCITE-3950 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > > Currently doc of SqlGroupingFunctions says: > {code:java} > /** > * The {@code GROUPING} function. > * > * Accepts 1 or more arguments. > * Example: {@code GROUPING(deptno, gender)} returns > * 3 if both deptno and gender are being grouped, > * 2 if only deptno is being grouped, > * 1 if only gender is being groped, > * 0 if neither deptno nor gender are being grouped.{code} > But its behavior in agg.iq is as below: > {code:java} > # GROUPING in SELECT clause of CUBE query > select deptno, job, count(*) as c, grouping(deptno) as d, > grouping(job) j, grouping(deptno, job) as x > from "scott".emp > group by cube(deptno, job); > ++---++---+---+---+ > | DEPTNO | JOB | C | D | J | X | > ++---++---+---+---+ > | 10 | CLERK | 1 | 0 | 0 | 0 | > | 10 | MANAGER | 1 | 0 | 0 | 0 | > | 10 | PRESIDENT | 1 | 0 | 0 | 0 | > | 10 | | 3 | 0 | 1 | 1 | > | 20 | ANALYST | 2 | 0 | 0 | 0 | > | 20 | CLERK | 2 | 0 | 0 | 0 | > | 20 | MANAGER | 1 | 0 | 0 | 0 | > | 20 | | 5 | 0 | 1 | 1 | > | 30 | CLERK | 1 | 0 | 0 | 0 | > | 30 | MANAGER | 1 | 0 | 0 | 0 | > | 30 | SALESMAN | 4 | 0 | 0 | 0 | > | 30 | | 6 | 0 | 1 | 1 | > || ANALYST | 2 | 1 | 0 | 2 | > || CLERK | 4 | 1 | 0 | 2 | > || MANAGER | 3 | 1 | 0 | 2 | > || PRESIDENT | 1 | 1 | 0 | 2 | > || SALESMAN | 4 | 1 | 0 | 2 | > || | 14 | 1 | 1 | 3 | > ++---++---+---+---+ > (18 rows) > {code} > > The doc needs to be rectified thus to be consistent with query result and the > behavior of Hive[1] and PostgreSQL[2] > [1] > [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction] > [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount
[ https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090431#comment-17090431 ] Jin Xing commented on CALCITE-3952: --- Yes, [~julianhyde], it should be getMaxRowCount :D > Improve SortRemoveRule to remove Sort based on rowcount > --- > > Key: CALCITE-3952 > URL: https://issues.apache.org/jira/browse/CALCITE-3952 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > > If a query is guaranteed to produce maximum one row it is safe to remove Sort > (along with limit). > Example: > {code:sql} > select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order > by cs limit 100 > {code} > Although logically equivalent this can greatly benefit physical plans by > removing extra operator and avoiding unnecessary data transfer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount
[ https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090322#comment-17090322 ] Jin Xing commented on CALCITE-3952: --- Given a rel which emit at most 1 row (RelMetadataQuery.getRowCount<=1), should RelMetadataQuery.collations(rel) match the Sort order ? If so the operator of Sort can be removed automatically. > Improve SortRemoveRule to remove Sort based on rowcount > --- > > Key: CALCITE-3952 > URL: https://issues.apache.org/jira/browse/CALCITE-3952 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > > If a query is guaranteed to produce maximum one row it is safe to remove Sort > (along with limit). > Example: > {code:sql} > select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order > by cs limit 100 > {code} > Although logically equivalent this can greatly benefit physical plans by > removing extra operator and avoiding unnecessary data transfer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Issue Comment Deleted] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount
[ https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3952: -- Comment: was deleted (was: Given a rel which emit at most 1 row (RelMetadataQuery.getRowCount(rel) <=1), besides the optimization on Sort, seems we can also optimize other operators like Aggregate.) > Improve SortRemoveRule to remove Sort based on rowcount > --- > > Key: CALCITE-3952 > URL: https://issues.apache.org/jira/browse/CALCITE-3952 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > > If a query is guaranteed to produce maximum one row it is safe to remove Sort > (along with limit). > Example: > {code:sql} > select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order > by cs limit 100 > {code} > Although logically equivalent this can greatly benefit physical plans by > removing extra operator and avoiding unnecessary data transfer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount
[ https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090309#comment-17090309 ] Jin Xing commented on CALCITE-3952: --- Given a rel which emit at most 1 row (RelMetadataQuery.getRowCount(rel) <=1), besides the optimization on Sort, seems we can also optimize other operators like Aggregate. > Improve SortRemoveRule to remove Sort based on rowcount > --- > > Key: CALCITE-3952 > URL: https://issues.apache.org/jira/browse/CALCITE-3952 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Vineet Garg >Assignee: Vineet Garg >Priority: Major > > If a query is guaranteed to produce maximum one row it is safe to remove Sort > (along with limit). > Example: > {code:sql} > select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order > by cs limit 100 > {code} > Although logically equivalent this can greatly benefit physical plans by > removing extra operator and avoiding unnecessary data transfer. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3758) FilterTableScanRule generate wrong mapping for filter condition when underlying is BindableTableScan
[ https://issues.apache.org/jira/browse/CALCITE-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090185#comment-17090185 ] Jin Xing commented on CALCITE-3758: --- Thanks for merging ! > FilterTableScanRule generate wrong mapping for filter condition when > underlying is BindableTableScan > > > Key: CALCITE-3758 > URL: https://issues.apache.org/jira/browse/CALCITE-3758 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Fix For: 1.23.0 > > Time Spent: 40m > Remaining Estimate: 0h > > When FilterTableScanRule push filter condition into > ProjectableFilterableTable, the filter condition should be adjusted to be > based on the internal table of BindableTableScan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090180#comment-17090180 ] Jin Xing commented on CALCITE-3936: --- Yes [~julianhyde], the generated SQL is invalid. We need to fix the bug when convert rel to sql. > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Priority: Major > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3948) Improve operand's RelSubset matching handling in VolcanoRuleCall
[ https://issues.apache.org/jira/browse/CALCITE-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089563#comment-17089563 ] Jin Xing edited comment on CALCITE-3948 at 4/22/20, 11:11 AM: -- [~Chunwei Lei] Yes, I've also found some PRs failed to be linked in Jira recently. I manually made a link for this Jira. was (Author: jinxing6...@126.com): [~Chunwei Lei] Yes, I've also found some PRs failed to be linked in Jira recently. > Improve operand's RelSubset matching handling in VolcanoRuleCall > > > Key: CALCITE-3948 > URL: https://issues.apache.org/jira/browse/CALCITE-3948 > Project: Calcite > Issue Type: Improvement >Reporter: Botong Huang >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > For operands matching for a RelSubset, more handling under various cases are > needed to be consistent in VolcanoRuleCall -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3948) Improve operand's RelSubset matching handling in VolcanoRuleCall
[ https://issues.apache.org/jira/browse/CALCITE-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089563#comment-17089563 ] Jin Xing commented on CALCITE-3948: --- [~Chunwei Lei] Yes, I've also found some PRs failed to be linked in Jira recently. > Improve operand's RelSubset matching handling in VolcanoRuleCall > > > Key: CALCITE-3948 > URL: https://issues.apache.org/jira/browse/CALCITE-3948 > Project: Calcite > Issue Type: Improvement >Reporter: Botong Huang >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > For operands matching for a RelSubset, more handling under various cases are > needed to be consistent in VolcanoRuleCall -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3950) Doc of SqlGroupingFunction contradicts with its behavior
[ https://issues.apache.org/jira/browse/CALCITE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3950: -- Description: Currently doc of SqlGroupingFunctions says: {code:java} /** * The {@code GROUPING} function. * * Accepts 1 or more arguments. * Example: {@code GROUPING(deptno, gender)} returns * 3 if both deptno and gender are being grouped, * 2 if only deptno is being grouped, * 1 if only gender is being groped, * 0 if neither deptno nor gender are being grouped.{code} But its behavior in agg.iq is as below: {code:java} # GROUPING in SELECT clause of CUBE query select deptno, job, count(*) as c, grouping(deptno) as d, grouping(job) j, grouping(deptno, job) as x from "scott".emp group by cube(deptno, job); ++---++---+---+---+ | DEPTNO | JOB | C | D | J | X | ++---++---+---+---+ | 10 | CLERK | 1 | 0 | 0 | 0 | | 10 | MANAGER | 1 | 0 | 0 | 0 | | 10 | PRESIDENT | 1 | 0 | 0 | 0 | | 10 | | 3 | 0 | 1 | 1 | | 20 | ANALYST | 2 | 0 | 0 | 0 | | 20 | CLERK | 2 | 0 | 0 | 0 | | 20 | MANAGER | 1 | 0 | 0 | 0 | | 20 | | 5 | 0 | 1 | 1 | | 30 | CLERK | 1 | 0 | 0 | 0 | | 30 | MANAGER | 1 | 0 | 0 | 0 | | 30 | SALESMAN | 4 | 0 | 0 | 0 | | 30 | | 6 | 0 | 1 | 1 | || ANALYST | 2 | 1 | 0 | 2 | || CLERK | 4 | 1 | 0 | 2 | || MANAGER | 3 | 1 | 0 | 2 | || PRESIDENT | 1 | 1 | 0 | 2 | || SALESMAN | 4 | 1 | 0 | 2 | || | 14 | 1 | 1 | 3 | ++---++---+---+---+ (18 rows) {code} The doc need to be rectified thus to be consistent with query result and the behavior of Hive[1] and PostgreSQL[2] [1] [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction] [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] > Doc of SqlGroupingFunction contradicts with its behavior > > > Key: CALCITE-3950 > URL: https://issues.apache.org/jira/browse/CALCITE-3950 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > > Currently doc of SqlGroupingFunctions says: > {code:java} > /** > * The {@code GROUPING} function. > * > * Accepts 1 or more arguments. > * Example: {@code GROUPING(deptno, gender)} returns > * 3 if both deptno and gender are being grouped, > * 2 if only deptno is being grouped, > * 1 if only gender is being groped, > * 0 if neither deptno nor gender are being grouped.{code} > But its behavior in agg.iq is as below: > {code:java} > # GROUPING in SELECT clause of CUBE query > select deptno, job, count(*) as c, grouping(deptno) as d, > grouping(job) j, grouping(deptno, job) as x > from "scott".emp > group by cube(deptno, job); > ++---++---+---+---+ > | DEPTNO | JOB | C | D | J | X | > ++---++---+---+---+ > | 10 | CLERK | 1 | 0 | 0 | 0 | > | 10 | MANAGER | 1 | 0 | 0 | 0 | > | 10 | PRESIDENT | 1 | 0 | 0 | 0 | > | 10 | | 3 | 0 | 1 | 1 | > | 20 | ANALYST | 2 | 0 | 0 | 0 | > | 20 | CLERK | 2 | 0 | 0 | 0 | > | 20 | MANAGER | 1 | 0 | 0 | 0 | > | 20 | | 5 | 0 | 1 | 1 | > | 30 | CLERK | 1 | 0 | 0 | 0 | > | 30 | MANAGER | 1 | 0 | 0 | 0 | > | 30 | SALESMAN | 4 | 0 | 0 | 0 | > | 30 | | 6 | 0 | 1 | 1 | > || ANALYST | 2 | 1 | 0 | 2 | > || CLERK | 4 | 1 | 0 | 2 | > || MANAGER | 3 | 1 | 0 | 2 | > || PRESIDENT | 1 | 1 | 0 | 2 | > || SALESMAN | 4 | 1 | 0 | 2 | > || | 14 | 1 | 1 | 3 | > ++---++---+---+---+ > (18 rows) > {code} > > The doc need to be rectified thus to be consistent with query result and the > behavior of Hive[1] and PostgreSQL[2] > [1] > [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction] > [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3950) Doc of SqlGroupingFunction contradicts with its behavior
[ https://issues.apache.org/jira/browse/CALCITE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3950: -- Description: Currently doc of SqlGroupingFunctions says: {code:java} /** * The {@code GROUPING} function. * * Accepts 1 or more arguments. * Example: {@code GROUPING(deptno, gender)} returns * 3 if both deptno and gender are being grouped, * 2 if only deptno is being grouped, * 1 if only gender is being groped, * 0 if neither deptno nor gender are being grouped.{code} But its behavior in agg.iq is as below: {code:java} # GROUPING in SELECT clause of CUBE query select deptno, job, count(*) as c, grouping(deptno) as d, grouping(job) j, grouping(deptno, job) as x from "scott".emp group by cube(deptno, job); ++---++---+---+---+ | DEPTNO | JOB | C | D | J | X | ++---++---+---+---+ | 10 | CLERK | 1 | 0 | 0 | 0 | | 10 | MANAGER | 1 | 0 | 0 | 0 | | 10 | PRESIDENT | 1 | 0 | 0 | 0 | | 10 | | 3 | 0 | 1 | 1 | | 20 | ANALYST | 2 | 0 | 0 | 0 | | 20 | CLERK | 2 | 0 | 0 | 0 | | 20 | MANAGER | 1 | 0 | 0 | 0 | | 20 | | 5 | 0 | 1 | 1 | | 30 | CLERK | 1 | 0 | 0 | 0 | | 30 | MANAGER | 1 | 0 | 0 | 0 | | 30 | SALESMAN | 4 | 0 | 0 | 0 | | 30 | | 6 | 0 | 1 | 1 | || ANALYST | 2 | 1 | 0 | 2 | || CLERK | 4 | 1 | 0 | 2 | || MANAGER | 3 | 1 | 0 | 2 | || PRESIDENT | 1 | 1 | 0 | 2 | || SALESMAN | 4 | 1 | 0 | 2 | || | 14 | 1 | 1 | 3 | ++---++---+---+---+ (18 rows) {code} The doc needs to be rectified thus to be consistent with query result and the behavior of Hive[1] and PostgreSQL[2] [1] [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction] [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] was: Currently doc of SqlGroupingFunctions says: {code:java} /** * The {@code GROUPING} function. * * Accepts 1 or more arguments. * Example: {@code GROUPING(deptno, gender)} returns * 3 if both deptno and gender are being grouped, * 2 if only deptno is being grouped, * 1 if only gender is being groped, * 0 if neither deptno nor gender are being grouped.{code} But its behavior in agg.iq is as below: {code:java} # GROUPING in SELECT clause of CUBE query select deptno, job, count(*) as c, grouping(deptno) as d, grouping(job) j, grouping(deptno, job) as x from "scott".emp group by cube(deptno, job); ++---++---+---+---+ | DEPTNO | JOB | C | D | J | X | ++---++---+---+---+ | 10 | CLERK | 1 | 0 | 0 | 0 | | 10 | MANAGER | 1 | 0 | 0 | 0 | | 10 | PRESIDENT | 1 | 0 | 0 | 0 | | 10 | | 3 | 0 | 1 | 1 | | 20 | ANALYST | 2 | 0 | 0 | 0 | | 20 | CLERK | 2 | 0 | 0 | 0 | | 20 | MANAGER | 1 | 0 | 0 | 0 | | 20 | | 5 | 0 | 1 | 1 | | 30 | CLERK | 1 | 0 | 0 | 0 | | 30 | MANAGER | 1 | 0 | 0 | 0 | | 30 | SALESMAN | 4 | 0 | 0 | 0 | | 30 | | 6 | 0 | 1 | 1 | || ANALYST | 2 | 1 | 0 | 2 | || CLERK | 4 | 1 | 0 | 2 | || MANAGER | 3 | 1 | 0 | 2 | || PRESIDENT | 1 | 1 | 0 | 2 | || SALESMAN | 4 | 1 | 0 | 2 | || | 14 | 1 | 1 | 3 | ++---++---+---+---+ (18 rows) {code} The doc need to be rectified thus to be consistent with query result and the behavior of Hive[1] and PostgreSQL[2] [1] [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction] [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] > Doc of SqlGroupingFunction contradicts with its behavior > > > Key: CALCITE-3950 > URL: https://issues.apache.org/jira/browse/CALCITE-3950 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > > Currently doc of SqlGroupingFunctions says: > {code:java} > /** > * The {@code GROUPING} function. > * > * Accepts 1 or more arguments. > * Example: {@code GROUPING(deptno, gender)} returns > * 3 if both deptno and gender are being grouped, > * 2 if only deptno is being grouped, > * 1 if only gender is being groped, > * 0 if neither deptno nor gender are being grouped.{code} > But its behavior in agg.iq is as below: > {code:java} > # GROUPING in SELECT clause of CUBE query > select deptno, job, count(*) as c, grouping(deptno) as d, > grouping(job) j, grouping(deptno, job) as x > from "scott".emp > g
[jira] [Created] (CALCITE-3950) Doc of SqlGroupingFunction contradicts with its behavior
Jin Xing created CALCITE-3950: - Summary: Doc of SqlGroupingFunction contradicts with its behavior Key: CALCITE-3950 URL: https://issues.apache.org/jira/browse/CALCITE-3950 Project: Calcite Issue Type: Bug Reporter: Jin Xing -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3946) Add parser support for MULTISET/SET and VOLATILE modifiers in CREATE TABLE statements
[ https://issues.apache.org/jira/browse/CALCITE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089321#comment-17089321 ] Jin Xing commented on CALCITE-3946: --- Shall we go by PostgreSQL grammar [1] ? For MULTISET/SET, we can use "table constraint"; For VOLATILE, we can use "temporary"; If we bring these modifiers in, besides the change in parser layer, we should also add additional functionalities. (e.g. drop the temporary table when end of user session and check duplication when insert values into a table marked as UNIQUE) . [1] [https://www.postgresql.org/docs/current/sql-createtable.html] > Add parser support for MULTISET/SET and VOLATILE modifiers in CREATE TABLE > statements > - > > Key: CALCITE-3946 > URL: https://issues.apache.org/jira/browse/CALCITE-3946 > Project: Calcite > Issue Type: Improvement > Components: babel >Affects Versions: 1.22.0 >Reporter: dasch >Priority: Major > > Add support to Calcite's Babel parser for MULTISET/SET and VOLATILE modifiers > in CREATE TABLE statements. > The syntax for these statements is: > CREATE TABLE [SET|MULTISET] [VOLATILE] [IF NOT EXISTS] > ( , ...); -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3928) Canonicalization doesn't do field trimming before materialized view matching
[ https://issues.apache.org/jira/browse/CALCITE-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089277#comment-17089277 ] Jin Xing commented on CALCITE-3928: --- [~xzh_dz] Thanks for PR ~ comments left. > Canonicalization doesn't do field trimming before materialized view matching > > > Key: CALCITE-3928 > URL: https://issues.apache.org/jira/browse/CALCITE-3928 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > Labels: materializedviews > Time Spent: 0.5h > Remaining Estimate: 0h > > If we have query and materialized view as below: > {code:java} > query: > LogicalAggregate(group=[{4}], agg#0=[COUNT($3)]) > LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], SAL=[$5], DEPTNO=[$7]) > LogicalTableScan(table=[[scott, EMP]]) > mv logic: > LogicalAggregate(group=[{1}], agg#0=[COUNT($0)]) > LogicalProject(SAL=[$5], DEPTNO=[$7]) > LogicalTableScan(table=[[scott, EMP]]) > {code} > The semantics of query and mv logic are the same. Materialized view matching > failed, because field trimming is not done when canonicalizing the plans. > Currently Calcite does field trimming when convert sql to rel. But my > company's internal system does materialization detection – – generates & > transforms & stores the RelNode. > Shall we add the field trimming when canonicalizing materialized view logic? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3928) Canonicalization doesn't do field trimming before materialized view matching
[ https://issues.apache.org/jira/browse/CALCITE-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3928: -- Description: If we have query and materialized view as below: {code:java} query: LogicalAggregate(group=[{4}], agg#0=[COUNT($3)]) LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], SAL=[$5], DEPTNO=[$7]) LogicalTableScan(table=[[scott, EMP]]) mv logic: LogicalAggregate(group=[{1}], agg#0=[COUNT($0)]) LogicalProject(SAL=[$5], DEPTNO=[$7]) LogicalTableScan(table=[[scott, EMP]]) {code} The semantics of query and mv logic are the same. Materialized view matching failed, because field trimming is not done when canonicalizing the plans. Currently Calcite does field trimming when convert sql to rel. But my company's internal system does materialization detection – – generates & transforms & stores the RelNode. Shall we add the field trimming when canonicalizing materialized view logic? was: If we have query and materialized view as below: {code:java} query: LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $1)]) LogicalProject(a=$0, b=[bfunc($1)]) LogicalTableScan(table=[[default, user_table]]) mv: LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $2)]) LogicalProject(a=$0, b=[bfunc($1)], c=[bfunc($1)]) LogicalTableScan(table=[[default, user_table]]) {code} The semantics of query and mv logic are the same. Materialized view matching failed, because field trimming is not done when canonicalizing the plans. Currently Calcite does field trimming when convert sql to rel. But my company's internal system does materialization detection – – generates & transforms & stores the RelNode. Shall we add the field trimming when canonicalizing materialized view logic? > Canonicalization doesn't do field trimming before materialized view matching > > > Key: CALCITE-3928 > URL: https://issues.apache.org/jira/browse/CALCITE-3928 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > Labels: materializedviews > > If we have query and materialized view as below: > {code:java} > query: > LogicalAggregate(group=[{4}], agg#0=[COUNT($3)]) > LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], SAL=[$5], DEPTNO=[$7]) > LogicalTableScan(table=[[scott, EMP]]) > mv logic: > LogicalAggregate(group=[{1}], agg#0=[COUNT($0)]) > LogicalProject(SAL=[$5], DEPTNO=[$7]) > LogicalTableScan(table=[[scott, EMP]]) > {code} > The semantics of query and mv logic are the same. Materialized view matching > failed, because field trimming is not done when canonicalizing the plans. > Currently Calcite does field trimming when convert sql to rel. But my > company's internal system does materialization detection – – generates & > transforms & stores the RelNode. > Shall we add the field trimming when canonicalizing materialized view logic? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3928) Canonicalization doesn't do field trimming before materialized view matching
[ https://issues.apache.org/jira/browse/CALCITE-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089249#comment-17089249 ] Jin Xing commented on CALCITE-3928: --- The case I mentioned is not accurate, [~xzh_dz] I updated your case into Jira description. > Canonicalization doesn't do field trimming before materialized view matching > > > Key: CALCITE-3928 > URL: https://issues.apache.org/jira/browse/CALCITE-3928 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > Labels: materializedviews > > If we have query and materialized view as below: > {code:java} > query: > LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $1)]) > LogicalProject(a=$0, b=[bfunc($1)]) > LogicalTableScan(table=[[default, user_table]]) > mv: > LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $2)]) > LogicalProject(a=$0, b=[bfunc($1)], c=[bfunc($1)]) > LogicalTableScan(table=[[default, user_table]]) > {code} > The semantics of query and mv logic are the same. Materialized view matching > failed, because field trimming is not done when canonicalizing the plans. > Currently Calcite does field trimming when convert sql to rel. But my > company's internal system does materialization detection – – generates & > transforms & stores the RelNode. > Shall we add the field trimming when canonicalizing materialized view logic? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3925) Support materialized view matching for LogicalCorrelate in SubsitutionVisitor
[ https://issues.apache.org/jira/browse/CALCITE-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088631#comment-17088631 ] Jin Xing commented on CALCITE-3925: --- [~xzh_dz] Thanks a lot for PR, comments left ~ > Support materialized view matching for LogicalCorrelate in SubsitutionVisitor > - > > Key: CALCITE-3925 > URL: https://issues.apache.org/jira/browse/CALCITE-3925 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > LogicalCorrelate is widely used in user SQLs to performs nested-loop join. > We need to support substitution based materialized view matching for it. The > matching algorithm should be able to do Project & Filter compensation when > matching query with materialized view. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (CALCITE-3447) Fix equivalents in method SubstitutionVisitor#go
[ https://issues.apache.org/jira/browse/CALCITE-3447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing reassigned CALCITE-3447: - Assignee: Jin Xing > Fix equivalents in method SubstitutionVisitor#go > > > Key: CALCITE-3447 > URL: https://issues.apache.org/jira/browse/CALCITE-3447 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: daimin >Assignee: Jin Xing >Priority: Minor > Labels: pull-request-available > Fix For: 1.23.0 > > Time Spent: 7h > Remaining Estimate: 0h > > Code segment here depends on `hashcode` and `equals` methods of class > `MutableRel`: > [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java#L492-L502] > > However the implementations of class `MutableScan` delegates to class > `TableScan`, which directly relies on implementations of class `Object`. This > leads to a situation that two `MutableScan` on the exactly same table will > not be considered as equivalent. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3938) implement visit LogicalCalc in RelShuttleImpl
[ https://issues.apache.org/jira/browse/CALCITE-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17086999#comment-17086999 ] Jin Xing commented on CALCITE-3938: --- How about rename the title as "Support LogicalCalc in RelShuttle" just as CALCITE-3607 ? > implement visit LogicalCalc in RelShuttleImpl > - > > Key: CALCITE-3938 > URL: https://issues.apache.org/jira/browse/CALCITE-3938 > Project: Calcite > Issue Type: Wish >Reporter: xzh_dz >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > visit LogicalCalc in RelShuttleImpl -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3936) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17086382#comment-17086382 ] Jin Xing commented on CALCITE-3936: --- Add a simpler test case: {code:java} // Sql: isHaving = false SELECT DEPTNO, SUM(SAL) SAL FROM EMP GROUP BY DEPTNO HAVING SUM(SAL) > 100 // Plan: LogicalFilter(condition=[>($1, 100)]) LogicalAggregate(group=[{0}], SAL=[SUM($1)]) LogicalProject(DEPTNO=[$7], SAL=[$5]) JdbcTableScan(table=[[JDBC_SCOTT, EMP]]) // Converted: isHaving = true SELECT DEPTNO, SUM(SAL) AS SAL FROM SCOTT.EMP GROUP BY DEPTNO HAVING SUM(SAL) > 100{code} [1] Calcite doesn't check if the filtering clause conflicts with the alias in Aggregate and convert the filtering condition directly, that's where the bug comes from; We need to add a checking logic when converting a Filter/Aggregate pattern. BTW, [~swtalbot], in the Jira title, do you mean "... for dialects with SqlConformance.isHavingAlias=true" ? I think the issue happen when parse the sql with config (isHaving=false) and convert rel with config (isHaving=true). [1] [https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rel2sql/RelToSqlConverter.java#L315] > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3936 > URL: https://issues.apache.org/jira/browse/CALCITE-3936 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Priority: Major > > ... for dialects with SqlConformance.isHavingAlias=false > Very, very similar to -CALCITE-3593.- > Reproducing test case in RelToSqlConverter: > {code:java} > @Test public void testHavingAlias2() { > final String query = "select \"product_id\" + 1, sum(\"gross_weight\") as > gross_weight\n" + > " from \"product\"\n" + > " group by \"product_id\"\n" + > " having sum(\"product\".\"gross_weight\") < 200"; > final String expected = "SELECT product_id + 1, GROSS_WEIGHT\n" + > "FROM (SELECT product_id, SUM(gross_weight) AS GROSS_WEIGHT\n" + > "FROM foodmart.product\n" + > "GROUP BY product_id\n" + > "HAVING SUM(product.gross_weight) < 200) AS t1" > // (or) "HAVING gross_weight < 200) AS t1" > // (or) ") AS t1\nWHERE t1.gross_weight < 200) AS t1" > // INSTEAD, we get "HAVING SUM(gross_weight) < 200) AS t1" > // which on BigQuery gives you an error about aggregating aggregates > ; > sql(query).withBigQuery().ok(expected); > } > {code} > In that one, the pattern was Project/Filter/Aggregate, here it is > Filter/Aggregate/Project. In 3593, the project created a new alias, which got > added to the same SELECT clause and caused the ambiguity. Here, the aggregate > creates an alias, but the filter will write a HAVING clause using the aliases > from before the Aggregate, and that will cause the SQL engine to think that > the filter is on the aggregate field, rather than on the underlying field. > Note that this is less an absurdly unlikely occurrence than it might seem > because when Calcite's default aliasing kicks in and everything gets the name > "$f6", "$f4", etc, so chances of a collision are higher if you have multiply > nested selects with default aliases. > Potential fixes: > # force a subselect, as was done for 3593. > # Force the expression in the HAVING to be fully aliased by table (works at > least in BigQuery, where I tested) > # Write the HAVING expression in terms of the aliases from the aggregate, > rather than what's coming from the aggregate (also works on BigQuery) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3935) Materialization-Failed, when querying with LeftJoinWithFilter
[ https://issues.apache.org/jira/browse/CALCITE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085707#comment-17085707 ] Jin Xing edited comment on CALCITE-3935 at 4/17/20, 12:45 PM: -- Thanks ! I really appreciate your fix ~ I can take a review when it's ready :D was (Author: jinxing6...@126.com): Thanks ! I really appreciate your fix ~ I can take a review when it's ready :D - Jin > Materialization-Failed, when querying with LeftJoinWithFilter > - > > Key: CALCITE-3935 > URL: https://issues.apache.org/jira/browse/CALCITE-3935 > Project: Calcite > Issue Type: Bug >Reporter: Xurenhe >Priority: Major > Attachments: Jietu20200417-200532.png > > > {code:java} > @Test public void testJoinOnLeftProjectWithFilterToJoin() { > String mv = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 10"; > String query = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 40"; > sql(mv, query).withOnlyBySubstitution(true).ok(); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3935) Materialization-Failed, when querying with LeftJoinWithFilter
[ https://issues.apache.org/jira/browse/CALCITE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085707#comment-17085707 ] Jin Xing commented on CALCITE-3935: --- Thanks ! I really appreciate your fix ~ I can take a review when it's ready :D - Jin > Materialization-Failed, when querying with LeftJoinWithFilter > - > > Key: CALCITE-3935 > URL: https://issues.apache.org/jira/browse/CALCITE-3935 > Project: Calcite > Issue Type: Bug >Reporter: Xurenhe >Priority: Major > Attachments: Jietu20200417-200532.png > > > {code:java} > @Test public void testJoinOnLeftProjectWithFilterToJoin() { > String mv = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 10"; > String query = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 40"; > sql(mv, query).withOnlyBySubstitution(true).ok(); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3935) Materialization-Failed, when querying with LeftJoinWithFilter
[ https://issues.apache.org/jira/browse/CALCITE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085701#comment-17085701 ] Jin Xing edited comment on CALCITE-3935 at 4/17/20, 12:37 PM: -- Thanks for reporting ! Seems the check in [1] is too strict, we should allow compensating Calc from the left child node when LEFT JOIN in JoinOnLeftCalcToJoinUnifyRule. BTW I think the issue can also happen in JoinOnRightCalcToJoinUnifyRule [[1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java#L1204|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java#L1204] was (Author: jinxing6...@126.com): Seems the check in [1] is too strict, we should allow compensating Calc from the left child node when LEFT JOIN in JoinOnLeftCalcToJoinUnifyRule. BTW I think the issue can also happen in JoinOnRightCalcToJoinUnifyRule [[1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java#L1204|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java#L1204] > Materialization-Failed, when querying with LeftJoinWithFilter > - > > Key: CALCITE-3935 > URL: https://issues.apache.org/jira/browse/CALCITE-3935 > Project: Calcite > Issue Type: Bug >Reporter: Xurenhe >Priority: Major > Attachments: Jietu20200417-200532.png > > > {code:java} > @Test public void testJoinOnLeftProjectWithFilterToJoin() { > String mv = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 10"; > String query = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 40"; > sql(mv, query).withOnlyBySubstitution(true).ok(); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3935) Materialization-Failed, when querying with LeftJoinWithFilter
[ https://issues.apache.org/jira/browse/CALCITE-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17085701#comment-17085701 ] Jin Xing commented on CALCITE-3935: --- Seems the check in [1] is too strict, we should allow compensating Calc from the left child node when LEFT JOIN in JoinOnLeftCalcToJoinUnifyRule. BTW I think the issue can also happen in JoinOnRightCalcToJoinUnifyRule [[1] https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java#L1204|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java#L1204] > Materialization-Failed, when querying with LeftJoinWithFilter > - > > Key: CALCITE-3935 > URL: https://issues.apache.org/jira/browse/CALCITE-3935 > Project: Calcite > Issue Type: Bug >Reporter: Xurenhe >Priority: Major > Attachments: Jietu20200417-200532.png > > > {code:java} > @Test public void testJoinOnLeftProjectWithFilterToJoin() { > String mv = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 10"; > String query = "" > + "select \"emps\".\"empid\", \"depts\".\"name\", > \"emps\".\"salary\"\n" + > "from \"emps\"\n" + > "left join \"depts\"\n" + > "on \"emps\".\"deptno\" = \"depts\".\"deptno\"\n" > + "where \"emps\".\"empid\" > 40"; > sql(mv, query).withOnlyBySubstitution(true).ok(); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3928) Canonicalization doesn't do field trimming before materialized view matching
[ https://issues.apache.org/jira/browse/CALCITE-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3928: -- Summary: Canonicalization doesn't do field trimming before materialized view matching (was: Canonicalization doesn't do field trimming before materialized matching) > Canonicalization doesn't do field trimming before materialized view matching > > > Key: CALCITE-3928 > URL: https://issues.apache.org/jira/browse/CALCITE-3928 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > > If we have query and materialized view as below: > {code:java} > query: > LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $1)]) > LogicalProject(a=$0, b=[bfunc($1)]) > LogicalTableScan(table=[[default, user_table]]) > mv: > LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $2)]) > LogicalProject(a=$0, b=[bfunc($1)], c=[bfunc($1)]) > LogicalTableScan(table=[[default, user_table]]) > {code} > The semantics of query and mv logic are the same. Materialized view matching > failed, because field trimming is not done when canonicalizing the plans. > Currently Calcite does field trimming when convert sql to rel. But my > company's internal system does materialization detection – – generates & > transforms & stores the RelNode. > Shall we add the field trimming when canonicalizing materialized view logic? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3928) Canonicalization doesn't do field trimming before materialized matching
[ https://issues.apache.org/jira/browse/CALCITE-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3928: -- Description: If we have query and materialized view as below: {code:java} query: LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $1)]) LogicalProject(a=$0, b=[bfunc($1)]) LogicalTableScan(table=[[default, user_table]]) mv: LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $2)]) LogicalProject(a=$0, b=[bfunc($1)], c=[bfunc($1)]) LogicalTableScan(table=[[default, user_table]]) {code} The semantics of query and mv logic are the same. Materialized view matching failed, because field trimming is not done when canonicalizing the plans. Currently Calcite does field trimming when convert sql to rel. But my company's internal system does materialization detection – – generates & transforms & stores the RelNode. Shall we add the field trimming when canonicalizing materialized view logic? > Canonicalization doesn't do field trimming before materialized matching > --- > > Key: CALCITE-3928 > URL: https://issues.apache.org/jira/browse/CALCITE-3928 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > > If we have query and materialized view as below: > {code:java} > query: > LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $1)]) > LogicalProject(a=$0, b=[bfunc($1)]) > LogicalTableScan(table=[[default, user_table]]) > mv: > LogicalAggregate(group=[{0}], EXPR$1=[afunc($1, $2)]) > LogicalProject(a=$0, b=[bfunc($1)], c=[bfunc($1)]) > LogicalTableScan(table=[[default, user_table]]) > {code} > The semantics of query and mv logic are the same. Materialized view matching > failed, because field trimming is not done when canonicalizing the plans. > Currently Calcite does field trimming when convert sql to rel. But my > company's internal system does materialization detection – – generates & > transforms & stores the RelNode. > Shall we add the field trimming when canonicalizing materialized view logic? > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3928) Canonicalization doesn't do field trimming before materialized matching
Jin Xing created CALCITE-3928: - Summary: Canonicalization doesn't do field trimming before materialized matching Key: CALCITE-3928 URL: https://issues.apache.org/jira/browse/CALCITE-3928 Project: Calcite Issue Type: Bug Reporter: Jin Xing -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3925) Support materialized view matching for LogicalCorrelate in SubsitutionVisitor
[ https://issues.apache.org/jira/browse/CALCITE-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084064#comment-17084064 ] Jin Xing commented on CALCITE-3925: --- Sure ~ You can refer to matching implementation for Join when working on this one. > Support materialized view matching for LogicalCorrelate in SubsitutionVisitor > - > > Key: CALCITE-3925 > URL: https://issues.apache.org/jira/browse/CALCITE-3925 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > > LogicalCorrelate is widely used in user SQLs to performs nested-loop join. > We need to support substitution based materialized view matching for it. The > matching algorithm should be able to do Project & Filter compensation when > matching query with materialized view. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3758) FilterTableScanRule generate wrong mapping for filter condition when underlying is BindableTableScan
[ https://issues.apache.org/jira/browse/CALCITE-3758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3758: -- Description: When FilterTableScanRule push filter condition into ProjectableFilterableTable, the filter condition should be adjusted to be based on the internal table of BindableTableScan. (was: When FilterTableScanRule push filter condition into ProjectableFilterableTable, the filter condition should be adjusted to be based on the internal table of BindableTableScan. This line[1] is rectified as *Mappings.source* [1][https://github.com/apache/calcite/blob/571731b80a58eb095ebac7123285c375e7afff90/core/src/main/java/org/apache/calcite/rel/rules/FilterTableScanRule.java#L124]) > FilterTableScanRule generate wrong mapping for filter condition when > underlying is BindableTableScan > > > Key: CALCITE-3758 > URL: https://issues.apache.org/jira/browse/CALCITE-3758 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > When FilterTableScanRule push filter condition into > ProjectableFilterableTable, the filter condition should be adjusted to be > based on the internal table of BindableTableScan. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3925) Support materialized view matching for LogicalCorrelate in SubsitutionVisitor
[ https://issues.apache.org/jira/browse/CALCITE-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3925: -- Description: LogicalCorrelate is widely used in user SQLs to performs nested-loop join. We need to support substitution based materialized view matching for it. The matching algorithm should be able to do Project & Filter compensation when matching query with materialized view. (was: LogicalCorrelate is widely used in user SQLs to performs nested-loop join. We need to support substitution based materialized view matching for it. The matching algorithm should support Project & Filter compensation when matching query with materialized view.) > Support materialized view matching for LogicalCorrelate in SubsitutionVisitor > - > > Key: CALCITE-3925 > URL: https://issues.apache.org/jira/browse/CALCITE-3925 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > > LogicalCorrelate is widely used in user SQLs to performs nested-loop join. > We need to support substitution based materialized view matching for it. The > matching algorithm should be able to do Project & Filter compensation when > matching query with materialized view. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3925) Support materialized view matching for LogicalCorrelate in SubsitutionVisitor
[ https://issues.apache.org/jira/browse/CALCITE-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3925: -- Description: LogicalCorrelate is widely used in user SQLs to performs nested-loop join. We need to support substitution based materialized view matching for it. The matching algorithm should support Project & Filter compensation when matching query with materialized view. > Support materialized view matching for LogicalCorrelate in SubsitutionVisitor > - > > Key: CALCITE-3925 > URL: https://issues.apache.org/jira/browse/CALCITE-3925 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Priority: Major > > LogicalCorrelate is widely used in user SQLs to performs nested-loop join. > We need to support substitution based materialized view matching for it. The > matching algorithm should support Project & Filter compensation when matching > query with materialized view. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3925) Support materialized view matching for LogicalCorrelate in SubsitutionVisitor
Jin Xing created CALCITE-3925: - Summary: Support materialized view matching for LogicalCorrelate in SubsitutionVisitor Key: CALCITE-3925 URL: https://issues.apache.org/jira/browse/CALCITE-3925 Project: Calcite Issue Type: Bug Reporter: Jin Xing -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030352#comment-17030352 ] Jin Xing edited comment on CALCITE-3716 at 2/6/20 1:49 AM: --- Thanks a lot Julian ~ I also want mention that the behavior of getColumnName is different between Sql systems; When I have below table and run sql by JDBC connect: {code:java} Table: test (col0 int) select a x, 1 y from test{code} ColumnName and ColumnLabel returned as below: {code:java} Mysql: ResultSet#getMetadata().getColumnName(1): "a" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y" Postgre: ResultSet#getMetadata().getColumnName(1): "x" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y"{code} I didn't find which one is the standard behavior from JDBC spec; the behavior of Calcite is consistent with Mysql now; was (Author: jinxing6...@126.com): Thanks a lot Julian ~ I also want mention that the behavior of getColumnName is different between Sql systems; When I have below table and run sql by JDBC connect: {code:java} Table: test (col0 int) select a x, 1 y from test{code} ColumnName and ColumnLabel returned as below: {code:java} Mysql: ResultSet#getMetadata().getColumnName(1): "a" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y" Postgre: ResultSet#getMetadata().getColumnName(1): "x" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y"{code} I didn't find which one is standard behavior from JDBC spec; Behavior of Calcite is consistent with Mysql; > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030352#comment-17030352 ] Jin Xing edited comment on CALCITE-3716 at 2/5/20 4:37 AM: --- Thanks a lot Julian ~ I also want mention that the behavior of getColumnName is different between Sql systems; When I have below table and run sql by JDBC connect: {code:java} Table: test (col0 int) select a x, 1 y from test{code} ColumnName and ColumnLabel returned as below: {code:java} Mysql: ResultSet#getMetadata().getColumnName(1): "a" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y" Postgre: ResultSet#getMetadata().getColumnName(1): "x" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y"{code} I didn't find which one is standard behavior from JDBC spec; Behavior of Calcite is consistent with Mysql; was (Author: jinxing6...@126.com): Thanks a lot Julian ~ I also want mention that the behavior of getColumnName is different between Sql systems; When I have below table and run sql by JDBC connect: {code:java} Table: test (col0 int) select a x, 1 y from test{code} ColumnName and ColumnLabel returned as below: {code:java} Mysql: ResultSet#getMetadata().getColumnName(1): "a" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y" Postgre: ResultSet#getMetadata().getColumnName(1): "x" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y"{code} I didn't find which one is standard behavior from JDBC spec; Behavior of Calcite is consistent with Mysql; > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030352#comment-17030352 ] Jin Xing commented on CALCITE-3716: --- Thanks a lot Julian ~ I also want mention that the behavior of getColumnName is different between Sql systems; When I have below table and run sql by JDBC connect: {code:java} Table: test (col0 int) select a x, 1 y from test{code} ColumnName and ColumnLabel returned as below: {code:java} Mysql: ResultSet#getMetadata().getColumnName(1): "a" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y" Postgre: ResultSet#getMetadata().getColumnName(1): "x" ResultSet#getMetadata().getColumnLabel(1): "x" ResultSet#getMetadata().getColumnName(1): "y" ResultSet#getMetadata().getColumnLabel(1): "y"{code} I didn't find which one is standard behavior from JDBC spec; Behavior of Calcite is consistent with Mysql; > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3729) Filters failed to be pushed down when it's identical to join condition.
[ https://issues.apache.org/jira/browse/CALCITE-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17028284#comment-17028284 ] Jin Xing commented on CALCITE-3729: --- Thanks a lot for merging [~vladimirsitnikov] ! > Filters failed to be pushed down when it's identical to join condition. > --- > > Key: CALCITE-3729 > URL: https://issues.apache.org/jira/browse/CALCITE-3729 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Fix For: 1.22.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > FilterJoinRule optimize below sql as > {code:java} > select * from sales.dept d inner join sales.emp e > on d.deptno = e.deptno and d.deptno > e.mgr > where d.deptno > e.mgr > LogicalProject(DEPTNO=[$0], NAME=[$1], EMPNO=[$2], ENAME=[$3], JOB=[$4], > MGR=[$5], HIREDATE=[$6], SAL=[$7], COMM=[$8], DEPTNO0=[$9], SLACKER=[$10]) > LogicalFilter(condition=[>($0, $5)]) > LogicalJoin(condition=[AND(=($0, $9), >($0, $5))], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > > The outer Filter failed to be pushed down into Join and removed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (CALCITE-3763) RelBuilder.aggregate should prune unused fields from the input, if the input is a Project
[ https://issues.apache.org/jira/browse/CALCITE-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing reassigned CALCITE-3763: - Assignee: Jin Xing > RelBuilder.aggregate should prune unused fields from the input, if the input > is a Project > - > > Key: CALCITE-3763 > URL: https://issues.apache.org/jira/browse/CALCITE-3763 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > > {{RelBuilder.aggregate}} should prune unused fields from the input, if the > input is a {{Project}}. > Pruning fields during the planning process is desirable, but often cannot do > it - we are applying a {{RelOptRule}} that has to return the same fields, or > we don't want to add an extra Project do so the pruning. But when we are in > {{RelBuilder.aggregate}} and the input is a Project, neither of those > limitations apply. We already have a Project, we are just making it narrower; > and we know what fields the {{Aggregate}} will produce. > For example, > {code:sql} > SELECT deptno, SUM(sal) FILTER (WHERE b) > FROM ( > SELECT deptno, empno + 10, sal, job = 'CLERK' AS b > FROM emp) > GROUP BY deptno > {code} > becomes > {code:sql} > SELECT deptno, SUM(sal) FILTER (WHERE b) > FROM ( > SELECT deptno, sal, job = 'CLERK' AS b > FROM emp) > GROUP BY deptno > {code} > If there are no fields used, remove the {{Project}}. (A {{RelNode}} with no > fields is not allowed.) > {code:sql} > SELECT COUNT(*) AS C > FROM ( > SELECT deptno, empno + 10, sal, job = 'CLERK' AS b > FROM emp) > {code} > becomes > {code:sql} > SELECT COUNT(*) AS c > FROM emp > {code} > Add an option {{RelBuilder.Config.pruneInputOfAggregate}}, default true, so > that people can disable this rewrite if it causes problems. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3760) Rewriting non-deterministic function can break query semantics
[ https://issues.apache.org/jira/browse/CALCITE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027518#comment-17027518 ] Jin Xing commented on CALCITE-3760: --- {quote}As we discussed recently, it would be illegal to merge those Projects because of the UDF. {quote} Yes, in addition to rewriting SqlNode, determinism of operator also affects plan optimization. I think that's CALCITE-2348 try to fix. Firing certain rules on non-deterministic operators as normal might fail to guarantee plan equivalence. > Rewriting non-deterministic function can break query semantics > -- > > Key: CALCITE-3760 > URL: https://issues.apache.org/jira/browse/CALCITE-3760 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Calcite rewrite some *SqlFunctions* during validation. But whether the > function is deterministic is not considered. For a non-deterministic > operator, the rewriting can break semantics. Additionally there's no > interface for user to specify the determinism for a UDF/UDAF. > Say I have non-deterministic UDF & UDAF and run sql like below > {code:java} > select coalesce(udf(col0), 100) from foo; > select nullif(udaf(col0), 1024) from foo;{code} > They will be rewritten as > {code:java} > select case when udf(col0) is not null then udf(col0) else 100 end > from foo; > select case when udaf(col0)=1024 then null udaf(col0) > from foo{code} > As we can see that non-deterministic UDF & UDAF are called multiple times > after written. Thus the condition in WHEN clause might NOT be held all the > time. > We need to provide an interface for user to specify the determinism in > UDF/UDAF and consider whether a SqlNode is deterministic when rewriting. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3760) Rewriting non-deterministic function can break query semantics
[ https://issues.apache.org/jira/browse/CALCITE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027507#comment-17027507 ] Jin Xing edited comment on CALCITE-3760 at 1/31/20 1:50 PM: Thanks a lot for your kind help [~julianhyde] ~ Actually in my production, we bypass this issue by asking our user to follow a contract – – non-deterministic UDF/UDAF should always called independently and not allowed to be nested in another SqlCall. We have legal and illegal cases like below {code:java} -- legal select coalesce(udf_col, 100) from (select udf(col) udf_col from foo) -- illegal select coalesce(udf(col), 100) from foo{code} {quote}So, maybe the best way is to use a Project on a Project {quote} I understand this comment as to add a Project for the non-deterministic UDF/UDAF, thus to guarantee the number of times to be evaluated. I think it will work correctly and keep the query semantics. But in current implementation, the rewriting of SqlNode works only for SqlCall[1] during SqlValidatorImpl#performUnconditionalRewrites. If a SqlFunction wants to customize rewriting logic, it just defines how itself is transformed – – no need to care where it locates. But If we want to rewrite and add the Project, the customizing rewriting logic will need to touch the outside SqlSelect, which I think will complicate the code. Additionally, SqlValidatorImpl#performUnconditionalRewrites runs before nodes are expanded and fully resolved. The rewriting might be hard. Think about below example: {code:java} select *, coalesce(udf(col), 100) from foo {code} If we simply rewrite it as {code:java} select *, case when x is not null then x else 100 end from ( select *, udf(c) from foo) {code} The semantics will be changed. I still propose to don't do rewriting when found non-deterministic, and for deterministic operators the behavior is as before. AFAIK, the reason Calcite rewrite functions like COALESCE and NULLIF is for better and simpler implementation (NULLIF doesn't have a self implementation yet.). Is there other reasons that I missed ? Thanks again for your help ! [1][https://github.com/apache/calcite/blob/c416c31fc376868bdd672afd84ec06dc75d56575/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L316] was (Author: jinxing6...@126.com): Thanks a lot for your kind help [~julianhyde] ~ Actually in my production, we bypass this issue by asking our user to follow a contract – – non-deterministic UDF/UDAF should always called independently and not allowed to be nested in another SqlCall. We have legal and illegal cases like below {code:java} -- legal select coalesce(udf_col, 100) from (select udf(col) udf_col from foo) -- illegal select coalesce(udf(col), 100) from foo{code} {quote}So, maybe the best way is to use a Project on a Project {quote} I understand this comment as to add a Project for the non-deterministic UDF/UDAF, thus to guarantee the number of times to be evaluated. I think it will work correctly and keep the query semantics. But in current implementation, the rewriting of SqlNode works only for SqlCall[1] during SqlValidatorImpl#performUnconditionalRewrites. If a SqlFunction wants to customize rewriting logic, it just defines how itself is transformed – – no need to care where it locates. But If we want to rewrite and add the Project, the customizing rewriting logic will need to touch the outside SqlSelect, which I think will complicate the code. Additionally, SqlValidatorImpl#performUnconditionalRewrites runs before nodes are expanded and fully resolved. The rewriting might be hard. Think about below example: {code:java} select *, coalesce(udf(col), 100) from foo {code} If we simply rewrite it as {code:java} select *, case when x is not null then x else 100 end from ( select *, udf(c) from foo) {code} The semantics will be changed. I still propose to don't do rewriting when found non-deterministic, and for deterministic operators the behavior is as before. AFAIK, the reason Calcite rewrite functions like COALESCE and NULLIF is for better and simpler implementation (NULLIF doesn't have a self implementation yet.). Is there other reasons that I missed ? [1][https://github.com/apache/calcite/blob/c416c31fc376868bdd672afd84ec06dc75d56575/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L316] > Rewriting non-deterministic function can break query semantics > -- > > Key: CALCITE-3760 > URL: https://issues.apache.org/jira/browse/CALCITE-3760 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Calcit
[jira] [Commented] (CALCITE-3760) Rewriting non-deterministic function can break query semantics
[ https://issues.apache.org/jira/browse/CALCITE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027507#comment-17027507 ] Jin Xing commented on CALCITE-3760: --- Thanks a lot for your kind help [~julianhyde] ~ Actually in my production, we bypass this issue by asking our user to follow a contract – – non-deterministic UDF/UDAF should always called independently and not allowed to be nested in another SqlCall. We have legal and illegal cases like below {code:java} -- legal select coalesce(udf_col, 100) from (select udf(col) udf_col from foo) -- illegal select coalesce(udf(col), 100) from foo{code} {quote}So, maybe the best way is to use a Project on a Project {quote} I understand this comment as to add a Project for the non-deterministic UDF/UDAF, thus to guarantee the number of times to be evaluated. I think it will work correctly and keep the query semantics. But in current implementation, the rewriting of SqlNode works only for SqlCall[1] during SqlValidatorImpl#performUnconditionalRewrites. If a SqlFunction wants to customize rewriting logic, it just defines how itself is transformed – – no need to care where it locates. But If we want to rewrite and add the Project, the customizing rewriting logic will need to touch the outside SqlSelect, which I think will complicate the code. Additionally, SqlValidatorImpl#performUnconditionalRewrites runs before nodes are expanded and fully resolved. The rewriting might be hard. Think about below example: {code:java} select *, coalesce(udf(col), 100) from foo {code} If we simply rewrite it as {code:java} select *, case when x is not null then x else 100 end from ( select *, udf(c) from foo) {code} The semantics will be changed. I still propose to don't do rewriting when found non-deterministic, and for deterministic operators the behavior is as before. AFAIK, the reason Calcite rewrite functions like COALESCE and NULLIF is for better and simpler implementation (NULLIF doesn't have a self implementation yet.). Is there other reasons that I missed ? [1][https://github.com/apache/calcite/blob/c416c31fc376868bdd672afd84ec06dc75d56575/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L316] > Rewriting non-deterministic function can break query semantics > -- > > Key: CALCITE-3760 > URL: https://issues.apache.org/jira/browse/CALCITE-3760 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Calcite rewrite some *SqlFunctions* during validation. But whether the > function is deterministic is not considered. For a non-deterministic > operator, the rewriting can break semantics. Additionally there's no > interface for user to specify the determinism for a UDF/UDAF. > Say I have non-deterministic UDF & UDAF and run sql like below > {code:java} > select coalesce(udf(col0), 100) from foo; > select nullif(udaf(col0), 1024) from foo;{code} > They will be rewritten as > {code:java} > select case when udf(col0) is not null then udf(col0) else 100 end > from foo; > select case when udaf(col0)=1024 then null udaf(col0) > from foo{code} > As we can see that non-deterministic UDF & UDAF are called multiple times > after written. Thus the condition in WHEN clause might NOT be held all the > time. > We need to provide an interface for user to specify the determinism in > UDF/UDAF and consider whether a SqlNode is deterministic when rewriting. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027285#comment-17027285 ] Jin Xing edited comment on CALCITE-3716 at 1/31/20 8:45 AM: {color:#172b4d}Thanks a lot [~julianhyde] for review ~{color} {quote}Does {{getColumnName}} need to be fixed also? {quote} * {color:#172b4d}I didn't find from {color}[JDBC spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getColumnName-int-]{color:#172b4d} specifies that {color}*columnName*{color:#172b4d} should be empty string rather than null when not applicable.{color} * Constructor of ColumnMetaDatas[1] sets *columnName* by *label* (field name in row type) if the param passed in is null. I updated and replaced _assertEquals_ with _assertThat_ [1][https://github.com/apache/calcite-avatica/blob/dd65a2b18b8c35cfccf1c47b6be87ea7db3ad658/core/src/main/java/org/apache/calcite/avatica/ColumnMetaData.java#L122] was (Author: jinxing6...@126.com): {color:#172b4d}Thanks a lot [~julianhyde] for review ~{color} * {color:#172b4d}I didn't find from {color}[JDBC spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getColumnName-int-]{color:#172b4d} specifies that {color}*columnName*{color:#172b4d} should be empty string rather than null when not applicable.{color} * Constructor of ColumnMetaDatas[1] sets *columnName* by *label* (field name in row type) if the param passed in is null. * I updated and replaced _assertEquals_ with _assertThat_ [1][https://github.com/apache/calcite-avatica/blob/dd65a2b18b8c35cfccf1c47b6be87ea7db3ad658/core/src/main/java/org/apache/calcite/avatica/ColumnMetaData.java#L122] > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027285#comment-17027285 ] Jin Xing edited comment on CALCITE-3716 at 1/31/20 8:39 AM: {color:#172b4d}Thanks a lot [~julianhyde] for review ~{color} * {color:#172b4d}I didn't find from {color}[JDBC spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getColumnName-int-]{color:#172b4d} specifies that {color}*columnName*{color:#172b4d} should be empty string rather than null when not applicable.{color} * Constructor of ColumnMetaDatas[1] sets *columnName* by *label* if the param passed in is null. * I updated and replaced _assertEquals_ with _assertThat_ [1][https://github.com/apache/calcite-avatica/blob/dd65a2b18b8c35cfccf1c47b6be87ea7db3ad658/core/src/main/java/org/apache/calcite/avatica/ColumnMetaData.java#L122] was (Author: jinxing6...@126.com): {color:#172b4d}Thanks a lot [~julianhyde] for review ~{color} * {color:#172b4d}I didn't find from {color}[JDBC spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getColumnName-int-]{color:#172b4d} specifies that {color}*columnName*{color:#172b4d} should be empty string rather than null.{color} * Constructor of ColumnMetaDatas[1] sets *columnName* by *label* if the param passed in is null. * I updated and replaced _assertEquals_ with _assertThat_ [1][https://github.com/apache/calcite-avatica/blob/dd65a2b18b8c35cfccf1c47b6be87ea7db3ad658/core/src/main/java/org/apache/calcite/avatica/ColumnMetaData.java#L122] > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027285#comment-17027285 ] Jin Xing edited comment on CALCITE-3716 at 1/31/20 8:39 AM: {color:#172b4d}Thanks a lot [~julianhyde] for review ~{color} * {color:#172b4d}I didn't find from {color}[JDBC spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getColumnName-int-]{color:#172b4d} specifies that {color}*columnName*{color:#172b4d} should be empty string rather than null when not applicable.{color} * Constructor of ColumnMetaDatas[1] sets *columnName* by *label* (field name in row type) if the param passed in is null. * I updated and replaced _assertEquals_ with _assertThat_ [1][https://github.com/apache/calcite-avatica/blob/dd65a2b18b8c35cfccf1c47b6be87ea7db3ad658/core/src/main/java/org/apache/calcite/avatica/ColumnMetaData.java#L122] was (Author: jinxing6...@126.com): {color:#172b4d}Thanks a lot [~julianhyde] for review ~{color} * {color:#172b4d}I didn't find from {color}[JDBC spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getColumnName-int-]{color:#172b4d} specifies that {color}*columnName*{color:#172b4d} should be empty string rather than null when not applicable.{color} * Constructor of ColumnMetaDatas[1] sets *columnName* by *label* if the param passed in is null. * I updated and replaced _assertEquals_ with _assertThat_ [1][https://github.com/apache/calcite-avatica/blob/dd65a2b18b8c35cfccf1c47b6be87ea7db3ad658/core/src/main/java/org/apache/calcite/avatica/ColumnMetaData.java#L122] > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027285#comment-17027285 ] Jin Xing commented on CALCITE-3716: --- {color:#172b4d}Thanks a lot [~julianhyde] for review ~{color} * {color:#172b4d}I didn't find from {color}[JDBC spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getColumnName-int-]{color:#172b4d} specifies that {color}*columnName*{color:#172b4d} should be empty string rather than null.{color} * Constructor of ColumnMetaDatas[1] sets *columnName* by *label* if the param passed in is null. * I updated and replaced _assertEquals_ with _assertThat_ [1][https://github.com/apache/calcite-avatica/blob/dd65a2b18b8c35cfccf1c47b6be87ea7db3ad658/core/src/main/java/org/apache/calcite/avatica/ColumnMetaData.java#L122] > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3760) Rewriting non-deterministic function can break query semantics
[ https://issues.apache.org/jira/browse/CALCITE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027196#comment-17027196 ] Jin Xing edited comment on CALCITE-3760 at 1/31/20 5:06 AM: Hi, [~julianhyde] [~amaliujia] Thanks a lot for feedback ~ Yes, a *LET* clause would be very helpful. It allows us to store the result of a sub-expression, e.g. result generated from a non-deterministic udf/udaf, and use it in subsequent clauses. Thus to ensure non-deterministic expressions are evaluated the consistent number of times. It's already supported by some vendors [1]. But I would prefer the rewriting within scope of common and standard sql, a common scenario is we always want to convert expression back to Sql string and run in jdbc convention. A non-common clause might bring obstacle to run the sql in other dialects. So I propose to don't do the rewrites when found non-deterministic. {quote}Related issues are re-ordering of the branches of AND and OR conditions, and behavior when an expression throws. {quote} Currently RexSimplify already takes determinism of expression into consideration (there might be space to improve). A missed part is to add an interface for udf/udaf to specify whether it's deterministic. [1]https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/let.html was (Author: jinxing6...@126.com): Hi, [~julianhyde] [~amaliujia] Thanks a lot for feedback ~ Yes, a LET clause would be very helpful. It allows us to store the result of a sub-expression, e.g. result generated from a non-deterministic udf/udaf, and use it in subsequent clauses. Thus to ensure non-deterministic expressions are evaluated the consistent number of times. It's already supported by some vendors [1]. But I would prefer the rewriting within scope of sql standard, a common scenario is we always want to convert expression back to Sql string and run in jdbc convention. A non-standard clause might bring obstacle to run the sql in other dialects. So I propose to don't do the rewrites when found non-deterministic. {quote}Related issues are re-ordering of the branches of AND and OR conditions, and behavior when an expression throws. {quote} Currently RexSimplify already takes determinism of expression into consideration (there might be space to improve). A missed part is to add an interface for udf/udaf to specify whether it's deterministic. [1]https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/let.html > Rewriting non-deterministic function can break query semantics > -- > > Key: CALCITE-3760 > URL: https://issues.apache.org/jira/browse/CALCITE-3760 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > > Calcite rewrite some *SqlFunctions* during validation. But whether the > function is deterministic is not considered. For a non-deterministic > operator, the rewriting can break semantics. Additionally there's no > interface for user to specify the determinism for a UDF/UDAF. > Say I have non-deterministic UDF & UDAF and run sql like below > {code:java} > select coalesce(udf(col0), 100) from foo; > select nullif(udaf(col0), 1024) from foo;{code} > They will be rewritten as > {code:java} > select case when udf(col0) is not null then udf(col0) else 100 end > from foo; > select case when udaf(col0)=1024 then null udaf(col0) > from foo{code} > As we can see that non-deterministic UDF & UDAF are called multiple times > after written. Thus the condition in WHEN clause might NOT be held all the > time. > We need to provide an interface for user to specify the determinism in > UDF/UDAF and consider whether a SqlNode is deterministic when rewriting. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3760) Rewriting non-deterministic function can break query semantics
[ https://issues.apache.org/jira/browse/CALCITE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027196#comment-17027196 ] Jin Xing commented on CALCITE-3760: --- Hi, [~julianhyde] [~amaliujia] Thanks a lot for feedback ~ Yes, a LET clause would be very helpful. It allows us to store the result of a sub-expression, e.g. result generated from a non-deterministic udf/udaf, and use it in subsequent clauses. Thus to ensure non-deterministic expressions are evaluated the consistent number of times. It's already supported by some vendors [1]. But I would prefer the rewriting within scope of sql standard, a common scenario is we always want to convert expression back to Sql string and run in jdbc convention. A non-standard clause might bring obstacle to run the sql in other dialects. So I propose to don't do the rewrites when found non-deterministic. {quote}Related issues are re-ordering of the branches of AND and OR conditions, and behavior when an expression throws. {quote} Currently RexSimplify already takes determinism of expression into consideration (there might be space to improve). A missed part is to add an interface for udf/udaf to specify whether it's deterministic. [1]https://docs.couchbase.com/server/current/n1ql/n1ql-language-reference/let.html > Rewriting non-deterministic function can break query semantics > -- > > Key: CALCITE-3760 > URL: https://issues.apache.org/jira/browse/CALCITE-3760 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > > Calcite rewrite some *SqlFunctions* during validation. But whether the > function is deterministic is not considered. For a non-deterministic > operator, the rewriting can break semantics. Additionally there's no > interface for user to specify the determinism for a UDF/UDAF. > Say I have non-deterministic UDF & UDAF and run sql like below > {code:java} > select coalesce(udf(col0), 100) from foo; > select nullif(udaf(col0), 1024) from foo;{code} > They will be rewritten as > {code:java} > select case when udf(col0) is not null then udf(col0) else 100 end > from foo; > select case when udaf(col0)=1024 then null udaf(col0) > from foo{code} > As we can see that non-deterministic UDF & UDAF are called multiple times > after written. Thus the condition in WHEN clause might NOT be held all the > time. > We need to provide an interface for user to specify the determinism in > UDF/UDAF and consider whether a SqlNode is deterministic when rewriting. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3760) Rewriting non-deterministic function can break query semantics
[ https://issues.apache.org/jira/browse/CALCITE-3760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3760: -- Summary: Rewriting non-deterministic function can break query semantics (was: Rewriting function without considering determinism can break query semantics) > Rewriting non-deterministic function can break query semantics > -- > > Key: CALCITE-3760 > URL: https://issues.apache.org/jira/browse/CALCITE-3760 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > > Calcite rewrite some *SqlFunctions* during validation. But whether the > function is deterministic is not considered. For a non-deterministic > operator, the rewriting can break semantics. Additionally there's no > interface for user to specify the determinism for a UDF/UDAF. > Say I have non-deterministic UDF & UDAF and run sql like below > {code:java} > select coalesce(udf(col0), 100) from foo; > select nullif(udaf(col0), 1024) from foo;{code} > They will be rewritten as > {code:java} > select case when udf(col0) is not null then udf(col0) else 100 end > from foo; > select case when udaf(col0)=1024 then null udaf(col0) > from foo{code} > As we can see that non-deterministic UDF & UDAF are called multiple times > after written. Thus the condition in WHEN clause might NOT be held all the > time. > We need to provide an interface for user to specify the determinism in > UDF/UDAF and consider whether a SqlNode is deterministic when rewriting. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3760) Rewriting function without considering determinism can break query semantics
Jin Xing created CALCITE-3760: - Summary: Rewriting function without considering determinism can break query semantics Key: CALCITE-3760 URL: https://issues.apache.org/jira/browse/CALCITE-3760 Project: Calcite Issue Type: Bug Components: core Reporter: Jin Xing Assignee: Jin Xing Calcite rewrite some *SqlFunctions* during validation. But whether the function is deterministic is not considered. For a non-deterministic operator, the rewriting can break semantics. Additionally there's no interface for user to specify the determinism for a UDF/UDAF. Say I have non-deterministic UDF & UDAF and run sql like below {code:java} select coalesce(udf(col0), 100) from foo; select nullif(udaf(col0), 1024) from foo;{code} They will be rewritten as {code:java} select case when udf(col0) is not null then udf(col0) else 100 end from foo; select case when udaf(col0)=1024 then null udaf(col0) from foo{code} As we can see that non-deterministic UDF & UDAF are called multiple times after written. Thus the condition in WHEN clause might NOT be held all the time. We need to provide an interface for user to specify the determinism in UDF/UDAF and consider whether a SqlNode is deterministic when rewriting. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025103#comment-17025103 ] Jin Xing commented on CALCITE-3716: --- Hi, Julian ~ I made a PR and try to fix this issue. Please take a look when you have time ~ > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3758) FilterTableScanRule generate wrong mapping for filter condition when underlying is BindableTableScan
Jin Xing created CALCITE-3758: - Summary: FilterTableScanRule generate wrong mapping for filter condition when underlying is BindableTableScan Key: CALCITE-3758 URL: https://issues.apache.org/jira/browse/CALCITE-3758 Project: Calcite Issue Type: Improvement Components: core Reporter: Jin Xing Assignee: Jin Xing When FilterTableScanRule push filter condition into ProjectableFilterableTable, the filter condition should be adjusted to be based on the internal table of BindableTableScan. This line[1] is rectified as *Mappings.source* [1][https://github.com/apache/calcite/blob/571731b80a58eb095ebac7123285c375e7afff90/core/src/main/java/org/apache/calcite/rel/rules/FilterTableScanRule.java#L124] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-2010) Cannot plan query that is UNION ALL applied to VALUES
[ https://issues.apache.org/jira/browse/CALCITE-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-2010: -- Attachment: graphviz (1).svg > Cannot plan query that is UNION ALL applied to VALUES > - > > Key: CALCITE-2010 > URL: https://issues.apache.org/jira/browse/CALCITE-2010 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Priority: Major > Attachments: graphviz (1).svg > > > Cannot plan query that is UNION ALL applied to VALUES. For example, in > JdbcTest: > {code} > @Test public void testUnionAllValues() { > CalciteAssert.hr() > .query("select x, y from (values (1, 2)) as t(x, y)\n" > + "union all\n" > + "select a + b, a - b from (values (3, 4), (5, 6)) as u(a, b)") > .returnsUnordered(); > } > {code} > gives > {noformat} > java.sql.SQLException: Error while executing SQL "select x, y from (values > (1, 2)) as t(x, y) > union all > select a + b, a - b from (values (3, 4), (5, 6)) as u(a, b)": Node > [rel#26:Subset#4.ENUMERABLE.[]] could not be implemented; planner state: > Root: rel#26:Subset#4.ENUMERABLE.[] > Original rel: > LogicalUnion(subset=[rel#26:Subset#4.ENUMERABLE.[]], all=[true]): rowcount = > 3.0, cumulative cost = {3.0 rows, 3.0 cpu, 0.0 io}, id = 21 > LogicalProject(subset=[rel#17:Subset#1.NONE.[]], X=[$0], Y=[$1]): rowcount > = 1.0, cumulative cost = {1.0 rows, 2.0 cpu, 0.0 io}, id = 16 > LogicalValues(subset=[rel#15:Subset#0.NONE.[]], tuples=[[{ 1, 2 }]]): > rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io}, id = 1 > LogicalProject(subset=[rel#20:Subset#3.NONE.[0]], EXPR$0=[+($0, $1)], > EXPR$1=[-($0, $1)]): rowcount = 2.0, cumulative cost = {2.0 rows, 4.0 cpu, > 0.0 io}, id = 19 > LogicalValues(subset=[rel#18:Subset#2.NONE.[]], tuples=[[{ 3, 4 }, { 5, 6 > }]]): rowcount = 2.0, cumulative cost = {2.0 rows, 1.0 cpu, 0.0 io}, id = 4 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-2885) SqlValidatorImpl fails when processing an InferTypes.FIRST_KNOWN function containing a function with a dynamic parameter as first operand
[ https://issues.apache.org/jira/browse/CALCITE-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023793#comment-17023793 ] Jin Xing edited comment on CALCITE-2885 at 1/26/20 12:58 PM: - Hi, Ruben ~ I also came across this issue. *FIRST_KNOWN* infers operand types by calling *deriveType* on operands. In your case (+_select 2 * ? = 4_+), when deriving operand types for *EQUALS* function, type of the dynamic param in *MULTIPLY* is not inferred yet, thus the deriving failed for validation[1][2]. To my best knowledge, type validation might be not needed when infer operand types – – we will do operand type validation anyway after operand type inference. Currently *SqlOperator#checkOperandTypes* already has a param[3] to control whether to swallow the exception if type checking failed. I'm thinking that shall we have a switch to control whether to bypass type validation in *SqlValidatorImpl#deriveType* ? We can disable the type validation during operand type inference and enable afterwards. But note that this change is not trivial – – public methods will be changed to add param for switching. [1][https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L534] [2][https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L448] [3][https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L450] was (Author: jinxing6...@126.com): Hi, Ruben ~ I also came across this issue. *FIRST_KNOWN* infers operand types by calling *deriveType* on operands. In your case (+_select 2 * ? = 4_+), when deriving operand types for *EQUALS* function, type of the dynamic param in *MULTIPLY* is not inferred yet, thus the deriving failed for validation[1][2]. To my best knowledge, type validation might be not needed when infer operand types – – we will do operand type validation anyway after operand type inference. I'm thinking that shall we have a switch to control whether to bypass type validation in *SqlValidatorImpl#deriveType* ? We can disable the type validation during operand type inference and enable afterwards. But note that this change is not trivial – – public methods will be changed to add param for switching. [1][https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L534] [2][https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L448] > SqlValidatorImpl fails when processing an InferTypes.FIRST_KNOWN function > containing a function with a dynamic parameter as first operand > - > > Key: CALCITE-2885 > URL: https://issues.apache.org/jira/browse/CALCITE-2885 > Project: Calcite > Issue Type: Bug >Affects Versions: 1.18.0 >Reporter: Ruben Q L >Priority: Major > > Problem can be reproduced by adding following tests (e.g. to > SqlValidatorDynamicTest.java): > {code:java} > @Test public void testDynamicParameter1() throws Exception { > final String sql = "select 4 = 2*?"; > sql(sql).ok(); > } > @Test public void testDynamicParameter2() throws Exception { > final String sql = "select 2*? = 4"; > sql(sql).ok(); > } > {code} > The first test will run successfully, but the second one (which is the same > query reversing the equality operands) will fail with the exception: > {code} > org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply '*' to > arguments of type ' * '. Supported form(s): ' * > ' ' * ' ' * > ' > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-2885) SqlValidatorImpl fails when processing an InferTypes.FIRST_KNOWN function containing a function with a dynamic parameter as first operand
[ https://issues.apache.org/jira/browse/CALCITE-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023793#comment-17023793 ] Jin Xing commented on CALCITE-2885: --- Hi, Ruben ~ I also came across this issue. *FIRST_KNOWN* infers operand types by calling *deriveType* on operands. In your case (+_select 2 * ? = 4_+), when deriving operand types for *EQUALS* function, type of the dynamic param in *MULTIPLY* is not inferred yet, thus the deriving failed for validation[1][2]. To my best knowledge, type validation might be not needed when infer operand types – – we will do operand type validation anyway after operand type inference. I'm thinking that shall we have a switch to control whether to bypass type validation in *SqlValidatorImpl#deriveType* ? We can disable the type validation during operand type inference and enable afterwards. But note that this change is not trivial – – public methods will be changed to add param for switching. [1][https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L534] [2][https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/SqlOperator.java#L448] > SqlValidatorImpl fails when processing an InferTypes.FIRST_KNOWN function > containing a function with a dynamic parameter as first operand > - > > Key: CALCITE-2885 > URL: https://issues.apache.org/jira/browse/CALCITE-2885 > Project: Calcite > Issue Type: Bug >Affects Versions: 1.18.0 >Reporter: Ruben Q L >Priority: Major > > Problem can be reproduced by adding following tests (e.g. to > SqlValidatorDynamicTest.java): > {code:java} > @Test public void testDynamicParameter1() throws Exception { > final String sql = "select 4 = 2*?"; > sql(sql).ok(); > } > @Test public void testDynamicParameter2() throws Exception { > final String sql = "select 2*? = 4"; > sql(sql).ok(); > } > {code} > The first test will run successfully, but the second one (which is the same > query reversing the equality operands) will fail with the exception: > {code} > org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply '*' to > arguments of type ' * '. Supported form(s): ' * > ' ' * ' ' * > ' > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3731) Add IF function for BigQuery, Hive and Spark Dialects
[ https://issues.apache.org/jira/browse/CALCITE-3731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016823#comment-17016823 ] Jin Xing commented on CALCITE-3731: --- [~vaibhavjain-dm] You can refer to org.apache.calcite.sql.fun.SqlNullifFunction > Add IF function for BigQuery, Hive and Spark Dialects > - > > Key: CALCITE-3731 > URL: https://issues.apache.org/jira/browse/CALCITE-3731 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: vaibhav jain >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > Adding support for *IF* function for BigQuery, Hive and Spark Dialects. > *Syntax*- IF(condition, true_result, false_result) > If *condition* is true, returns *true_result*, else returns *false_result*. > Document links: > [BigQuery|https://cloud.google.com/bigquery/docs/reference/standard-sql/conditional_expressions] > [Hive > |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-ConditionalFunctions] > [Spark Sql|https://spark.apache.org/docs/2.3.1/api/sql/index.html#if] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3593) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014815#comment-17014815 ] Jin Xing commented on CALCITE-3593: --- Thanks a lot, [~julianhyde] ~ > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3593 > URL: https://issues.apache.org/jira/browse/CALCITE-3593 > Project: Calcite > Issue Type: Bug > Components: jdbc-adapter >Reporter: Steven Talbot >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Fix For: 1.22.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Best shown with the shell of a test in RelToSqlConverter test. > The following SQL on BigQuery > {code:java} > select product_id - 1000 as product_id > from ( > select product_id, avg(gross_weight) as agw > from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as > product > where net_weight < 100 > group by product_id having product_id > 0){code} > produces one result, because the having filter applies to the product id > before subtraction, of course. > Running it through the machinery in that test > (`sql(query).withBigQuery().ok(expected)`) translates it to: > {noformat} > SELECT product_id - 1000 AS product_id > from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as > product > WHERE net_weight < 100 > GROUP BY product_id > HAVING product_id > 0{noformat} > This changes the meaning of the query: now the HAVING is on the > after-subtraction product_id and you get no results, rather than the one > result. > Note that this is _not_ true in HyperSQL, as it has different semantics > around the HAVING namespace. > {noformat} > select "product_id" - 1000 as "product_id" > from ( > select "product_id", avg("gross_weight") as agw > from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight" > FROM (VALUES(0))) as product > where "net_weight" < 100 > group by "product_id" having "product_id" > 0){noformat} > becomes > {noformat} > SELECT "product_id" - 1000 AS "product_id" > from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight" > FROM (VALUES(0))) as product > WHERE "net_weight" < 100 > GROUP BY "product_id" > HAVING "product_id" > 0{noformat} > But the meaning is preserved: both return a row. > I'm not enough of a SQL standards expert to know which one is being more > compliant, but presumably both would have to be supported via some sort of > flag? > I think the fix would be to force the subselect on dialects such as BigQuery > that have this behavior. Probably something that looks a lot like > [https://github.com/apache/calcite/blob/3530daaa8cad43aad6845b6c79e4bc1ca0e72f5f/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1043-L1047] > The test, of course, looks like pretty silly SQL no one would ever write, but > the point is this is what's generated when you have > {noformat} > Project f(x) as x > Filter g(x) > Aggregate {}, ...{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (CALCITE-3716) ResultSetMetaData.getTableName should return empty string, not null, when column does not map to a table
[ https://issues.apache.org/jira/browse/CALCITE-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing reassigned CALCITE-3716: - Assignee: Jin Xing > ResultSetMetaData.getTableName should return empty string, not null, when > column does not map to a table > > > Key: CALCITE-3716 > URL: https://issues.apache.org/jira/browse/CALCITE-3716 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Jin Xing >Priority: Major > > Per the [JDBC > spec|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html#getTableName-int-], > {{ResultSetMetaData.getTableName}} should return empty string, not null, > when column does not map to a table. Similarly getCatalogName, getSchemaName, > getColumnName. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3729) Filters failed to be pushed down when it's identical to join condition.
Jin Xing created CALCITE-3729: - Summary: Filters failed to be pushed down when it's identical to join condition. Key: CALCITE-3729 URL: https://issues.apache.org/jira/browse/CALCITE-3729 Project: Calcite Issue Type: Improvement Reporter: Jin Xing Assignee: Jin Xing FilterJoinRule optimize below sql as {code:java} select * from sales.dept d inner join sales.emp e on d.deptno = e.deptno and d.deptno > e.mgr where d.deptno > e.mgr LogicalProject(DEPTNO=[$0], NAME=[$1], EMPNO=[$2], ENAME=[$3], JOB=[$4], MGR=[$5], HIREDATE=[$6], SAL=[$7], COMM=[$8], DEPTNO0=[$9], SLACKER=[$10]) LogicalFilter(condition=[>($0, $5)]) LogicalJoin(condition=[AND(=($0, $9), >($0, $5))], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} The outer Filter failed to be pushed down into Join and removed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3725) RelMetadataTest fails with NPE due to unsafe RelMetadataQuery.instance call
[ https://issues.apache.org/jira/browse/CALCITE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014034#comment-17014034 ] Jin Xing commented on CALCITE-3725: --- Hi, [~vladimirsitnikov] I proposed a fix for this issue, please take a look when you have time ~ > RelMetadataTest fails with NPE due to unsafe RelMetadataQuery.instance call > --- > > Key: CALCITE-3725 > URL: https://issues.apache.org/jira/browse/CALCITE-3725 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.21.0 >Reporter: Vladimir Sitnikov >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > {noformat} > org.apache.calcite.test.RelMetadataTest > > testColumnUniquenessForMinusWithConstantColumns() FAILED > java.lang.NullPointerException > at java.util.Objects.requireNonNull(Objects.java:203) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.(RelMetadataQuery.java:141) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.(RelMetadataQuery.java:106) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.instance(RelMetadataQuery.java:174) > at > org.apache.calcite.test.RelMetadataTest.testColumnUniquenessForMinusWithConstantColumns(RelMetadataTest.java:1057) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3505) Infinite matching of FilterProjectTransposeRule causes stackoverflow
[ https://issues.apache.org/jira/browse/CALCITE-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979910#comment-16979910 ] Jin Xing edited comment on CALCITE-3505 at 1/13/20 5:08 AM: [~hyuan] I commented in CALCITE-3124 and gave my understanding about the root cause. The root cause of this issue is different from CALCITE-3124 but similar with CALCITE-2223. Yes, [~zabetak] . It's caused by cycle between RelSubsets. The infinite loop happens by below steps. Step-1: The Sql and original plan is as below {code:java} select * from b1, b2 where b1.i = 10 and b2.i = 10 and b1.i = b2.i LogicalProject(i=[$0], j=[$1], k=[$2], i0=[$3], j0=[$4], k0=[$5]) LogicalFilter(condition=[AND(=($0, 10), =($3, 10), =($0, $3))]) LogicalJoin(condition=[true], joinType=[inner]) LogicalTableScan(table=[[s, b1]]) LogicalTableScan(table=[[s, b2]]) {code} Step-2: Thanks to FilterJoinRule, ProjectRemoveRule & JoinCommuteRule, the original plan can be transformed to be equivalents and collect as RelSubset-M: {code:java} RelSubset-M: LogicalProject(i=[$0], j=[$1], k=[$2], i0=[$3], j0=[$4], k0=[$5]) --> OriginalPlan LogicalFilter(condition=[AND(=($0, 10), =($3, 10), =($0, $3))]) LogicalJoin(condition=[true], joinType=[inner]) LogicalTableScan(table=[[s, b1]]) LogicalTableScan(table=[[s, b2]]) LogicalFilter(condition=[AND(=($0,10),=($3, 10),=($0,$3))]) --> OriginalPlan-equiv0 LogicalJoin(condition=[true], joinType=[inner]) LogicalTableScan(table=[[s, b1]]) LogicalTableScan(table=[[s, b2]]) LogicalFilter(condition=[=($0, $3))]) --> OriginalPlan-equiv1 LogicalJoin(condition=[true], joinType=[inner]) LogicalFilter(condition=[=($0, 10)]) LogicalTableScan(table=[[s, b1]]) LogicalFilter(condition=[=($0, 10)]) LogicalTableScan(table=[[s, b2]]) LogicalProject(i=[$3],j=[$4],k=[$5],i0=[$0],j0=[$1],k0=[$2]) --> OriginalPlan-equiv2 RelSubset-N {code} Step-3: During the transformation in Step-2, we will get a RelSubset-N as below: {code:java} RelSubset-N: LogicalFilter(condition=[=($3, $0))]) --> Plan0 LogicalJoin(condition=[true], joinType=[inner]) LogicalFilter(condition=[=($0, 10)]) LogicalTableScan(table=[[s, b2]]) LogicalFilter(condition=[=($0, 10)]) LogicalTableScan(table=[[s, b1]]) LogicalProject(i=[$3], j=[$4], k=[$5], i0=[$0], j0=[$1], k0=[$2]) --> Plan1 RelSubset-M {code} >From here, we will notice that RelSubset-M and RelSubset-N references each >other and form a cycle. # LogicalProject in OriginalPlan-equiv2 and LogicalFilter(condition=[=($3, $0)] in Plan0 is transformed by ProjectFilterTransposeRule and generate a new Filter in RelSubset-M # The LogicalProject in Plan1 and the new generated Filter in RelSubset-M will be matched by ProjectFilterTransposeRule again and generate another new Filter. # Above two steps will alternate infinitely was (Author: jinxing6...@126.com): [~hyuan] I commented in CALCITE-3124 and gave my understanding about the root cause. The root cause of this issue is different from CALCITE-3124 but similar with CALCITE-2223. Yes, [~zabetak] . It's caused by cycle between RelSubsets. The infinite loop happens by below steps. Step-1: The Sql and original plan is as below {code:java} select * from b1, b2 where b1.i = 10 and b2.i = 10 and b1.i = b2.i LogicalProject(i=[$0], j=[$1], k=[$2], i0=[$3], j0=[$4], k0=[$5]) LogicalFilter(condition=[AND(=($0, 10), =($3, 10), =($0, $3))]) LogicalJoin(condition=[true], joinType=[inner]) LogicalTableScan(table=[[s, b1]]) LogicalTableScan(table=[[s, b2]]) {code} Step-2: Thanks to FilterJoinRule, ProjectRemoveRule & JoinCommuteRule, the original plan can be transformed to be equivalents and collect as RelSubset-M: {code:java} RelSubset-M: LogicalProject(i=[$0], j=[$1], k=[$2], i0=[$3], j0=[$4], k0=[$5]) --> OriginalPlan LogicalFilter(condition=[AND(=($0, 10), =($3, 10), =($0, $3))]) LogicalJoin(condition=[true], joinType=[inner]) LogicalTableScan(table=[[s, b1]]) LogicalTableScan(table=[[s, b2]]) LogicalFilter(condition=[AND(=($0, 10), =($3, 10), =($0, $3))]) --> OriginalPlan-equiv0 LogicalJoin(condition=[true], joinType=[inner]) LogicalTableScan(table=[[s, b1]]) LogicalTableScan(table=[[s, b2]]) LogicalFilter(condition=[=($0, $3))]) --> OriginalPlan-equiv1 LogicalJoin(condition=[true], joinType=[inner]) LogicalFilter(condition=[=($0, 10)]) LogicalTableScan(table=[[s, b1]]) LogicalFilter(condition=[=($0, 10)]) LogicalTableScan(table=[[s, b2]]) LogicalProject(i=[$3], j=[$4], k=[$5], i0=[$0], j0=[$1], k0=[$2]) --> OriginalPlan-equiv2 RelSubset-N {code} Step-3: During the transformation
[jira] [Created] (CALCITE-3727) Deduce nullability of RelDataTypeField for Join
Jin Xing created CALCITE-3727: - Summary: Deduce nullability of RelDataTypeField for Join Key: CALCITE-3727 URL: https://issues.apache.org/jira/browse/CALCITE-3727 Project: Calcite Issue Type: Improvement Reporter: Jin Xing If I have two tables and run a sql like below {code:java} table: emps empno int deptno int table:depts deptno int select emps.deptno deptno, ... from emps inner join depts on emps.deptno = depts.deptno{code} Its' a very common case that we build star model like above. Currently row type of the RelNode plan have "isNullable = true" of "deptno" column, i.e. Calcite doesn't look into join condition "emps.deptno = depts.deptno" and deduce that the result deptno should have "isNullable=false" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3667) EnumerableMergeJoin should not use take-while enumerator
[ https://issues.apache.org/jira/browse/CALCITE-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010288#comment-17010288 ] Jin Xing commented on CALCITE-3667: --- Hi, Julian ~ I think I didn't make myself clear ~ What I want to do in PR-1717 is actually Enumerables.where (I missed this method and brought in ConditionalEnumerator). We generate a cartesian enumerator for each single matched key pair from LHS and RHS in MergeJoinEnumerator#advance, and filter the enumerator by predicate. That's why I said "the complexity is decided by the duplication of the keys": 1. If lots of duplicated keys exist in LHS or RHS, a single cartesian enumerator can contain large number of rows 2. If most of the keys from LHS and RHS are distinct, cartesian enumerator contains smaller number of rows. We cannot use TakeWhileEnumerator and shouldn't stop when the predicate evaluates to be false. Because the cartesian enumerator cannot guarantee that rows which satisfy predicate are consecutive and start from the beginning. PR-1702 fixes this Jira by *mergeJoin(...).where(...)*. It's good for me. So I closed PR-1717. Thanks a lot for your help, Julian :):) Best, Jin > EnumerableMergeJoin should not use take-while enumerator > - > > Key: CALCITE-3667 > URL: https://issues.apache.org/jira/browse/CALCITE-3667 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Currently EnumerableMergeJoin use take-while enumerator [1] to emit values > satisfy predicate. However take-while enumerator stops the enumeration at > once when predicate is failed, which is not correct and we should finish the > enumeration and extract all the qualified values. > [1] > https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3896 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3667) EnumerableMergeJoin should not use take-while enumerator
[ https://issues.apache.org/jira/browse/CALCITE-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009310#comment-17009310 ] Jin Xing commented on CALCITE-3667: --- Hi, [~julianhyde] ~ Thanks again looking into this, and I really appreciate your help and shepherd ~ The original implementation of MergeJoinEnumerator#advance(before CALCITE-3258) is as below {code:java} advance() { ... ... cartesians = Linq4j.product( ImmutableList.of(Linq4j.enumerator(lefts), Linq4j.enumerator(rights))); return true; } {code} In which, matched keys from LHS and RHS are collected into *lefts* and *rights*, i.e. keys in *lefts* and *rights* are all the same. And then we generate cartesians from the rows of LHS and RHS with the same key. After PR-1717, the change is as below: {code:java} advance() { ... ... cartesians = new ConditionalEnumerator<>( Linq4j.product( ImmutableList.of(Linq4j.enumerator(lefts), Linq4j.enumerator(rights))), v -> predicate.apply((TSource) v.get(0), (TInner) v.get(1))); return true; } {code} In which, I wrapped a *ConditionalEnumerator* and filtering rows with *predicate*. Since I'm filtering based on rows with the same key each time, the complexity is decided by the duplication of the keys. 1. If all rows shares the single same key, yes, the complexity is O(m * n); 2. If all rows have distinct keys, the complexity is O(m + n) > EnumerableMergeJoin should not use take-while enumerator > - > > Key: CALCITE-3667 > URL: https://issues.apache.org/jira/browse/CALCITE-3667 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Currently EnumerableMergeJoin use take-while enumerator [1] to emit values > satisfy predicate. However take-while enumerator stops the enumeration at > once when predicate is failed, which is not correct and we should finish the > enumeration and extract all the qualified values. > [1] > https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3896 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3155) AssertionError in RelToSqlConverter when visiting empty Values node
[ https://issues.apache.org/jira/browse/CALCITE-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008857#comment-17008857 ] Jin Xing commented on CALCITE-3155: --- Hi [~danny0405] and [~MusbahELFIL] PR-1286 add a test cases like below {code:java} @Test public void testUnionAllWithNoOperands() { String query = "select A.\"department_id\" " + "from \"foodmart\".\"employee\" A " + " where A.\"department_id\" = ( select min( A.\"department_id\") from \"foodmart\".\"department\" B where 1=2 )"; final String expected = "SELECT \"employee\".\"department_id\"\n" + "FROM \"foodmart\".\"employee\"\n" + "INNER JOIN (SELECT \"t1\".\"department_id\" AS \"department_id0\"," + " MIN(\"t1\".\"department_id\")\n" + "FROM (SELECT *\nFROM (VALUES (NULL, NULL))" + " AS \"t\" (\"department_id\", \"department_description\")" + "\nWHERE 1 = 0) AS \"t\"," + "\n(SELECT \"department_id\"\nFROM \"foodmart\".\"employee\"" + "\nGROUP BY \"department_id\") AS \"t1\"" + "\nGROUP BY \"t1\".\"department_id\") AS \"t3\" " + "ON \"employee\".\"department_id\" = \"t3\".\"department_id0\"" + " AND \"employee\".\"department_id\" = MIN(\"t1\".\"department_id\")"; sql(query).ok(expected); } {code} Seems the *expected* sql is not executable with join condition *employee.department_id= MIN(t1department_id)* If I run below sql: {code:java} select * from test t1 join test t2 on t1.b=min(t2.b); {code} Postgre will complain {code:java} ERROR: aggregate functions are not allowed in JOIN conditions {code} Mysql will complain {code:java} Invalid use of group function {code} > AssertionError in RelToSqlConverter when visiting empty Values node > --- > > Key: CALCITE-3155 > URL: https://issues.apache.org/jira/browse/CALCITE-3155 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Musbah EL FIL >Assignee: Danny Chen >Priority: Minor > Labels: pull-request-available > Fix For: 1.21.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Given the query: > *select A."department_id" from "foodmart"."employee" A where > A."department_id" = ( select min( A."department_id") from > "foodmart"."department" B where 1=2 )* > The conversion from SqlNode to RelNode is correct, however when going from > RelNode to SqlNode, an exception was being thrown. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3710) MaterializationService#defineMaterialization should inherit connection properties
[ https://issues.apache.org/jira/browse/CALCITE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008600#comment-17008600 ] Jin Xing commented on CALCITE-3710: --- Hi [~vladimirsitnikov] I think this duplicates to CALCITE-3549. > MaterializationService#defineMaterialization should inherit connection > properties > - > > Key: CALCITE-3710 > URL: https://issues.apache.org/jira/browse/CALCITE-3710 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.21.0 >Reporter: Vladimir Sitnikov >Priority: Major > > Currently, defineMaterialization uses > {code:java} > final CalciteConnection connection = > CalciteMetaImpl.connect(schema.root(), null); > {code} > so it does not inherit the connection properties (quoting, caseSensitive, > etc). > Steps to reproduce: add {{.with(Lex.JAVA)}} to MaterializationTest (e.g. > testPrePopulated) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3667) EnumerableMergeJoin should not use take-while enumerator
[ https://issues.apache.org/jira/browse/CALCITE-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007901#comment-17007901 ] Jin Xing edited comment on CALCITE-3667 at 1/4/20 3:37 AM: --- Hi Julian ~ Thanks for look into this ~ I think PR-1717 doesn't change the complexity. 1. MergeJoinEnumerator#advance collect values with left keys and right keys matched; 2. Based on the result from step-1, ConditionalEnumerator filter the values with *nonEquiPredicate* The logic is similar to *mergeJoin(...).where(...)*, which is proposed by [1] . I'll close PR-1717, and go by fix from PR-1702 [1] https://github.com/apache/calcite/pull/1702/commits/686d985d4a45b1306d4d397ad51e6cc03fe3ba4d was (Author: jinxing6...@126.com): Hi Julian ~ Thanks for look into this ~ I think PR-1717 doesn't change the complexity. 1. MergeJoinEnumerator#advance collect values with left keys and right keys matched; 2. Based on the result from step-1, ConditionalEnumerator filter the values with *nonEquiPredicate* The logic is similar to *mergeJoin(...).where(...)*, which is proposed by [1] . I'll close this JIRA, and go by fix from PR-1702 [1] https://github.com/apache/calcite/pull/1702/commits/686d985d4a45b1306d4d397ad51e6cc03fe3ba4d > EnumerableMergeJoin should not use take-while enumerator > - > > Key: CALCITE-3667 > URL: https://issues.apache.org/jira/browse/CALCITE-3667 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Currently EnumerableMergeJoin use take-while enumerator [1] to emit values > satisfy predicate. However take-while enumerator stops the enumeration at > once when predicate is failed, which is not correct and we should finish the > enumeration and extract all the qualified values. > [1] > https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3896 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3667) EnumerableMergeJoin should not use take-while enumerator
[ https://issues.apache.org/jira/browse/CALCITE-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007901#comment-17007901 ] Jin Xing commented on CALCITE-3667: --- Hi Julian ~ Thanks for look into this ~ I think PR-1717 doesn't change the complexity. 1. MergeJoinEnumerator#advance collect values with left keys and right keys matched; 2. Based on the result from step-1, ConditionalEnumerator filter the values with *nonEquiPredicate* The logic is similar to *mergeJoin(...).where(...)*, which is proposed by [1] . I'll close this JIRA, and go by fix from PR-1702 [1] https://github.com/apache/calcite/pull/1702/commits/686d985d4a45b1306d4d397ad51e6cc03fe3ba4d > EnumerableMergeJoin should not use take-while enumerator > - > > Key: CALCITE-3667 > URL: https://issues.apache.org/jira/browse/CALCITE-3667 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Currently EnumerableMergeJoin use take-while enumerator [1] to emit values > satisfy predicate. However take-while enumerator stops the enumeration at > once when predicate is failed, which is not correct and we should finish the > enumeration and extract all the qualified values. > [1] > https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3896 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3285) EnumerableMergeJoin should support non-equi join conditions
[ https://issues.apache.org/jira/browse/CALCITE-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007899#comment-17007899 ] Jin Xing commented on CALCITE-3285: --- Hi, [~vladimirsitnikov] ~ Thanks a lot for the fix PR-1702 ~ Now I understand that the result type of *mergeJoin* is as below {code:java} ImmutableList.of(leftResult.physType, rightResult.physType) {code} And we can filter the values using predicate. Fixing by *mergeJoin(...).where(...)* can keep the code backward compatibility and make the code clean. > EnumerableMergeJoin should support non-equi join conditions > --- > > Key: CALCITE-3285 > URL: https://issues.apache.org/jira/browse/CALCITE-3285 > Project: Calcite > Issue Type: Improvement >Reporter: Haisheng Yuan >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Fix For: 1.22.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Calcite should be able to generate EnumerableMergeJoin with non-equi join > conditions, as long as there are equi-join conditions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3285) EnumerableMergeJoin should support non-equi join conditions
[ https://issues.apache.org/jira/browse/CALCITE-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006842#comment-17006842 ] Jin Xing commented on CALCITE-3285: --- Hi [~vladimirsitnikov] Yes, you are right, it's a bug. I filed a JIRA -- CALCITE-3667 and proposed a fix [1]. It'd be great if you can take a look when you have time ~ Best, Jin https://github.com/apache/calcite/pull/1717 > EnumerableMergeJoin should support non-equi join conditions > --- > > Key: CALCITE-3285 > URL: https://issues.apache.org/jira/browse/CALCITE-3285 > Project: Calcite > Issue Type: Improvement >Reporter: Haisheng Yuan >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Fix For: 1.22.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Calcite should be able to generate EnumerableMergeJoin with non-equi join > conditions, as long as there are equi-join conditions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3667) EnumerableMergeJoin should not use take-while enumerator
[ https://issues.apache.org/jira/browse/CALCITE-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3667: -- Description: Currently EnumerableMergeJoin use take-while enumerator [1] to emit values satisfy predicate. However take-while enumerator stops the enumeration at once when predicate is failed, which is not correct and we should finish the enumeration and extract all the qualified values. [1] https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3896 > EnumerableMergeJoin should not use take-while enumerator > - > > Key: CALCITE-3667 > URL: https://issues.apache.org/jira/browse/CALCITE-3667 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Assignee: Jin Xing >Priority: Major > > Currently EnumerableMergeJoin use take-while enumerator [1] to emit values > satisfy predicate. However take-while enumerator stops the enumeration at > once when predicate is failed, which is not correct and we should finish the > enumeration and extract all the qualified values. > [1] > https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3896 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3667) EnumerableMergeJoin should not use take-while enumerator
Jin Xing created CALCITE-3667: - Summary: EnumerableMergeJoin should not use take-while enumerator Key: CALCITE-3667 URL: https://issues.apache.org/jira/browse/CALCITE-3667 Project: Calcite Issue Type: Improvement Reporter: Jin Xing Assignee: Jin Xing -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3285) EnumerableMergeJoin should support non-equi join conditions
[ https://issues.apache.org/jira/browse/CALCITE-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006820#comment-17006820 ] Jin Xing commented on CALCITE-3285: --- Thanks for ping me [~vladimirsitnikov] I will verify soon ~ > EnumerableMergeJoin should support non-equi join conditions > --- > > Key: CALCITE-3285 > URL: https://issues.apache.org/jira/browse/CALCITE-3285 > Project: Calcite > Issue Type: Improvement >Reporter: Haisheng Yuan >Assignee: Jin Xing >Priority: Major > Labels: pull-request-available > Fix For: 1.22.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Calcite should be able to generate EnumerableMergeJoin with non-equi join > conditions, as long as there are equi-join conditions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3664) Sort in subquery is lost when converting SqlNode to Relnode
[ https://issues.apache.org/jira/browse/CALCITE-3664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006813#comment-17006813 ] Jin Xing commented on CALCITE-3664: --- Is it because collation of emp already satisfied the *order by ename*, thus no need to create a Sort operator > Sort in subquery is lost when converting SqlNode to Relnode > --- > > Key: CALCITE-3664 > URL: https://issues.apache.org/jira/browse/CALCITE-3664 > Project: Calcite > Issue Type: Bug >Reporter: Wang Yanlin >Assignee: Wang Yanlin >Priority: Major > > Lost sort in subquery when converting a SqlNode to Relnode, making its > semantics changed. > The test case to reproduce > {code:java} > // SqlToRelConverterTest > @Test public void testSortInSubquery() { > final String sql = "select ename from (select ename from emp order by > ename) a"; > sql(sql).ok(); > } > {code} > This test case will success with this plan. > {code:java} > > > > > > > > > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (CALCITE-3593) RelToSqlConverter changes target of ambiguous HAVING clause with a Project on Filter on Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing reassigned CALCITE-3593: - Assignee: Jin Xing > RelToSqlConverter changes target of ambiguous HAVING clause with a Project on > Filter on Aggregate > - > > Key: CALCITE-3593 > URL: https://issues.apache.org/jira/browse/CALCITE-3593 > Project: Calcite > Issue Type: Bug >Reporter: Steven Talbot >Assignee: Jin Xing >Priority: Major > > Best shown with the shell of a test in RelToSqlConverter test. > The following SQL on BigQuery > {code:java} > select product_id - 1000 as product_id > from ( > select product_id, avg(gross_weight) as agw > from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as > product > where net_weight < 100 > group by product_id having product_id > 0){code} > produces one result, because the having filter applies to the product id > before subtraction, of course. > Running it through the machinery in that test > (`sql(query).withBigQuery().ok(expected)`) translates it to: > {noformat} > SELECT product_id - 1000 AS product_id > from (SELECT 1 as product_id, 70 as net_weight, 170 as gross_weight) as > product > WHERE net_weight < 100 > GROUP BY product_id > HAVING product_id > 0{noformat} > This changes the meaning of the query: now the HAVING is on the > after-subtraction product_id and you get no results, rather than the one > result. > Note that this is _not_ true in HyperSQL, as it has different semantics > around the HAVING namespace. > {noformat} > select "product_id" - 1000 as "product_id" > from ( > select "product_id", avg("gross_weight") as agw > from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight" > FROM (VALUES(0))) as product > where "net_weight" < 100 > group by "product_id" having "product_id" > 0){noformat} > becomes > {noformat} > SELECT "product_id" - 1000 AS "product_id" > from (SELECT 1 as "product_id", 70 as "net_weight", 170 as "gross_weight" > FROM (VALUES(0))) as product > WHERE "net_weight" < 100 > GROUP BY "product_id" > HAVING "product_id" > 0{noformat} > But the meaning is preserved: both return a row. > I'm not enough of a SQL standards expert to know which one is being more > compliant, but presumably both would have to be supported via some sort of > flag? > I think the fix would be to force the subselect on dialects such as BigQuery > that have this behavior. Probably something that looks a lot like > [https://github.com/apache/calcite/blob/3530daaa8cad43aad6845b6c79e4bc1ca0e72f5f/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1043-L1047] > The test, of course, looks like pretty silly SQL no one would ever write, but > the point is this is what's generated when you have > {noformat} > Project f(x) as x > Filter g(x) > Aggregate {}, ...{noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3658) TableModify of Update contains correlated variable by mistake.
[ https://issues.apache.org/jira/browse/CALCITE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3658: -- Description: UPDATE clause like below {code:java} update emp set empno = empno + 1 {code} will be converted to {code:java} LogicalTableModify(table=[[CATALOG, SALES, EMP]], operation=[UPDATE], updateColumnList=[[EMPNO]], sourceExpressionList=[[+($cor0.EMPNO, 1)]], flattened=[true]) LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EXPR$0=[+($0, 1)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} However the correlated variables $cor0.EMPNO should not exist in *sourceExpressionList*. It also brings trouble when convert TableModify back to Sql string. was: UPDATE clause like below {code:java} update emp set empno = empno + 1 {code} will be converted to {code:java} LogicalTableModify(table=[[CATALOG, SALES, EMP]], operation=[UPDATE], updateColumnList=[[EMPNO]], sourceExpressionList=[[+($cor0.EMPNO, 1)]], flattened=[true]) LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EXPR$0=[+($0, 1)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} However the correlated variables should not exist in *sourceExpressionList*. It also brings trouble when convert TableModify back to Sql string. > TableModify of Update contains correlated variable by mistake. > -- > > Key: CALCITE-3658 > URL: https://issues.apache.org/jira/browse/CALCITE-3658 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Priority: Major > > UPDATE clause like below > {code:java} > update emp set empno = empno + 1 > {code} > will be converted to > {code:java} > LogicalTableModify(table=[[CATALOG, SALES, EMP]], operation=[UPDATE], > updateColumnList=[[EMPNO]], sourceExpressionList=[[+($cor0.EMPNO, 1)]], > flattened=[true]) > LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], > SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EXPR$0=[+($0, 1)]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > However the correlated variables $cor0.EMPNO should not exist in > *sourceExpressionList*. > It also brings trouble when convert TableModify back to Sql string. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CALCITE-3658) TableModify of Update contains correlated variable by mistake.
[ https://issues.apache.org/jira/browse/CALCITE-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jin Xing updated CALCITE-3658: -- Description: UPDATE clause like below {code:java} update emp set empno = empno + 1 {code} will be converted to {code:java} LogicalTableModify(table=[[CATALOG, SALES, EMP]], operation=[UPDATE], updateColumnList=[[EMPNO]], sourceExpressionList=[[+($cor0.EMPNO, 1)]], flattened=[true]) LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EXPR$0=[+($0, 1)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} However the correlated variables should not exist in *sourceExpressionList*. It also brings trouble when convert TableModify back to Sql string. > TableModify of Update contains correlated variable by mistake. > -- > > Key: CALCITE-3658 > URL: https://issues.apache.org/jira/browse/CALCITE-3658 > Project: Calcite > Issue Type: Improvement >Reporter: Jin Xing >Priority: Major > > UPDATE clause like below > {code:java} > update emp set empno = empno + 1 > {code} > will be converted to > {code:java} > LogicalTableModify(table=[[CATALOG, SALES, EMP]], operation=[UPDATE], > updateColumnList=[[EMPNO]], sourceExpressionList=[[+($cor0.EMPNO, 1)]], > flattened=[true]) > LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], > SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], EXPR$0=[+($0, 1)]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > However the correlated variables should not exist in *sourceExpressionList*. > It also brings trouble when convert TableModify back to Sql string. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CALCITE-3658) TableModify of Update contains correlated variable by mistake.
Jin Xing created CALCITE-3658: - Summary: TableModify of Update contains correlated variable by mistake. Key: CALCITE-3658 URL: https://issues.apache.org/jira/browse/CALCITE-3658 Project: Calcite Issue Type: Improvement Reporter: Jin Xing -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3644) Calc on the Intersect in target is not being matched
[ https://issues.apache.org/jira/browse/CALCITE-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005949#comment-17005949 ] Jin Xing edited comment on CALCITE-3644 at 12/31/19 6:31 AM: - Could you please add the corresponding RelNode of query and mv in JIRA description and explain why the matching failed ? was (Author: jinxing6...@126.com): Could you please add the corresponding RelNode of query and mv and explain why the matching failed ? > Calc on the Intersect in target is not being matched > > > Key: CALCITE-3644 > URL: https://issues.apache.org/jira/browse/CALCITE-3644 > Project: Calcite > Issue Type: Wish >Reporter: xzh_dz >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > {code:java} > @Test public void testIntersectToCalcOnIntersect() { > final String mv = "" > + "select \"deptno\",\"name\" from \"emps\"\n" > + "intersect all\n" > + "select \"deptno\",\"name\" from \"depts\""; > String mv1 = "select \"name\", \"deptno\" from (" + mv + ")"; > final String query = "" > + "select \"name\",\"deptno\" from \"depts\"\n" > + "intersect all\n" > + "select \"name\",\"deptno\" from \"emps\""; > checkMaterialize(mv1, query, true); > } > {code} > error: > {code:java} > java.lang.AssertionError: > Expected: a string containing "EnumerableTableScan(table=[[hr, m0]])" > but: was "PLAN=EnumerableIntersect(all=[true])\n > EnumerableCalc(expr#0..3=[{inputs}], name=[$t1], deptno=[$t0])\n > EnumerableTableScan(table=[[hr, depts]])\n > EnumerableCalc(expr#0..4=[{inputs}], name=[$t2], deptno=[$t1])\n > EnumerableTableScan(table=[[hr, emps]])\n\n" > > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.calcite.test.CalciteAssert.lambda$checkResultContains$7(CalciteAssert.java:429) > at > org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:544) > at > org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1514) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3644) Calc on the Intersect in target is not being matched
[ https://issues.apache.org/jira/browse/CALCITE-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005949#comment-17005949 ] Jin Xing commented on CALCITE-3644: --- Could you please add the corresponding RelNode of query and mv and explain why the matching failed ? > Calc on the Intersect in target is not being matched > > > Key: CALCITE-3644 > URL: https://issues.apache.org/jira/browse/CALCITE-3644 > Project: Calcite > Issue Type: Wish >Reporter: xzh_dz >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > {code:java} > @Test public void testIntersectToCalcOnIntersect() { > final String mv = "" > + "select \"deptno\",\"name\" from \"emps\"\n" > + "intersect all\n" > + "select \"deptno\",\"name\" from \"depts\""; > String mv1 = "select \"name\", \"deptno\" from (" + mv + ")"; > final String query = "" > + "select \"name\",\"deptno\" from \"depts\"\n" > + "intersect all\n" > + "select \"name\",\"deptno\" from \"emps\""; > checkMaterialize(mv1, query, true); > } > {code} > error: > {code:java} > java.lang.AssertionError: > Expected: a string containing "EnumerableTableScan(table=[[hr, m0]])" > but: was "PLAN=EnumerableIntersect(all=[true])\n > EnumerableCalc(expr#0..3=[{inputs}], name=[$t1], deptno=[$t0])\n > EnumerableTableScan(table=[[hr, depts]])\n > EnumerableCalc(expr#0..4=[{inputs}], name=[$t2], deptno=[$t1])\n > EnumerableTableScan(table=[[hr, emps]])\n\n" > > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.calcite.test.CalciteAssert.lambda$checkResultContains$7(CalciteAssert.java:429) > at > org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:544) > at > org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1514) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3606) batch insert failed
[ https://issues.apache.org/jira/browse/CALCITE-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17003595#comment-17003595 ] Jin Xing commented on CALCITE-3606: --- Hi Ran ~ I proposed a PR to fix the case you provided ~ Please check ~ Jin > batch insert failed > --- > > Key: CALCITE-3606 > URL: https://issues.apache.org/jira/browse/CALCITE-3606 > Project: Calcite > Issue Type: Wish > Components: core >Affects Versions: 1.21.0 >Reporter: Ran Cao >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > when I try to execute sql like (insert into example_table (column1,column2) > values (value1,value2),(value1,value2)), it will failed with error message > like this: column "EXPR$0" of relation "example_table" does not exist. I find > the reason is that when converting SqlNode(insert sql) to > RelNode(TableModify), one of the steps is to change the column that stored in > RelDataType from the fake column name (like "EXPR$0") to the real column name > (like "id"). But when the values part in sql is more than one , the step > above-mentioned will skip because the RelNode is instance of LogicalUnion > instead of Project, the code refered to org.apache.calcite.tools.RelBuilder > line 1461: > if (input instanceof Project && fieldNames != null) { > // change the column name > } > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3624) Getting java.lang.NullPointerException at org.apache.calcite.schema.Schemas.queryable(Schemas.java:211)
[ https://issues.apache.org/jira/browse/CALCITE-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002316#comment-17002316 ] Jin Xing commented on CALCITE-3624: --- Could you please format your code by annotation like below ? {code:java} some codes. {code} > Getting java.lang.NullPointerException at > org.apache.calcite.schema.Schemas.queryable(Schemas.java:211) > --- > > Key: CALCITE-3624 > URL: https://issues.apache.org/jira/browse/CALCITE-3624 > Project: Calcite > Issue Type: Bug >Reporter: anjali shrishrimal >Priority: Minor > > I am creating a RelNode using RelBuilder. > > @Test > public void testRun() throws Exception > { > FrameworkConfig frameworkConfig = config().build(); > final RelBuilder builder = RelBuilder.create(frameworkConfig); > String[] name = new String[]\{"ITEM", "ITEM"}; > RelNode scan = builder.scan(name).build(); > TranslatableTable items = > (TranslatableTable)frameworkConfig.getDefaultSchema().getSubSchema("ITEM").getTable("ITEM"); > scan = items.toRel(ViewExpanders.simpleContext(scan.getCluster()), > scan.getTable()); > > RelNode level0RelNode = > builder.push(scan) > .filter( > builder.call(SqlStdOperatorTable.AND, > builder.call(SqlStdOperatorTable.EQUALS, builder.field("ITEM_ID"), > builder.literal("AM_TestItem_21Aug17_0614")), > builder.call(SqlStdOperatorTable.EQUALS, builder.field("OBJECT_TYPE"), > builder.literal("Item")) > ) > ).build(); > > System.out.println(level0RelNode); > try (PreparedStatement preparedStatement = RelRunners.run(level0RelNode)) > { System.out.println(""); > System.out.println(preparedStatement.toString()); > System.out.println(""); > String s = CalciteAssert.toString(preparedStatement.executeQuery()); > System.out.println(""); > System.out.println(s); > } > } > > > public static Frameworks.ConfigBuilder config() > { > CalciteSchema rootSchema1 = CalciteSchema.createRootSchema(true); > rootSchema1 = rootSchema1.add("ITEM", new MySchema()); > /* MySchema consists of Map of MyTable which extends AbstractQueryableTable > and implements TranslatableTable */ > > return Frameworks.newConfigBuilder().defaultSchema(rootSchema1.plus()); > } > > > I am getting following exception while executing a query. > > java.sql.SQLException: exception while executing query: > nulljava.sql.SQLException: exception while executing query: null at > org.apache.calcite.avatica.Helper.createException(Helper.java:56) at > org.apache.calcite.avatica.Helper.createException(Helper.java:41) at > org.apache.calcite.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:577) > at > org.apache.calcite.avatica.AvaticaPreparedStatement.executeQuery(AvaticaPreparedStatement.java:137) > at org.apache.calcite.test.TeamcenterTest.testRun2(TeamcenterTest.java:372) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at > org.junit.runners.ParentRunner.run(ParentRunner.java:363) at > org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:86) > at > org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38) > at > org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:539) > at > org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:761) > at > org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:461) > at > org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main
[jira] [Commented] (CALCITE-3621) JDBC adapter can't push down sort to DB
[ https://issues.apache.org/jira/browse/CALCITE-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17001957#comment-17001957 ] Jin Xing commented on CALCITE-3621: --- # Why the change in [RelToSqlConverter.java|https://github.com/apache/calcite/pull/1680/files/3c0dd3f7cd750cd9af2035726a131464bbac229f#diff-79ccdc597ccf355912fea1a60683eea2] is necessary ? # How about update the change in JdbcSortRule as {code:java} final RelTraitSet inputTraitSet = sort.getInput().getTraitSet().replace(out); {code} > JDBC adapter can't push down sort to DB > --- > > Key: CALCITE-3621 > URL: https://issues.apache.org/jira/browse/CALCITE-3621 > Project: Calcite > Issue Type: Bug > Components: jdbc-adapter >Affects Versions: 1.21.0 >Reporter: Lei Jiang >Assignee: Lei Jiang >Priority: Major > Labels: pull-request-available > Fix For: 1.22.0 > > Time Spent: 20m > Remaining Estimate: 0h > > JDBC adapter can't push down sort to DB > {code:java} > select ename from scott.emp order by empno > {code} > {code:java} > PLAN=EnumerableSort(sort0=[$1], dir0=[ASC]) > JdbcToEnumerableConverter > JdbcProject(ENAME=[$1], EMPNO=[$0]) > JdbcTableScan(table=[[SCOTT, EMP]]) > {code} > It should be: > {code:java} > PLAN=JdbcToEnumerableConverter > JdbcSort(sort0=[$1], dir0=[ASC]) > JdbcProject(ENAME=[$1], EMPNO=[$0]) > JdbcTableScan(table=[[SCOTT, EMP]]) > {code} > I think the root cause is JdbcSortRule, it convert input's trait to "JDBC, > {color:#ff}[1]{color}". that is, input's relset will add a "JDBC, [1]" > subset. But there is nothing rule can convert that input to a rel with "JDBC, > {color:#ff}[1]{color}", so EnumerableSort win. > {code:java} > public RelNode convert(Sort sort, boolean convertInputTraits) { > final RelTraitSet traitSet = sort.getTraitSet().replace(out); > final RelNode input; > if (convertInputTraits) { > input = convert(sort.getInput(), traitSet); > } else { > input = sort.getInput(); > } > return new JdbcSort(sort.getCluster(), traitSet, > input, sort.getCollation(), sort.offset, sort.fetch); > } > {code} > This is my a part of change: convert input's trait to "JDBC, []" > {code:java} > public RelNode convert(Sort sort, boolean convertInputTraits) { > final RelTraitSet traitSet = sort.getTraitSet().replace(out); > //update > RelTraitSet inputTraitSet = traitSet.replace(RelCollations.EMPTY); > final RelNode input; > if (convertInputTraits) { > //update > input = convert(sort.getInput(), inputTraitSet); > } else { > input = sort.getInput(); > } > return new JdbcSort(sort.getCluster(), traitSet, > input, sort.getCollation(), sort.offset, sort.fetch); > } > {code} > I have updated some classes to reslove this issue > When JdbcSort is used, my solution will trigger a regression issue > JdbcTest#testSelfJoinDifferentColumns(): Project->Sort->Join(left.A, > right.A), Join.left and Join.right have same field A. My solution: select > list of Sort contains explicit column name instead of * > > {code:java} > RelToSqlConverter#visit(Sort e) > ... > Result x = visitChild(0, e.getInput()); > Builder builder = x.builder(e, Clause.ORDER_BY); > //my update > if (builder.select.getSelectList() == null) { > final List selectList = Expressions.list(); > for (RelDataTypeField field : e.getRowType().getFieldList()) { > addSelect(selectList, builder.context.field(field.getIndex()), > e.getRowType()); > } > builder.select.setSelectList(new SqlNodeList(selectList, POS)); > } > //end > ...{code} > Before: > SELECT "t2"."full_name", "t2"."last_name{color:#ff}0{color}" AS > "last_name" > FROM (SELECT {color:#ff}*{color} > FROM (SELECT "last_name" > FROM "foodmart"."employee") AS "t" > INNER JOIN (SELECT "full_name", "first_name", "last_name" > FROM "foodmart"."employee") AS "t0" ON "t"."last_name" = "t0"."first_name" > ORDER BY "t0"."last_name" NULLS LAST > After: > SELECT "t2"."full_name", "t2"."last_name0" AS "last_name" > FROM (SELECT {color:#ff}"t"."last_name", "t0"."full_name", > "t0"."first_name", "t0"."last_name" AS "last_name0"{color} > FROM (SELECT "last_name" > FROM "foodmart"."employee") AS "t" > INNER JOIN (SELECT "full_name", "first_name", "last_name" > FROM "foodmart"."employee") AS "t0" ON "t"."last_name" = "t0"."first_name" > ORDER BY "t0"."last_name" NULLS LAST > LIMIT 3) AS "t2" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000630#comment-17000630 ] Jin Xing commented on CALCITE-3387: --- Thanks for ping me, Chunwei ~ I just added minor comments. > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3609) calite for hive struct Ineffective optimization
[ https://issues.apache.org/jira/browse/CALCITE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999168#comment-16999168 ] Jin Xing commented on CALCITE-3609: --- Seems the patch is not against latest code base. This issue is already fixed by [http://issues.apache.org/jira/browse/CALCITE-2464] > calite for hive struct Ineffective optimization > > > Key: CALCITE-3609 > URL: https://issues.apache.org/jira/browse/CALCITE-3609 > Project: Calcite > Issue Type: Bug > Components: core >Affects Versions: 1.10.0, 1.16.0 >Reporter: hezhang >Priority: Blocker > Fix For: 1.16.0 > > Attachments: calcite-3608.patch > > > student table: > {code:java} > CREATE TABLE `student`( > `id` int, > `info` struct) > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > WITH SERDEPROPERTIES ( > 'colelction.delim'=':', > 'field.delim'=',', > 'serialization.format'=',', > 'serialization.null.format'='NULL') > {code} > the sql: > > {code:java} > select * from student where info is not null; > {code} > result: > > {code:java} > 1 {"name":"zhou","age":30} > 2 {"name":"yan","age":30} > 3 {"name":"chen","age":20} > 4 {"name":"li","age":80} > NULLNULL > NULL{"name":null,"age":null} > {code} > cause:calcite Ineffective optimization > {code:java} > HiveProject(id=[$0], info=[$1]) > HiveTableScan(table=[[default.student]], table:alias=[student]) > {code} > > after fix RelRecordType isNullable,the plan is > {code:java} > HiveProject(id=[$0], info=[$1]) > HiveFilter(condition=[IS NOT NULL($1)]) > HiveTableScan(table=[[default.student]], table:alias=[student]) > {code} > then the result is: > {code:java} > 1 {"name":"zhou","age":30} > 2 {"name":"yan","age":30} > 3 {"name":"chen","age":20} > 4 {"name":"li","age":80} > NULL{"name":null,"age":null} > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3495) RelDecorrelator generate plan with different semantics when handle Aggregate
[ https://issues.apache.org/jira/browse/CALCITE-3495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998820#comment-16998820 ] Jin Xing commented on CALCITE-3495: --- Rather than failing the decorrelation when COUNT, do we consider just disable decorrelation when subquery is an Aggregate? Because user have the flexibility to define UDAF, it's hard to predicate the Agg behavior or enumerate all types of Agg functions. Best, Jin > RelDecorrelator generate plan with different semantics when handle Aggregate > > > Key: CALCITE-3495 > URL: https://issues.apache.org/jira/browse/CALCITE-3495 > Project: Calcite > Issue Type: Bug >Reporter: Jin Xing >Assignee: Wang Yanlin >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Given below Sql and data > {code:java} > SELECT deptno FROM dept A where deptno in > (SELECT count(1) FROM emp B where A.deptno = B.deptno) > - > Data of dept: > deptno > 0 > 1 > > Data of emp: > deptno > 1{code} > From the Sql semantics by nest-loop join, we will expect the result as > {code:java} > Result: > deptno > 0 > 1{code} > However the decorrelated plan is as below: > {code:java} > LogicalProject(DEPTNO=[$0]) > LogicalJoin(condition=[=($0, $3)], joinType=[inner]) > LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) > LogicalFilter(condition=[=($1, $0)]) > LogicalProject(EXPR$0=[$1], DEPTNO=[$0]) > LogicalAggregate(group=[{0}], EXPR$0=[COUNT()]) > LogicalProject(DEPTNO=[$7], $f0=[1]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > {code} > The result of this plan is as below: > {code:java} > Result: > deptno > 1 > {code} > As we can see the two above results are different. > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (CALCITE-3606) batch insert failed
[ https://issues.apache.org/jira/browse/CALCITE-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16998815#comment-16998815 ] Jin Xing commented on CALCITE-3606: --- Hi, Ran Cao ~ Yes, seems current RelBuilder#projectNamed failed to rename the fields. If we run below tests {code:java} @Test public void testDEV() { final RelBuilder builder = RelBuilder.create(config().build()); RelNode rel = builder .values(new String[] {"x"}, 0) .values(new String[] {"y"}, 1) .union(true) .projectNamed( ImmutableList.of(builder.field(0)), ImmutableList.of("col0"), false) .build(); System.out.println(rel.getRowType()); System.out.println(RelOptUtil.toString(rel)); }{code} We get {code:java} RecordType(INTEGER x) LogicalUnion(all=[true]) LogicalValues(tuples=[[{ 0 }]]) LogicalValues(tuples=[[{ 1 }]]) {code} As we can see, the row type is not as expected. For the exception you mentioned – – _*column "EXPR$0" of relation "example_table" does not exist,*_ would you please elaborate by stacktrace or unit test? Best, Jin > batch insert failed > --- > > Key: CALCITE-3606 > URL: https://issues.apache.org/jira/browse/CALCITE-3606 > Project: Calcite > Issue Type: Wish > Components: core >Affects Versions: 1.21.0 >Reporter: Ran Cao >Priority: Major > > when I try to execute sql like (insert into example_table (column1,column2) > values (value1,value2),(value1,value2)), it will failed with error message > like this: column "EXPR$0" of relation "example_table" does not exist. I find > the reason is that when converting SqlNode(insert sql) to > RelNode(TableModify), one of the steps is to change the column that stored in > RelDataType from the fake column name (like "EXPR$0") to the real column name > (like "id"). But when the values part in sql is more than one , the step > above-mentioned will skip because the RelNode is instance of LogicalUnion > instead of Project, the code refered to org.apache.calcite.tools.RelBuilder > line 1461: > if (input instanceof Project && fieldNames != null) { > // change the column name > } > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (CALCITE-3387) Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is ambiguous" error
[ https://issues.apache.org/jira/browse/CALCITE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997997#comment-16997997 ] Jin Xing edited comment on CALCITE-3387 at 12/17/19 10:37 AM: -- Given tables like below {code:java} emp: empno deptno clubno dept: deptno business location: clubno sport {code} If user write sql (multi-join) like {code:java} select empno, deptno, clubno, business, sport from emp join dept using (deptno) join club using (clubno){code} What is correct behavior for Calcite ? It works find on Mysql and PostgreSql With PR-1655, it throws Exception for ambiguous column. Do we need to support such case ? was (Author: jinxing6...@126.com): Given tables like below {code:java} emp: empno deptno clubno dept: deptno business location: clubno sport {code} If user write sql (multi-join) like {code:java} select empno, deptno, clubno, business, sport from emp join dept using (deptno) join club using (clubno){code} What is correct behavior for Calcite ? It works find on Mysql With PR-1655, it throws Exception for ambiguous column. Do we need to support such case ? > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error > -- > > Key: CALCITE-3387 > URL: https://issues.apache.org/jira/browse/CALCITE-3387 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Chunwei Lei >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Query with GROUP BY and JOIN ... USING wrongly fails with "Column 'DEPTNO' is > ambiguous" error. Here is the query: > {code} > select deptno, count(*) > from emp > join dept using (deptno) > group by deptno; > {code} > Because of USING, the two deptno fields should be merged into one, and > therefore {{deptno}} is not ambiguous. That query works in Oracle. Also note > that > {code}select deptno > from emp > join dept using (deptno) > {code} > works correctly in Calcite, and has since CALCITE-2227 was fixed. -- This message was sent by Atlassian Jira (v8.3.4#803005)