Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1500185054 ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8270,6 +8270,75 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + /** Test case for + * https://issues.apache.org/jira/browse/CALCITE-6032;>[CALCITE-6032] + * NullPointerException in Reldecorrelator for a Multi level correlated subquery. */ + @Test void testMultiLevelDecorrelation() throws Exception { +String hsqldbMemUrl = "jdbc:hsqldb:mem:."; +Connection baseConnection = DriverManager.getConnection(hsqldbMemUrl); +Statement baseStmt = baseConnection.createStatement(); +baseStmt.execute("create table invoice (inv_id integer, col1\n" ++ "integer, inv_amt integer)"); +baseStmt.execute("create table item(item_id integer, item_amt\n" ++ "integer, item_col1 integer, item_col2 integer, item_col3\n" ++ "integer,item_col4 integer )"); +baseStmt.execute("INSERT INTO invoice VALUES (1, 1, 1)"); +baseStmt.execute("INSERT INTO invoice VALUES (2, 2, 2)"); +baseStmt.execute("INSERT INTO invoice VALUES (3, 3, 3)"); +baseStmt.execute("INSERT INTO item values (1, 1, 1, 1, 1, 1)"); +baseStmt.execute("INSERT INTO item values (2, 2, 2, 2, 2, 2)"); +baseStmt.close(); +baseConnection.commit(); + +Properties info = new Properties(); +info.put("model", +"inline:" ++ "{\n" ++ " version: '1.0',\n" ++ " defaultSchema: 'BASEJDBC',\n" ++ " schemas: [\n" ++ " {\n" ++ " type: 'jdbc',\n" ++ " name: 'BASEJDBC',\n" ++ " jdbcDriver: '" + jdbcDriver.class.getName() + "',\n" ++ " jdbcUrl: '" + hsqldbMemUrl + "',\n" ++ " jdbcCatalog: null,\n" ++ " jdbcSchema: null\n" ++ " }\n" ++ " ]\n" ++ "}"); + +Connection calciteConnection = +DriverManager.getConnection("jdbc:calcite:", info); + +String statement = "SELECT Sum(invoice.inv_amt * (\n" Review Comment: Thanks @jinfengni I will look into CALCITE-5390 to understand a little bit more on the difference. ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8270,6 +8270,75 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + /** Test case for + * https://issues.apache.org/jira/browse/CALCITE-6032;>[CALCITE-6032] + * NullPointerException in Reldecorrelator for a Multi level correlated subquery. */ + @Test void testMultiLevelDecorrelation() throws Exception { +String hsqldbMemUrl = "jdbc:hsqldb:mem:."; +Connection baseConnection = DriverManager.getConnection(hsqldbMemUrl); +Statement baseStmt = baseConnection.createStatement(); +baseStmt.execute("create table invoice (inv_id integer, col1\n" ++ "integer, inv_amt integer)"); +baseStmt.execute("create table item(item_id integer, item_amt\n" ++ "integer, item_col1 integer, item_col2 integer, item_col3\n" ++ "integer,item_col4 integer )"); +baseStmt.execute("INSERT INTO invoice VALUES (1, 1, 1)"); +baseStmt.execute("INSERT INTO invoice VALUES (2, 2, 2)"); +baseStmt.execute("INSERT INTO invoice VALUES (3, 3, 3)"); +baseStmt.execute("INSERT INTO item values (1, 1, 1, 1, 1, 1)"); +baseStmt.execute("INSERT INTO item values (2, 2, 2, 2, 2, 2)"); +baseStmt.close(); +baseConnection.commit(); + +Properties info = new Properties(); +info.put("model", +"inline:" ++ "{\n" ++ " version: '1.0',\n" ++ " defaultSchema: 'BASEJDBC',\n" ++ " schemas: [\n" ++ " {\n" ++ " type: 'jdbc',\n" ++ " name: 'BASEJDBC',\n" ++ " jdbcDriver: '" + jdbcDriver.class.getName() + "',\n" ++ " jdbcUrl: '" + hsqldbMemUrl + "',\n" ++ " jdbcCatalog: null,\n" ++ " jdbcSchema: null\n" ++ " }\n" ++ " ]\n" ++ "}"); + +Connection calciteConnection = +DriverManager.getConnection("jdbc:calcite:", info); + +String statement = "SELECT Sum(invoice.inv_amt * (\n" Review Comment: Thanks @jinfengni I will look into CALCITE-5390 to understand a little bit more on the difference between these issues. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
jinfengni commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1499859569 ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8270,6 +8270,75 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + /** Test case for + * https://issues.apache.org/jira/browse/CALCITE-6032;>[CALCITE-6032] + * NullPointerException in Reldecorrelator for a Multi level correlated subquery. */ + @Test void testMultiLevelDecorrelation() throws Exception { +String hsqldbMemUrl = "jdbc:hsqldb:mem:."; +Connection baseConnection = DriverManager.getConnection(hsqldbMemUrl); +Statement baseStmt = baseConnection.createStatement(); +baseStmt.execute("create table invoice (inv_id integer, col1\n" ++ "integer, inv_amt integer)"); +baseStmt.execute("create table item(item_id integer, item_amt\n" ++ "integer, item_col1 integer, item_col2 integer, item_col3\n" ++ "integer,item_col4 integer )"); +baseStmt.execute("INSERT INTO invoice VALUES (1, 1, 1)"); +baseStmt.execute("INSERT INTO invoice VALUES (2, 2, 2)"); +baseStmt.execute("INSERT INTO invoice VALUES (3, 3, 3)"); +baseStmt.execute("INSERT INTO item values (1, 1, 1, 1, 1, 1)"); +baseStmt.execute("INSERT INTO item values (2, 2, 2, 2, 2, 2)"); +baseStmt.close(); +baseConnection.commit(); + +Properties info = new Properties(); +info.put("model", +"inline:" ++ "{\n" ++ " version: '1.0',\n" ++ " defaultSchema: 'BASEJDBC',\n" ++ " schemas: [\n" ++ " {\n" ++ " type: 'jdbc',\n" ++ " name: 'BASEJDBC',\n" ++ " jdbcDriver: '" + jdbcDriver.class.getName() + "',\n" ++ " jdbcUrl: '" + hsqldbMemUrl + "',\n" ++ " jdbcCatalog: null,\n" ++ " jdbcSchema: null\n" ++ " }\n" ++ " ]\n" ++ "}"); + +Connection calciteConnection = +DriverManager.getConnection("jdbc:calcite:", info); + +String statement = "SELECT Sum(invoice.inv_amt * (\n" Review Comment: My view is it would be better to use a "minimal" query to reproduce the issue, and verify the fix, in particular in unit testcases. Adding more components in the query will make it hard to understand what kind of features this query uses that actually hit the issue. For instance, the following simplified query would hit the same stack trace: ``` SELECT SUM(p."min_scale" * ( select max(e."salary") from FOODMART."employee" e where e."position_id" = p."position_id" AND e."supervisor_id" = (select MAX(e2."supervisor_id") from FOODMART."employee" e2 WHERE e."position_id" <= p."position_id") )) from FOODMART."position" p; Caused by: java.lang.NullPointerException: cm.mapCorToCorRel.get($cor2) at java.base/java.util.Objects.requireNonNull(Objects.java:336) at org.apache.calcite.sql2rel.RelDecorrelator.getCorRel(RelDecorrelator.java:929) at org.apache.calcite.sql2rel.RelDecorrelator.createValueGenerator(RelDecorrelator.java:818) at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateInputWithValueGenerator(RelDecorrelator.java:1029) at org.apache.calcite.sql2rel.RelDecorrelator.maybeAddValueGenerator(RelDecorrelator.java:948) at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:1154) at org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:1120) ``` So, the issue would raise when mult-level nested subquery is referring to a column in outer query block. This may explain why the issue may be different from CALCITE-5390, where it uses a one-level nested subquery. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu merged PR #3640: URL: https://github.com/apache/calcite/pull/3640 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1499616372 ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8270,6 +8270,75 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + /** Test case for + * https://issues.apache.org/jira/browse/CALCITE-6032;>[CALCITE-6032] + * NullPointerException in Reldecorrelator for a Multi level correlated subquery. */ + @Test void testMultiLevelDecorrelation() throws Exception { +String hsqldbMemUrl = "jdbc:hsqldb:mem:."; +Connection baseConnection = DriverManager.getConnection(hsqldbMemUrl); +Statement baseStmt = baseConnection.createStatement(); +baseStmt.execute("create table invoice (inv_id integer, col1\n" ++ "integer, inv_amt integer)"); +baseStmt.execute("create table item(item_id integer, item_amt\n" ++ "integer, item_col1 integer, item_col2 integer, item_col3\n" ++ "integer,item_col4 integer )"); +baseStmt.execute("INSERT INTO invoice VALUES (1, 1, 1)"); +baseStmt.execute("INSERT INTO invoice VALUES (2, 2, 2)"); +baseStmt.execute("INSERT INTO invoice VALUES (3, 3, 3)"); +baseStmt.execute("INSERT INTO item values (1, 1, 1, 1, 1, 1)"); +baseStmt.execute("INSERT INTO item values (2, 2, 2, 2, 2, 2)"); +baseStmt.close(); +baseConnection.commit(); + +Properties info = new Properties(); +info.put("model", +"inline:" ++ "{\n" ++ " version: '1.0',\n" ++ " defaultSchema: 'BASEJDBC',\n" ++ " schemas: [\n" ++ " {\n" ++ " type: 'jdbc',\n" ++ " name: 'BASEJDBC',\n" ++ " jdbcDriver: '" + jdbcDriver.class.getName() + "',\n" ++ " jdbcUrl: '" + hsqldbMemUrl + "',\n" ++ " jdbcCatalog: null,\n" ++ " jdbcSchema: null\n" ++ " }\n" ++ " ]\n" ++ "}"); + +Connection calciteConnection = +DriverManager.getConnection("jdbc:calcite:", info); + +String statement = "SELECT Sum(invoice.inv_amt * (\n" Review Comment: Thank you, @jinfengni, for your review. While it's possible to reproduce the issue with the current schema, I opted to employ the same test case to ensure its resolution. Please inform me if you'd like to include this test case alongside the one I've added (although it may be redundant). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
jinfengni commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1499590342 ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8270,6 +8270,75 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + /** Test case for + * https://issues.apache.org/jira/browse/CALCITE-6032;>[CALCITE-6032] + * NullPointerException in Reldecorrelator for a Multi level correlated subquery. */ + @Test void testMultiLevelDecorrelation() throws Exception { +String hsqldbMemUrl = "jdbc:hsqldb:mem:."; +Connection baseConnection = DriverManager.getConnection(hsqldbMemUrl); +Statement baseStmt = baseConnection.createStatement(); +baseStmt.execute("create table invoice (inv_id integer, col1\n" ++ "integer, inv_amt integer)"); +baseStmt.execute("create table item(item_id integer, item_amt\n" ++ "integer, item_col1 integer, item_col2 integer, item_col3\n" ++ "integer,item_col4 integer )"); +baseStmt.execute("INSERT INTO invoice VALUES (1, 1, 1)"); +baseStmt.execute("INSERT INTO invoice VALUES (2, 2, 2)"); +baseStmt.execute("INSERT INTO invoice VALUES (3, 3, 3)"); +baseStmt.execute("INSERT INTO item values (1, 1, 1, 1, 1, 1)"); +baseStmt.execute("INSERT INTO item values (2, 2, 2, 2, 2, 2)"); +baseStmt.close(); +baseConnection.commit(); + +Properties info = new Properties(); +info.put("model", +"inline:" ++ "{\n" ++ " version: '1.0',\n" ++ " defaultSchema: 'BASEJDBC',\n" ++ " schemas: [\n" ++ " {\n" ++ " type: 'jdbc',\n" ++ " name: 'BASEJDBC',\n" ++ " jdbcDriver: '" + jdbcDriver.class.getName() + "',\n" ++ " jdbcUrl: '" + hsqldbMemUrl + "',\n" ++ " jdbcCatalog: null,\n" ++ " jdbcSchema: null\n" ++ " }\n" ++ " ]\n" ++ "}"); + +Connection calciteConnection = +DriverManager.getConnection("jdbc:calcite:", info); + +String statement = "SELECT Sum(invoice.inv_amt * (\n" Review Comment: Can we simply the query to reproduce the issue? Remove the unnecessary part that is not related to de-correlation logic. I tried the following query, using `foodmart`, and seems it hit the similar stack trace. ``` SELECT SUM(p."min_scale" * ( select max(e."salary" + e."salary") from FOODMART."employee" e where e."position_id" = p."position_id" AND e."supervisor_id" = (select MAX(e2."supervisor_id") from FOODMART."employee" e2 WHERE e."position_id" <= p."position_id" AND e."last_name" = e2."last_name") )), count(*) as cnt from FOODMART."position" p; ``` ``` Error: Error while executing SQL "SELECT SUM(p."min_scale" * ( select max(e."salary" + e."salary") from FOODMART."employee" e where e."position_id" = p."position_id" AND e."supervisor_id" = (select MAX(e2."supervisor_id") from FOODMART."employee" e2 WHERE e."position_id" <= p."position_id" AND e."last_name" = e2."last_name") )), count(*) as cnt from FOODMART."position" p": cm.mapCorToCorRel.get($cor2) (state=,code=0) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
jinfengni commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1499577112 ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8270,6 +8270,75 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + /** Test case for + * https://issues.apache.org/jira/browse/CALCITE-6032;>[CALCITE-6032] + * NullPointerException in Reldecorrelator for a Multi level correlated subquery. */ + @Test void testMultiLevelDecorrelation() throws Exception { +String hsqldbMemUrl = "jdbc:hsqldb:mem:."; +Connection baseConnection = DriverManager.getConnection(hsqldbMemUrl); +Statement baseStmt = baseConnection.createStatement(); +baseStmt.execute("create table invoice (inv_id integer, col1\n" ++ "integer, inv_amt integer)"); +baseStmt.execute("create table item(item_id integer, item_amt\n" Review Comment: Will it be better to use some existing table/schemas to reproduce this problem? For instance, the `hr` or `foodmart` schema? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
sonarcloud[bot] commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1958997551 ## [![Quality Gate Passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/qg-passed-20px.png 'Quality Gate Passed')](https://sonarcloud.io/dashboard?id=apache_calcite=3640) **Quality Gate passed** Issues ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [9 New issues](https://sonarcloud.io/project/issues?id=apache_calcite=3640=false=true) Measures ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_calcite=3640=false=true) ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [98.2% Coverage on New Code](https://sonarcloud.io/component_measures?id=apache_calcite=3640=new_coverage=list) ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [0.0% Duplication on New Code](https://sonarcloud.io/component_measures?id=apache_calcite=3640=new_duplicated_lines_density=list) [See analysis details on SonarCloud](https://sonarcloud.io/dashboard?id=apache_calcite=3640) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1958706633 > I still have one more improvement suggestion. But you can also probably also squash the commits in preparation for merging. Thanks. Fixed the javadoc comment and squashed and re-based to latest. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1498027961 ## core/src/main/java/org/apache/calcite/plan/RelOptUtil.java: ## @@ -286,6 +286,22 @@ public static Set getVariablesUsed(RelNode rel) { return visitor.vuv.variables; } + /** + * Returns a set of variables used by the given list of sub-queries and its descendants. + * + * Internally it uses the {@link org.apache.calcite.plan.RelOptUtil#getVariablesUsed} + * to get all the variables used in the relational expression of a given sub-query. + * + * The item type is same as {@link org.apache.calcite.rex.RexCorrelVariable#id}. Review Comment: Javadoc can document type parameters, use that format: https://www.baeldung.com/java-javadoc-generic-type-parameters -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
sonarcloud[bot] commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1956009772 ## [![Quality Gate Passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/qg-passed-20px.png 'Quality Gate Passed')](https://sonarcloud.io/dashboard?id=apache_calcite=3640) **Quality Gate passed** Issues ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [8 New issues](https://sonarcloud.io/project/issues?id=apache_calcite=3640=false=true) Measures ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_calcite=3640=false=true) ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [98.2% Coverage on New Code](https://sonarcloud.io/component_measures?id=apache_calcite=3640=new_coverage=list) ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [0.0% Duplication on New Code](https://sonarcloud.io/component_measures?id=apache_calcite=3640=new_duplicated_lines_density=list) [See analysis details on SonarCloud](https://sonarcloud.io/dashboard?id=apache_calcite=3640) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1955934690 > This looks good, I left a few comments. I am approving, hoping that you will address them. If anyone else wants to review this, it's great, otherwise we should plan to merge it. Perhaps you can wait a couple more days before squashing the commits. Thanks for the review. I fixed the review comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1496913804 ## core/src/main/java/org/apache/calcite/plan/RelOptUtil.java: ## @@ -286,6 +286,22 @@ public static Set getVariablesUsed(RelNode rel) { return visitor.vuv.variables; } + /** + * Returns a set of variables used by the given list of sub-queries and its descendants. + * + * Internally it uses the {@link org.apache.calcite.plan.RelOptUtil#getVariablesUsed} + * to get all the variables used in the relational expression of a given sub-query. + * + * The item type is same as {@link org.apache.calcite.rex.RexCorrelVariable#id}. Review Comment: I used the same comment format as that of getVariablesUsed(RelNode rel) as both the functions return same output, In this context item type is a type parameter for Set<> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1496781556 ## core/src/main/java/org/apache/calcite/plan/RelOptUtil.java: ## @@ -286,6 +286,22 @@ public static Set getVariablesUsed(RelNode rel) { return visitor.vuv.variables; } + /** + * Returns a set of variables used by the given list of sub-queries and its descendants. + * + * Internally it uses the {@link org.apache.calcite.plan.RelOptUtil#getVariablesUsed} Review Comment: In general I would refrain from describing *implementation* in JavaDoc, unless it is very relevant. JavaDoc is about what the function does, not how. This comment can be a regular comment inside the function, although the function is pretty clear as is. ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8217,6 +8217,75 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + /** Test case for + * https://issues.apache.org/jira/browse/CALCITE-6032;>[CALCITE-6032] + * NullPointerException in Reldecorrelator for a Multi level correlated subquery. */ + @Test void testMultiLevelDecorrelation() throws Exception { +String hsqldbMemUrl = "jdbc:hsqldb:mem:."; +Connection baseConnection = DriverManager.getConnection(hsqldbMemUrl); +Statement baseStmt = baseConnection.createStatement(); +baseStmt.execute("create table invoice (c5633_485 integer, c5633_503\n" Review Comment: this code looks like it's machine generated. This makes it hard to read by a human. ## core/src/main/java/org/apache/calcite/plan/RelOptUtil.java: ## @@ -286,6 +286,22 @@ public static Set getVariablesUsed(RelNode rel) { return visitor.vuv.variables; } + /** + * Returns a set of variables used by the given list of sub-queries and its descendants. + * + * Internally it uses the {@link org.apache.calcite.plan.RelOptUtil#getVariablesUsed} + * to get all the variables used in the relational expression of a given sub-query. + * + * The item type is same as {@link org.apache.calcite.rex.RexCorrelVariable#id}. Review Comment: I really don't know what item you are referring to here. ## core/src/main/java/org/apache/calcite/plan/RelOptUtil.java: ## @@ -286,6 +286,22 @@ public static Set getVariablesUsed(RelNode rel) { return visitor.vuv.variables; } + /** + * Returns a set of variables used by the given list of sub-queries and its descendants. Review Comment: "a" set is imprecise. I would say "the set" -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1495188308 ## core/src/test/java/org/apache/calcite/test/enumerable/EnumerableCorrelateTest.java: ## @@ -142,10 +142,10 @@ class EnumerableCorrelateTest { tester(false, new HrSchema()) .query( "select empid, name from emps e where exists (select 1 from depts d where d.deptno=e.deptno)") -.explainContains("" -+ "EnumerableCalc(expr#0..5=[{inputs}], empid=[$t0], name=[$t2])\n" -+ " EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}])\n" -+ "EnumerableTableScan(table=[[s, emps]])\n" +.explainContains("EnumerableCalc(expr#0..3=[{inputs}], empid=[$t0], name=[$t2])\n " Review Comment: Great, thank you, I was only looking at the last commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1495187202 ## core/src/test/java/org/apache/calcite/test/enumerable/EnumerableCorrelateTest.java: ## @@ -142,10 +142,10 @@ class EnumerableCorrelateTest { tester(false, new HrSchema()) .query( "select empid, name from emps e where exists (select 1 from depts d where d.deptno=e.deptno)") -.explainContains("" -+ "EnumerableCalc(expr#0..5=[{inputs}], empid=[$t0], name=[$t2])\n" -+ " EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}])\n" -+ "EnumerableTableScan(table=[[s, emps]])\n" +.explainContains("EnumerableCalc(expr#0..3=[{inputs}], empid=[$t0], name=[$t2])\n " Review Comment: Sorry, I assumed that you have looked at the new test case added in JdbcTest (testMultiLevelDecorrelation). If there is no RelFieldTrimmer fix then this new testcase fails with same exception as reported in the JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
sonarcloud[bot] commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1953398744 ## [![Quality Gate Passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/qg-passed-20px.png 'Quality Gate Passed')](https://sonarcloud.io/dashboard?id=apache_calcite=3640) **Quality Gate passed** Issues ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [8 New issues](https://sonarcloud.io/project/issues?id=apache_calcite=3640=false=true) Measures ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_calcite=3640=false=true) ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [98.2% Coverage on New Code](https://sonarcloud.io/component_measures?id=apache_calcite=3640=new_coverage=list) ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [0.0% Duplication on New Code](https://sonarcloud.io/component_measures?id=apache_calcite=3640=new_duplicated_lines_density=list) [See analysis details on SonarCloud](https://sonarcloud.io/dashboard?id=apache_calcite=3640) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1495174211 ## core/src/test/java/org/apache/calcite/test/enumerable/EnumerableCorrelateTest.java: ## @@ -142,10 +142,10 @@ class EnumerableCorrelateTest { tester(false, new HrSchema()) .query( "select empid, name from emps e where exists (select 1 from depts d where d.deptno=e.deptno)") -.explainContains("" -+ "EnumerableCalc(expr#0..5=[{inputs}], empid=[$t0], name=[$t2])\n" -+ " EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}])\n" -+ "EnumerableTableScan(table=[[s, emps]])\n" +.explainContains("EnumerableCalc(expr#0..3=[{inputs}], empid=[$t0], name=[$t2])\n " Review Comment: The second half of my question stands. If the RelFieldTrimmer was wrong, is there a test case where you can show the difference? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1495172973 ## core/src/test/java/org/apache/calcite/test/enumerable/EnumerableCorrelateTest.java: ## @@ -142,10 +142,10 @@ class EnumerableCorrelateTest { tester(false, new HrSchema()) .query( "select empid, name from emps e where exists (select 1 from depts d where d.deptno=e.deptno)") -.explainContains("" -+ "EnumerableCalc(expr#0..5=[{inputs}], empid=[$t0], name=[$t2])\n" -+ " EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}])\n" -+ "EnumerableTableScan(table=[[s, emps]])\n" +.explainContains("EnumerableCalc(expr#0..3=[{inputs}], empid=[$t0], name=[$t2])\n " Review Comment: There is no change in the plan, it was just an indentation fix. I have reverted it in the latest commit just to avoid any confusion. Thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1495155596 ## core/src/test/java/org/apache/calcite/test/enumerable/EnumerableCorrelateTest.java: ## @@ -142,10 +142,10 @@ class EnumerableCorrelateTest { tester(false, new HrSchema()) .query( "select empid, name from emps e where exists (select 1 from depts d where d.deptno=e.deptno)") -.explainContains("" -+ "EnumerableCalc(expr#0..5=[{inputs}], empid=[$t0], name=[$t2])\n" -+ " EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}])\n" -+ "EnumerableTableScan(table=[[s, emps]])\n" +.explainContains("EnumerableCalc(expr#0..3=[{inputs}], empid=[$t0], name=[$t2])\n " Review Comment: was the previous plan incorrect? If not, can you provide a test case for which the previous plan was incorrect, but it is now correct? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1937901907 > Thanks for your patience and the thorough explanation @HanumathRao , I really appreciate it. > > I understand a bit more the situation now... and I beg to differ. IMHO fix 2 is just "sweeping the problem under the rug", we could avoid the problem in `Prepare.java`, but the underlying issue still remains. Plus, before removing this [Chesterton's fence](https://en.wiktionary.org/wiki/Chesterton%27s_fence), we would need to understand the reasons why field trimming was included in there in the first place. Also, removing the trimming can have undesired consequences (e.g. some projects might have fields which are expensive to retrieve/compute, so they should only be part of the final plan if they are really used, otherwise they should be trimmed; so this change would lead to a performance regression). > > I think fix 1 is the right way to go (although I understand it will be harder). Field trimming is quite a powerful tool (not without some bugs), which can be used by downstream projects (either via SqlToRelConverter#trimUnusedFields or directly via RelFieldTrimmer, since both are public); so if we find a bug on it, I think it would be better to try to fix it. > Thanks for your patience and the thorough explanation @HanumathRao , I really appreciate it. > > I understand a bit more the situation now... and I beg to differ. IMHO fix 2 is just "sweeping the problem under the rug", we could avoid the problem in `Prepare.java`, but the underlying issue still remains. Plus, before removing this [Chesterton's fence](https://en.wiktionary.org/wiki/Chesterton%27s_fence), we would need to understand the reasons why field trimming was included in there in the first place. Also, removing the trimming can have undesired consequences (e.g. some projects might have fields which are expensive to retrieve/compute, so they should only be part of the final plan if they are really used, otherwise they should be trimmed; so this change would lead to a performance regression). > > I think fix 1 is the right way to go (although I understand it will be harder). Field trimming is quite a powerful tool (not without some bugs), which can be used by downstream projects (either via SqlToRelConverter#trimUnusedFields or directly via RelFieldTrimmer, since both are public); so if we find a bug on it, I think it would be better to try to fix it. Thank you, @rubenada, for your comments. I have addressed the issue in RelFieldTrimmer to ensure that correlated variables in the subqueries are not trimmed. Nevertheless, I maintain my stance that executing RelFieldTrimmer in this context might be unnecessary. Since RelFieldTrimmer is run as a separate program after the SubQueryRemoval Rule ([link](https://github.com/apache/calcite/blob/8fd9518887302af43e7e74cdb155474dcebfc9f5/core/src/main/java/org/apache/calcite/tools/Programs.java#L290)), calling it in Prepare.java may not yield any benefits. Kindly review the changes at your earliest convenience. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
sonarcloud[bot] commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1937859542 ## [![Quality Gate Passed](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/checks/QualityGateBadge/qg-passed-20px.png 'Quality Gate Passed')](https://sonarcloud.io/dashboard?id=apache_calcite=3640) **Quality Gate passed** Issues ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [9 New issues](https://sonarcloud.io/project/issues?id=apache_calcite=3640=false=true) Measures ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_calcite=3640=false=true) ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [98.2% Coverage on New Code](https://sonarcloud.io/component_measures?id=apache_calcite=3640=new_coverage=list) ![](https://sonarsource.github.io/sonarcloud-github-static-resources/v2/common/passed-16px.png '') [0.0% Duplication on New Code](https://sonarcloud.io/component_measures?id=apache_calcite=3640=new_duplicated_lines_density=list) [See analysis details on SonarCloud](https://sonarcloud.io/dashboard?id=apache_calcite=3640) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
rubenada commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1935544748 Thanks for your patience and the thorough explanation @HanumathRao , I really appreciate it. I understand a bit more the situation now... and I beg to differ. IMHO fix 2 is just "sweeping the problem under the rug", we could avoid the problem in `Prepare.java`, but the underlying issue still remains. Plus, before removing this [Chesterton's fence](https://en.wiktionary.org/wiki/Chesterton%27s_fence), we would need to understand the reasons why field trimming was included in there in the first place. Also, removing the trimming can have undesired consequences (e.g. some projects might have fields which are expensive to retrieve/compute, so they should only be part of the final plan if they are really used, otherwise they should be trimmed; so this change would lead to a performance regression). I think fix 1 is the right way to go (although I understand it will be harder). Field trimming is quite a powerful tool (not without some bugs), which can be used by downstream projects (either via SqlToRelConverter#trimUnusedFields or directly via RelFieldTrimmer, since both are public); so if we find a bug on it, I think it would be better to try to fix it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1935259456 > > The trimUnusedField ,in the prepareSql function call path, is called conditionally based on the configHolder's isTrimUnusedFields flag and due to this it results in **trimUnusedFields getting called before decorrelateQuery** is called. > > Maybe I'm missing something, but isn't `decorrelateQuery` called before `trimUnusedFields` in `prepareSql` ? (so either we trim after decorrelate, or we don't trim at all, depending on the configHolder) > > ``` > if (this.context.config().forceDecorrelate()) { > // Sub-query decorrelation. > root = root.withRel(decorrelate(sqlToRelConverter, sqlQuery, root.rel)); > } > > if (configHolder.get().isTrimUnusedFields()) { > // Trim unused fields. > root = trimUnusedFields(root); > > Hook.TRIMMED.run(root.rel); > } > ``` Yeah, your observation is right, and there's a subtle issue that requires attention in this case. Before the decorrelate function is invoked, the plan (for the added test in the PR) appears as follows: ``` LogicalAggregate(group=[{}], INVAMOUNT=[SUM($0)], INVCOUNT=[COUNT()]) LogicalProject($f0=[*($2, $SCALAR_QUERY({ LogicalAggregate(group=[{}], EXPR$0=[MAX($0)]) LogicalProject($f0=[+($0, $1)]) LogicalFilter(condition=[AND(IS NOT NULL($2), IS NOT NULL($3), =($4, $cor0.C5633_503), =($5, $SCALAR_QUERY({ LogicalAggregate(group=[{}], EXPR$0=[MAX($0)]) LogicalProject(C5633_2186=[$5]) LogicalFilter(condition=[AND(=($4, $cor1.C5633_2187), =($2, $cor1.C5633_2184), =($3, $cor1.C5633_2185), <=($5, $cor0.C5633_485))]) JdbcTableScan(table=[[BASEJDBC, T1704]]) })))], variablesSet=[[$cor1]]) JdbcTableScan(table=[[BASEJDBC, T1704]]) }))]) LogicalFilter(condition=[AND(<($2, 10), >($2, 0))]) JdbcTableScan(table=[[BASEJDBC, INVOICE]]) ``` The decorrelate function before the trimFields operation is essentially a no-op. This is because decorrelation only occurs when a Correlate node is created, which is determined by a triggering check. ``` if (!corelMap.hasCorrelation()) { return rootRel; } ``` The hasCorrelation function returns true only if mapCorToCorRel is nonEmpty. However, in this scenario, mapCorToCorRel is empty because no Correlate node exists. Therefore, the decorrelate function essentially does nothing. However, after the trimUnusedFields operation is executed on the tree, it prunes certain fields that are essential for correlated variables. Consequently, a new node, LogicalProject(C5633_505=[$2]), is introduced. ``` LogicalAggregate(group=[{}], INVAMOUNT=[SUM($0)], INVCOUNT=[COUNT()]) LogicalProject($f0=[*($0, $SCALAR_QUERY({ LogicalAggregate(group=[{}], EXPR$0=[MAX($0)]) LogicalProject($f0=[+($0, $1)]) LogicalFilter(condition=[AND(IS NOT NULL($2), IS NOT NULL($3), =($4, $cor0.C5633_503), =($5, $SCALAR_QUERY({ LogicalAggregate(group=[{}], EXPR$0=[MAX($0)]) LogicalProject(C5633_2186=[$5]) LogicalFilter(condition=[AND(=($4, $cor1.C5633_2187), =($2, $cor1.C5633_2184), =($3, $cor1.C5633_2185), <=($5, $cor0.C5633_485))]) JdbcTableScan(table=[[BASEJDBC, T1704]]) })))], variablesSet=[[$cor1]]) JdbcTableScan(table=[[BASEJDBC, T1704]]) }))]) LogicalFilter(condition=[SEARCH($0, Sarg[(0..10)])]) LogicalProject(C5633_505=[$2]). --- a new node introduced by trimUnusedFields JdbcTableScan(table=[[BASEJDBC, INVOICE]]) ``` The subsequent optimization of this tree leads to an exception because the tree is no longer correct. This issue is specific to the prepareSql path and is not observed during the plannerImpl rel path. [link](https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/prepare/PlannerImpl.java#L265). Two potential solutions have been identified: 1. Fix the RelTrimmer to avoid generating the problematic node. 2. Ensure consistency between both code paths by refraining from calling the RelTrimmer in the prepareSql path. I chose the second option, as it seems logical to call the RelTrimmer only after the SubQueryRemovalRule has been applied. Please let me know if it is clear and thank you for reviewing the changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
rubenada commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1934283981 > The trimUnusedField ,in the prepareSql function call path, is called conditionally based on the configHolder's isTrimUnusedFields flag and due to this it results in **trimUnusedFields getting called before decorrelateQuery** is called. Maybe I'm missing something, but isn't `decorrelateQuery` called before `trimUnusedFields` in `prepareSql` ? (so either we trim after decorrelate, or we don't trim at all, depending on the configHolder) ``` if (this.context.config().forceDecorrelate()) { // Sub-query decorrelation. root = root.withRel(decorrelate(sqlToRelConverter, sqlQuery, root.rel)); } if (configHolder.get().isTrimUnusedFields()) { // Trim unused fields. root = trimUnusedFields(root); Hook.TRIMMED.run(root.rel); } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1932905283 > @HanumathRao could you please elaborate why we need in this PR the changes in `CassandraSchema.java` and `Prepare.java`? Is is related to the decorrelation problem, or could they be part of a separate issue? 1. Changes to Prepare.java file, Yes in a subtle way. The trimUnusedField ,in the prepareSql function call path, is called conditionally based on the configHolder's isTrimUnusedFields flag and due to this it results in trimUnusedFields getting called before decorrelateQuery is called. However, the same functionality in PlannerImpl.rel code path (most calcite users seem to use it), trimUnusedFields is set to false and this makes sure the decorrelateQuery is called before the trimUnusedFields. In both the cases the RelFieldTrimmer program is called in the optimize phase of the query, so essentially fields will get trimmed, but in the case of Prepare (which I believe is less used by users) doing so will result in a tree which will cause other issues due to lack of comprehensive handling of correlate subqueries. Essentially I am trying to make both the paths to behave same. 2. The changes in CassandraSchema.java are done because earlier CassandraSchema Test code loaded the materializations using TRIMMED Hook. As this code path is no more available, I changed it to load on CONVERTED Hook. This way the materializations are available for the optimizations, otherwise the materializations are not loaded and MaterializedView optimization is skipped. Hope the above description made it clear. Please let me know if you have any further questions/suggestions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
rubenada commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1932326838 @HanumathRao could you please elaborate why we need in this PR the changes in `CassandraSchema.java` and `Prepare.java`? Is is related to the decorrelation problem, or could they be part of a separate issue? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1924399074 > @HanumathRao I'll try to take a look as soon as I have a bit of time. In the meanwhile, could you please check if your patch also fixes the cases mentioned in [CALCITE-5390](https://issues.apache.org/jira/browse/CALCITE-5390) ? (also NPE related to decorrelation). If it doesn't, no problem, I'd like just to confirm if [CALCITE-6032](https://issues.apache.org/jira/browse/CALCITE-6032) and [CALCITE-5390](https://issues.apache.org/jira/browse/CALCITE-5390) are / aren't due to same root cause. Thanks @rubenada for the reply. I tried out the query in CALCITE-5390 (https://issues.apache.org/jira/browse/CALCITE-5390) and they are not the same root cause. I can take a look at it later (post this JIRA merge), if no one is working on it. thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
rubenada commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1924192157 @HanumathRao I'll try to take a look as soon as I have a bit of time. In the meanwhile, could you please check if your patch also fixes the cases mentioned in CALCITE-5390 ? (also NPE related to decorrelation). If it doesn't, no problem, I'd like just to confirm if CALCITE-6032 and CALCITE-5390 are / aren't due to same root cause. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1924164888 @mihaibudiu, Please let me know if you have any further questions or clarifications on these changes. @libenchao and @rubenada, a friendly reminder to review the PR changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1907201003 @mihaibudiu, Please let me know if you have any further questions or clarifications on these changes. @libenchao and @rubenada, a friendly reminder to review the PR changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1901480077 @mihaibudiu Appreciate the prompt review. I've addressed the testcase by including a Javadoc comment explaining the failure. Additionally, I've added comments in the code where relevant. @libenchao and @rubenada, your review of the proposed fix would be greatly appreciated. Given your previous examination of the code fix for JIRA [CALCITE-5683](https://issues.apache.org/jira/browse/CALCITE-5683), your insights will be valuable in ensuring the quality of the solution. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459968138 ## core/src/main/java/org/apache/calcite/prepare/Prepare.java: ## @@ -293,13 +293,6 @@ public PreparedResult prepareSql( root = root.withRel(decorrelate(sqlToRelConverter, sqlQuery, root.rel)); } -if (configHolder.get().isTrimUnusedFields()) { Review Comment: The TrimUnusedFields function currently lacks comprehensive handling for cases involving correlation variables. It appears that the trimUnusedFields in the Prepare call was retained for legacy use cases, where a config object with trimUnusedFields can be set to True. However, upon inspecting the code at [link](https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/prepare/PlannerImpl.java#L265), trimUnusedFields is set to False, and a series of optimizations are applied to the RelNodeTree. Subsequently, the TrimUnusedFields Program is executed on the RelNodeTree in a later part of the code path. Therefore, the proposed change aims to synchronize these two code paths. It's worth noting that addressing this issue is complex due to the interplay of seemingly disparate changes that contribute to encountering various problems. While I appreciate your questions and engagement, I believe consulting an expert in this domain would be beneficial to ensure the correctness of my proposal and explore any potential alternative solutions to resolve this issue. Thank you for your understanding. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459925492 ## core/src/main/java/org/apache/calcite/prepare/Prepare.java: ## @@ -293,13 +293,6 @@ public PreparedResult prepareSql( root = root.withRel(decorrelate(sqlToRelConverter, sqlQuery, root.rel)); } -if (configHolder.get().isTrimUnusedFields()) { Review Comment: You are fixing a bug. If I understand correctly the bug happens here, because trimUnusedFields returns an incorrect plan. Is the bug in trimUnusedFields? Or does trimUnusedFields need some invariants before being called, and the invariants are not established in this point? If the bug is in trimUnusedFields, this change only hides the bug but does not really fix it. That is my first question. I am not sure what you mean by "preferred". Preferences are subjective, but whether this is a bug or not is not a preference. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459925492 ## core/src/main/java/org/apache/calcite/prepare/Prepare.java: ## @@ -293,13 +293,6 @@ public PreparedResult prepareSql( root = root.withRel(decorrelate(sqlToRelConverter, sqlQuery, root.rel)); } -if (configHolder.get().isTrimUnusedFields()) { Review Comment: You are fixing a bug. If I understand correctly the bug happens here, because trimUnusedFields returns an incorrect plan. Is the bug in trimUnusedFields? Or does trimUnusedFields need some invariants being called, and the invariants are not established in this point? If the bug is in trimUnusedFields, this change only hides the bug but does not really fix it. That is my first question. I am not sure what you mean by "preferred". Preferences are subjective, but whether this is a bug or not is not a preference. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on PR #3640: URL: https://github.com/apache/calcite/pull/3640#issuecomment-1901245191 > There are lots of issues open related to correlated subqueries, but this fix does not seem to address them. This fix is to address specific case discussed in this JIRA. This PR is not for handling all the correlated subquery issues which might not be relevant to the issue reported in this JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459921376 ## core/src/main/java/org/apache/calcite/prepare/Prepare.java: ## @@ -293,13 +293,6 @@ public PreparedResult prepareSql( root = root.withRel(decorrelate(sqlToRelConverter, sqlQuery, root.rel)); } -if (configHolder.get().isTrimUnusedFields()) { Review Comment: As explained it in the PR description, calling trimUnusedFields is different in different paths (In the Prepare path we call the trimUnusedFields first before optimization but seems like the preferred way as implemented in PlannerImpl is to skip the trimUnusedFields first and call later. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459919352 ## core/src/test/resources/sql/sub-query.iq: ## @@ -2225,7 +2225,7 @@ where exists # following plan (two scans of EMP table instead of three). EnumerableCalc(expr#0..2=[{inputs}], ENAME=[$t1]) EnumerableHashJoin(condition=[=($2, $3)], joinType=[semi]) -EnumerableCalc(expr#0..7=[{inputs}], expr#8=[IS NOT NULL($t3)], expr#9=[CAST($t3):INTEGER], expr#10=[0], expr#11=[CASE($t8, $t9, $t10)], proj#0..1=[{exprs}], $f3=[$t11]) +EnumerableCalc(expr#0..7=[{inputs}], expr#8=[IS NOT NULL($t3)], expr#9=[CAST($t3):INTEGER], expr#10=[0], expr#11=[CASE($t8, $t9, $t10)], proj#0..1=[{exprs}], $f8=[$t11]) Review Comment: There is no change in the result, only plan field number is changed (isn't it ?) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459918755 ## core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java: ## @@ -5491,18 +5491,24 @@ ImmutableList retrieveCursors() { builder.add(convertExpression(node)); } final ImmutableList list = builder.build(); +RelNode rel = root.rel; +CorrelationUse correlationUse = getCorrelationUse(this, root.rel); Review Comment: I am not familiar with this part of the code, so I need help to understand why this works. Maybe other reviewers won't need this help. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459918111 ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8217,6 +8217,72 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + @Test void testMultiLevelDecorrelation() throws Exception { Review Comment: Yeah sure, I can add with the reference to the CALCITE JIRA. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
HanumathRao commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459917452 ## core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java: ## @@ -5491,18 +5491,24 @@ ImmutableList retrieveCursors() { builder.add(convertExpression(node)); } final ImmutableList list = builder.build(); +RelNode rel = root.rel; +CorrelationUse correlationUse = getCorrelationUse(this, root.rel); Review Comment: Sure, I can add a comment here, but I believe this function/pattern is used elsewhere so thought adding only here might not be relevant. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] [CALCITE-6032] Multilevel correlated query is failing in RelDecorrela… [calcite]
mihaibudiu commented on code in PR #3640: URL: https://github.com/apache/calcite/pull/3640#discussion_r1459864781 ## core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java: ## @@ -5491,18 +5491,24 @@ ImmutableList retrieveCursors() { builder.add(convertExpression(node)); } final ImmutableList list = builder.build(); +RelNode rel = root.rel; +CorrelationUse correlationUse = getCorrelationUse(this, root.rel); Review Comment: Could you add a comment explaining what happens here? ## core/src/main/java/org/apache/calcite/prepare/Prepare.java: ## @@ -293,13 +293,6 @@ public PreparedResult prepareSql( root = root.withRel(decorrelate(sqlToRelConverter, sqlQuery, root.rel)); } -if (configHolder.get().isTrimUnusedFields()) { Review Comment: This config feature seems to only be used here. So perhaps it can be removed completely? My question is whether the bug is in the trimUnusedFields method, or is it in invoking it here. ## core/src/test/java/org/apache/calcite/test/JdbcTest.java: ## @@ -8217,6 +8217,72 @@ private void checkGetTimestamp(Connection con) throws SQLException { .returns("EXPR$0=[1, 1.1]\n"); } + @Test void testMultiLevelDecorrelation() throws Exception { Review Comment: Should this function have a JavaDoc describing the issue fixed? ## core/src/test/java/org/apache/calcite/test/enumerable/EnumerableCorrelateTest.java: ## @@ -143,13 +143,13 @@ class EnumerableCorrelateTest { .query( "select empid, name from emps e where exists (select 1 from depts d where d.deptno=e.deptno)") .explainContains("" -+ "EnumerableCalc(expr#0..3=[{inputs}], empid=[$t0], name=[$t2])\n" -+ " EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}])\n" -+ "EnumerableCalc(expr#0..4=[{inputs}], proj#0..2=[{exprs}])\n" -+ " EnumerableTableScan(table=[[s, emps]])\n" ++ "EnumerableCalc(expr#0..5=[{inputs}], empid=[$t0], name=[$t2])\n " Review Comment: Is this case even changed? ## core/src/test/java/org/apache/calcite/test/enumerable/EnumerableCorrelateTest.java: ## @@ -143,13 +143,13 @@ class EnumerableCorrelateTest { .query( "select empid, name from emps e where exists (select 1 from depts d where d.deptno=e.deptno)") .explainContains("" -+ "EnumerableCalc(expr#0..3=[{inputs}], empid=[$t0], name=[$t2])\n" -+ " EnumerableCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{1}])\n" -+ "EnumerableCalc(expr#0..4=[{inputs}], proj#0..2=[{exprs}])\n" -+ " EnumerableTableScan(table=[[s, emps]])\n" ++ "EnumerableCalc(expr#0..5=[{inputs}], empid=[$t0], name=[$t2])\n " Review Comment: Please move the spaces at the end of the previous line here. Same in the subsequent lines. ## core/src/test/resources/sql/sub-query.iq: ## @@ -2225,7 +2225,7 @@ where exists # following plan (two scans of EMP table instead of three). EnumerableCalc(expr#0..2=[{inputs}], ENAME=[$t1]) EnumerableHashJoin(condition=[=($2, $3)], joinType=[semi]) -EnumerableCalc(expr#0..7=[{inputs}], expr#8=[IS NOT NULL($t3)], expr#9=[CAST($t3):INTEGER], expr#10=[0], expr#11=[CASE($t8, $t9, $t10)], proj#0..1=[{exprs}], $f3=[$t11]) +EnumerableCalc(expr#0..7=[{inputs}], expr#8=[IS NOT NULL($t3)], expr#9=[CAST($t3):INTEGER], expr#10=[0], expr#11=[CASE($t8, $t9, $t10)], proj#0..1=[{exprs}], $f8=[$t11]) Review Comment: This seems to be a test for a prior bug. Are you saying that the result was incorrect and is correct now? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@calcite.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org