[jira] [Resolved] (CALCITE-4248) Deprecate SqlParser.ConfigBuilder

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-4248.
--
Fix Version/s: 1.26.0
   Resolution: Fixed

Fixed in 
[76ff1913|https://github.com/apache/calcite/commit/76ff19135f8bf9ee19b47d999d23bc1e2ab75e15].

> Deprecate SqlParser.ConfigBuilder
> -
>
> Key: CALCITE-4248
> URL: https://issues.apache.org/jira/browse/CALCITE-4248
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.26.0
>
>
> Deprecate {{SqlParser.ConfigBuilder}}; for each property 'xxx', add a method 
> {{withXxx}} to {{SqlParser.Config}} that creates a copy of the {{Config}}. 
> Therefore {{Config}} can be used everywhere that {{ConfigBuilder}} used to be 
> used.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-4230) When parsing SQL in BigQuery dialect, split quoted table names that contain dots

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-4230.
--
Fix Version/s: 1.26.0
   Resolution: Fixed

Fixed in 
[5c012c59|https://github.com/apache/calcite/commit/5c012c5985c0064a8f0742cfb63f80b235c2e9fa].

> When parsing SQL in BigQuery dialect, split quoted table names that contain 
> dots
> 
>
> Key: CALCITE-4230
> URL: https://issues.apache.org/jira/browse/CALCITE-4230
> Project: Calcite
>  Issue Type: Bug
>  Components: babel
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When parsing SQL in BigQuery dialect, split quoted table names that contain 
> dots. The following three are equivalent:
> {code:java}
> select * from `bigquery-public-data`.`samples`.`natality`;
> select * from `bigquery-public-data.samples.natality`;
> select * from bigquery-public-data.samples.natality;
> {code}
> After this change, Babel will return the same parse tree for all 3 queries. 
> (The third will require CALCITE-4246.)
> To control this feature, in {{interface SqlConformance}}, add method 
> {{boolean splitQuotedTableName()}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-4246) When parsing SQL in BigQuery dialect, allow unquoted table names to contain hyphens

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-4246.
--
Fix Version/s: 1.26.0
   Resolution: Fixed

Fixed in 
[df571c9c|https://github.com/apache/calcite/commit/df571c9c629d35c3e91e70194854c609b25381d2].

> When parsing SQL in BigQuery dialect, allow unquoted table names to contain 
> hyphens
> ---
>
> Key: CALCITE-4246
> URL: https://issues.apache.org/jira/browse/CALCITE-4246
> Project: Calcite
>  Issue Type: Bug
>  Components: babel
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.26.0
>
>
> When parsing SQL in BigQuery dialect, allow unquoted table names to contain 
> hyphens. This feature was [added in May 
> 2020|https://www.reddit.com/r/bigquery/comments/fgk31y/new_in_bigquery_no_more_backticks_around_table/].
> For these purposes, "table names" are considered to be the first identifier 
> after FROM, JOIN, INSERT, UPDATE, DELETE, TABLE without an intervening SELECT 
> or SET. (Yeah, the rules are rather rigid, but to implement it we have to 
> shift lexical states in the parser, and we can only do that based on what 
> tokens we've seen. Hopefully the rules will suffice.)
> Valid identifiers with hyphens:
> * {{select * from bigquery-public-data.foo}}
> * {{insert into bigquery-public-data.foo values (1)}}
> * {{delete from bigquery-public-data.foo}}
> * {{select * from dept cross join bigquery-public-data.foo}}
> Invalid identifiers with hyphens:
> * {{select foo-bar from dept}} # column names cannot have hyphens
> * {{select * from dept as foo-bar}} # table aliases cannot have hyphens
> * {{select foo-bar.deptno from dept as `foo-bar`}} # ditto
> * {{select * from foo.bar-baz.bump}} # only first part of table name
> In all of the above 'invalid' cases, you can solve by enclosing the 
> identifier with back-ticks, e.g. {{select `foo-bar` from dept}}.
> To control this feature, in {{interface SqlConformance}}, add method 
> {{boolean allowHyphenInUnquotedTableName()}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-4254) ImmutableBeans should make an immutable copy of property values of type List, Set, or Map

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-4254.
--
Fix Version/s: 1.26.0
   Resolution: Fixed

> ImmutableBeans should make an immutable copy of property values of type List, 
> Set, or Map
> -
>
> Key: CALCITE-4254
> URL: https://issues.apache.org/jira/browse/CALCITE-4254
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.26.0
>
>
> {{ImmutableBeans}} should make an immutable copy of values of type {{List}}, 
> {{Set}} or {{Map}} when they are provided via a setter method. For example, 
> this is how it should work:
> {code:java}
> MyBean bean = ImmutableBeans.create(MyBean.class);
> List evens = Arrays.asList(2, 4, 6);
> bean = bean.withNumbers(evens);
> assertThat(bean.getNumbers(), equalTo(evens));
> assertThat(bean.getNumbers(), isInstanceOf(ImmutableList.class));
> {code}
> Note that {{evens}} has been converted to an {{ImmutableList}} with the same 
> contents.
> You can control this behavior by setting {{makeImmutable = false}} when you 
> declare the property:
> {code:java}
> interface MyBean {
>   @ImmutableBean.Property(makeImmutable = false)
>   List getNumbers();
>   MyBean withNumbers(List numbers);
> }
> {code}
> By default, {{makeImmutable}} is true. This will change behavior of a very 
> few existing beans that were assuming that the value that was put in would be 
> the value that came out (see a couple of changes in {{VolcanoPlannerTest}}), 
> but for the vast majority, immutability is the desired behavior.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-4247) When parsing SQL in BigQuery dialect, character literals may be enclosed in single- or double-quotes, and use backslashes as escapes

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-4247.
--
Fix Version/s: 1.26.0
   Resolution: Fixed

Fixed in 
[c1052b34|https://github.com/apache/calcite/commit/c1052b343724c4a95bcf25419b70bc6032e0846b].

> When parsing SQL in BigQuery dialect, character literals may be enclosed in 
> single- or double-quotes, and use backslashes as escapes
> 
>
> Key: CALCITE-4247
> URL: https://issues.apache.org/jira/browse/CALCITE-4247
> Project: Calcite
>  Issue Type: Bug
>  Components: babel
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.26.0
>
>
> When parsing SQL in BigQuery dialect, character literals may be enclosed in 
> single- or double-quotes, and use backslashes as escapes. In standard SQL, 
> and most dialects, character literals are enclosed in single-quotes only, and 
> use single-quotes as the escape character.
> For example:
> {noformat}
> 'Let''s call him "Elvis"!' /* valid in Oracle, PostgreSQL, etc. */
> 'Let\'s call him "Elvis"!' /* valid in BigQuery */
> "Let's call him \"Elvis\"!" /* valid in BigQuery */
> {noformat}
> To control this feature, we add
> {code}
> enum CharLiteralStyle { STANDARD, BQ_SINGLE, BQ_DOUBLE }
> {code}
> and to {{interface SqlParser.Config}} we add method {{Set 
> charLiteralStyles()}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-4249) JDBC adapter cannot translate NOT LIKE in join condition

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-4249.
--
Fix Version/s: 1.26.0
   Resolution: Fixed

Fixed in 
[6a3a7e49|https://github.com/apache/calcite/commit/6a3a7e49f99f2bfb4b3af00536bccbe19c53].

> JDBC adapter cannot translate NOT LIKE in join condition
> 
>
> Key: CALCITE-4249
> URL: https://issues.apache.org/jira/browse/CALCITE-4249
> Project: Calcite
>  Issue Type: Bug
>Reporter: Steven Talbot
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.26.0
>
>
> JDBC adapter cannot translate NOT LIKE in join condition. Rel2SqlConverter 
> throws assertion error for NOT operator in join condition.
> The following test shows the bug if you put it in RelToSqlConverter
> {code:java}
> @Test void testJoinWithNotLikeConditionRel2Sql() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.and(
>   b.call(SqlStdOperatorTable.EQUALS,
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO")),
>   b.call(SqlStdOperatorTable.NOT,
>   b.call(SqlStdOperatorTable.LIKE,
>   b.field(2, 1, "DNAME"),
>   b.literal("ACCOUNTING_FOO"))
>   )
>   ))
>   .build();
>   final String expectedSql = "SELECT *\n"
>   + "FROM \"scott\".\"EMP\"\n"
>   + "LEFT JOIN \"scott\".\"DEPT\" "
>   + "ON \"EMP\".\"DEPTNO\" = \"DEPT\".\"DEPTNO\" "
>   + "AND \"DEPT\".\"DNAME\" NOT LIKE 'ACCOUNTING'";
>   relFn(relFn).ok(expectedSql);
> }
> {code}
> It blows up with the following stacktrace top:
> {noformat}
> java.lang.AssertionError: NOT(LIKE($9, 'ACCOUNTING_FOO'))
> at 
> org.apache.calcite.rel.rel2sql.SqlImplementor.convertConditionToSqlNode(SqlImplementor.java:350)
> at 
> org.apache.calcite.rel.rel2sql.SqlImplementor.convertConditionToSqlNode(SqlImplementor.java:286)
> at 
> org.apache.calcite.rel.rel2sql.RelToSqlConverter.visit(RelToSqlConverter.java:213)
> at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:524)
> at 
> org.apache.calcite.rel.rel2sql.RelToSqlConverter.dispatch(RelToSqlConverter.java:131)
> at 
> org.apache.calcite.rel.rel2sql.RelToSqlConverter.visitInput(RelToSqlConverter.java:139){noformat}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4249) JDBC adapter cannot translate NOT LIKE in join condition

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-4249:
-
Description: 
JDBC adapter cannot translate NOT LIKE in join condition. Rel2SqlConverter 
throws assertion error for NOT operator in join condition.

The following test shows the bug if you put it in RelToSqlConverter
{code:java}
@Test void testJoinWithNotLikeConditionRel2Sql() {
  final Function relFn = b -> b
  .scan("EMP")
  .scan("DEPT")
  .join(JoinRelType.LEFT,
  b.and(
  b.call(SqlStdOperatorTable.EQUALS,
  b.field(2, 0, "DEPTNO"),
  b.field(2, 1, "DEPTNO")),
  b.call(SqlStdOperatorTable.NOT,
  b.call(SqlStdOperatorTable.LIKE,
  b.field(2, 1, "DNAME"),
  b.literal("ACCOUNTING_FOO"))
  )
  ))
  .build();
  final String expectedSql = "SELECT *\n"
  + "FROM \"scott\".\"EMP\"\n"
  + "LEFT JOIN \"scott\".\"DEPT\" "
  + "ON \"EMP\".\"DEPTNO\" = \"DEPT\".\"DEPTNO\" "
  + "AND \"DEPT\".\"DNAME\" NOT LIKE 'ACCOUNTING'";
  relFn(relFn).ok(expectedSql);
}
{code}
It blows up with the following stacktrace top:
{noformat}
java.lang.AssertionError: NOT(LIKE($9, 'ACCOUNTING_FOO'))
at 
org.apache.calcite.rel.rel2sql.SqlImplementor.convertConditionToSqlNode(SqlImplementor.java:350)
at 
org.apache.calcite.rel.rel2sql.SqlImplementor.convertConditionToSqlNode(SqlImplementor.java:286)
at 
org.apache.calcite.rel.rel2sql.RelToSqlConverter.visit(RelToSqlConverter.java:213)
at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:524)
at 
org.apache.calcite.rel.rel2sql.RelToSqlConverter.dispatch(RelToSqlConverter.java:131)
at 
org.apache.calcite.rel.rel2sql.RelToSqlConverter.visitInput(RelToSqlConverter.java:139){noformat}
 

  was:
The following test shows the bug if you put it in RelToSqlConverter
{code:java}
@Test void testJoinWithNotLikeConditionRel2Sql() {
  final Function relFn = b -> b
  .scan("EMP")
  .scan("DEPT")
  .join(JoinRelType.LEFT,
  b.and(
  b.call(SqlStdOperatorTable.EQUALS,
  b.field(2, 0, "DEPTNO"),
  b.field(2, 1, "DEPTNO")),
  b.call(SqlStdOperatorTable.NOT,
  b.call(SqlStdOperatorTable.LIKE,
  b.field(2, 1, "DNAME"),
  b.literal("ACCOUNTING_FOO"))
  )
  ))
  .build();
  final String expectedSql = "SELECT *\n"
  + "FROM \"scott\".\"EMP\"\n"
  + "LEFT JOIN \"scott\".\"DEPT\" "
  + "ON \"EMP\".\"DEPTNO\" = \"DEPT\".\"DEPTNO\" "
  + "AND \"DEPT\".\"DNAME\" NOT LIKE 'ACCOUNTING'";
  relFn(relFn).ok(expectedSql);
}
{code}
It blows up with the following stacktrace top:
{noformat}
java.lang.AssertionError: NOT(LIKE($9, 'ACCOUNTING_FOO'))
at 
org.apache.calcite.rel.rel2sql.SqlImplementor.convertConditionToSqlNode(SqlImplementor.java:350)
at 
org.apache.calcite.rel.rel2sql.SqlImplementor.convertConditionToSqlNode(SqlImplementor.java:286)
at 
org.apache.calcite.rel.rel2sql.RelToSqlConverter.visit(RelToSqlConverter.java:213)
at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:524)
at 
org.apache.calcite.rel.rel2sql.RelToSqlConverter.dispatch(RelToSqlConverter.java:131)
at 
org.apache.calcite.rel.rel2sql.RelToSqlConverter.visitInput(RelToSqlConverter.java:139){noformat}
 


> JDBC adapter cannot translate NOT LIKE in join condition
> 
>
> Key: CALCITE-4249
> URL: https://issues.apache.org/jira/browse/CALCITE-4249
> Project: Calcite
>  Issue Type: Bug
>Reporter: Steven Talbot
>Assignee: Julian Hyde
>Priority: Major
>
> JDBC adapter cannot translate NOT LIKE in join condition. Rel2SqlConverter 
> throws assertion error for NOT operator in join condition.
> The following test shows the bug if you put it in RelToSqlConverter
> {code:java}
> @Test void testJoinWithNotLikeConditionRel2Sql() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.and(
>   b.call(SqlStdOperatorTable.EQUALS,
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO")),
>   b.call(SqlStdOperatorTable.NOT,
>   b.call(SqlStdOperatorTable.LIKE,
>   b.field(2, 1, "DNAME"),
>   b.literal("ACCOUNTING_FOO"))
>   )
>   ))
>   .build();
>   final String expectedSql = "SELECT *\n"
>   + "FROM \"scott\".\"EMP\"\n"
>   + "LEFT JOIN \"scott\".\"DEPT\" "
>   + "ON \"EMP\".\"DEPTNO\" = \"DEPT\".\"DEPTNO\" "
>   + "AND \"DEPT\".\"DNAME\" NOT LIKE 

[jira] [Updated] (CALCITE-4249) JDBC adapter cannot translate NOT LIKE in join condition

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-4249:
-
Summary: JDBC adapter cannot translate NOT LIKE in join condition  (was: 
Assertion error for NOT operator in join condition in Rel2SqlConverter)

> JDBC adapter cannot translate NOT LIKE in join condition
> 
>
> Key: CALCITE-4249
> URL: https://issues.apache.org/jira/browse/CALCITE-4249
> Project: Calcite
>  Issue Type: Bug
>Reporter: Steven Talbot
>Assignee: Julian Hyde
>Priority: Major
>
> The following test shows the bug if you put it in RelToSqlConverter
> {code:java}
> @Test void testJoinWithNotLikeConditionRel2Sql() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.and(
>   b.call(SqlStdOperatorTable.EQUALS,
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO")),
>   b.call(SqlStdOperatorTable.NOT,
>   b.call(SqlStdOperatorTable.LIKE,
>   b.field(2, 1, "DNAME"),
>   b.literal("ACCOUNTING_FOO"))
>   )
>   ))
>   .build();
>   final String expectedSql = "SELECT *\n"
>   + "FROM \"scott\".\"EMP\"\n"
>   + "LEFT JOIN \"scott\".\"DEPT\" "
>   + "ON \"EMP\".\"DEPTNO\" = \"DEPT\".\"DEPTNO\" "
>   + "AND \"DEPT\".\"DNAME\" NOT LIKE 'ACCOUNTING'";
>   relFn(relFn).ok(expectedSql);
> }
> {code}
> It blows up with the following stacktrace top:
> {noformat}
> java.lang.AssertionError: NOT(LIKE($9, 'ACCOUNTING_FOO'))
> at 
> org.apache.calcite.rel.rel2sql.SqlImplementor.convertConditionToSqlNode(SqlImplementor.java:350)
> at 
> org.apache.calcite.rel.rel2sql.SqlImplementor.convertConditionToSqlNode(SqlImplementor.java:286)
> at 
> org.apache.calcite.rel.rel2sql.RelToSqlConverter.visit(RelToSqlConverter.java:213)
> at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:524)
> at 
> org.apache.calcite.rel.rel2sql.RelToSqlConverter.dispatch(RelToSqlConverter.java:131)
> at 
> org.apache.calcite.rel.rel2sql.RelToSqlConverter.visitInput(RelToSqlConverter.java:139){noformat}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4223) Introducing column statistics to RelOptTable

2020-09-17 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198115#comment-17198115
 ] 

Julian Hyde commented on CALCITE-4223:
--

bq. Users can implement ColStatistics and add new methods. Then use unwarp() to 
get the customized ColStatistics.

That doesn't work, because your {{ColStatistics}} doesn't have an {{unwrap}} 
method.

bq. I think ... is much more straightforward and readable.

I don't think readability is the most important metric. The problem is to plug 
together providers, which is hard. A robust solution is bound to be somewhat 
complex.

bq. Besides,  does it mean that RelOptTable has to implement interfaces like 
BuiltinMetadata.size/BuiltinMetadata.DistinctRowCount in your proposal?

No. When I call unwrap on an object, it doesn't have to return itself. 
Typically it will return a lambda.

> Introducing column statistics to RelOptTable
> 
>
> Key: CALCITE-4223
> URL: https://issues.apache.org/jira/browse/CALCITE-4223
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Chunwei Lei
>Assignee: Chunwei Lei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Many systems depend on column statistics to compute more accurate stats, such 
> as NDV, average column size, and so on. It would be nice if Calcite can 
> provide such an interface.
> Column statistics might include NDV, average/max column length, number of 
> nulls, number of trues, number of falses and so on. 
> What do you think?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4223) Introducing column statistics to RelOptTable

2020-09-17 Thread Chunwei Lei (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198105#comment-17198105
 ] 

Chunwei Lei commented on CALCITE-4223:
--

Thank you for your review, Julian.
{quote}it does not easily allow people to add new kinds of metadata, and it 
does not accommodate differences in data structures that may have more 
information (e.g. a system that has a histogram that returns not just number of 
distinct values, but the number of distinct values between 100 and 1000)
{quote}
Users can implement ColStatistics and add new methods. Then use unwarp() to get 
the customized ColStatistics. Comparing 
{{table.unwrap(BuiltinMetadata.Size.class)}}, I think 
sum({{table.getColumnStatistics(col).getAvgColLen())}} is much more 
straightforward and readable.

Besides,  does it mean that {{RelOptTable}} has to implement interfaces like 
{{BuiltinMetadata.size/BuiltinMetadata.DistinctRowCount}} in your proposal?

> Introducing column statistics to RelOptTable
> 
>
> Key: CALCITE-4223
> URL: https://issues.apache.org/jira/browse/CALCITE-4223
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Chunwei Lei
>Assignee: Chunwei Lei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Many systems depend on column statistics to compute more accurate stats, such 
> as NDV, average column size, and so on. It would be nice if Calcite can 
> provide such an interface.
> Column statistics might include NDV, average/max column length, number of 
> nulls, number of trues, number of falses and so on. 
> What do you think?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CALCITE-4259) Support JDK 15

2020-09-17 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197191#comment-17197191
 ] 

Julian Hyde edited comment on CALCITE-4259 at 9/18/20, 3:35 AM:


[~danny0405], I see you added tests in CALCITE-3719 that use log4j Logger and 
Appender. Could these be switched to another logging framework?


was (Author: julianhyde):
[~danny0405], I see you added tests in CALCITE-3719 that use log4j Logger and 
Appender. Could these we switched to another logging framework?

> Support JDK 15
> --
>
> Key: CALCITE-4259
> URL: https://issues.apache.org/jira/browse/CALCITE-4259
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> JDK 15 is now GA (released 2020/09/15). We should support it (and OpenJDK 15) 
> as a build- and run-time platform.
> Also:
> * Upgrade to latest Guava (guava-29.0-jre, released on 2020/04/13);
> * Upgrade log4j2 to latest (2.13.3, release 2020/05/10);
> * Remove log4j (we're on the last release, 1.2.17, which was 2012/05/26, but 
> I assume the project is now unsupported and dead).
> If there are any other important upgrades to be done before 1.26, please add 
> them here.
> Must fix in next release (1.26) or justify why not in this thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CALCITE-4259) Support JDK 15

2020-09-17 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198098#comment-17198098
 ] 

Julian Hyde edited comment on CALCITE-4259 at 9/18/20, 3:35 AM:


I'm not sure what the official CI tests. I test each JDK [8 .. 15] against a 
couple of random Guava versions in the range 19 to 29 every night. 
Guava-related errors are very rare; if it builds, it tends to run. The fact 
that the Cassandra adapter can't run on Guava 27 or higher is an exception to 
that rule.


was (Author: julianhyde):
I'm not sure what the official CI tests. I test each JDK (8 .. 15) against a 
couple of random Guava versions in the range 19 to 29 every night. 
Guava-related errors are very rare; if it builds, it tends to run. The fact 
that the Cassandra adapter can't run on Guava 27 or higher is an exception to 
that rule.

> Support JDK 15
> --
>
> Key: CALCITE-4259
> URL: https://issues.apache.org/jira/browse/CALCITE-4259
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> JDK 15 is now GA (released 2020/09/15). We should support it (and OpenJDK 15) 
> as a build- and run-time platform.
> Also:
> * Upgrade to latest Guava (guava-29.0-jre, released on 2020/04/13);
> * Upgrade log4j2 to latest (2.13.3, release 2020/05/10);
> * Remove log4j (we're on the last release, 1.2.17, which was 2012/05/26, but 
> I assume the project is now unsupported and dead).
> If there are any other important upgrades to be done before 1.26, please add 
> them here.
> Must fix in next release (1.26) or justify why not in this thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4259) Support JDK 15

2020-09-17 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198098#comment-17198098
 ] 

Julian Hyde commented on CALCITE-4259:
--

I'm not sure what the official CI tests. I test each JDK (8 .. 15) against a 
couple of random Guava versions in the range 19 to 29 every night. 
Guava-related errors are very rare; if it builds, it tends to run. The fact 
that the Cassandra adapter can't run on Guava 27 or higher is an exception to 
that rule.

> Support JDK 15
> --
>
> Key: CALCITE-4259
> URL: https://issues.apache.org/jira/browse/CALCITE-4259
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> JDK 15 is now GA (released 2020/09/15). We should support it (and OpenJDK 15) 
> as a build- and run-time platform.
> Also:
> * Upgrade to latest Guava (guava-29.0-jre, released on 2020/04/13);
> * Upgrade log4j2 to latest (2.13.3, release 2020/05/10);
> * Remove log4j (we're on the last release, 1.2.17, which was 2012/05/26, but 
> I assume the project is now unsupported and dead).
> If there are any other important upgrades to be done before 1.26, please add 
> them here.
> Must fix in next release (1.26) or justify why not in this thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4259) Support JDK 15

2020-09-17 Thread Kenneth Knowles (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198086#comment-17198086
 ] 

Kenneth Knowles commented on CALCITE-4259:
--

I am not very familiar with Calcite's testing setup. Do you have a test matrix 
for a few Java/Guava versions?

> Support JDK 15
> --
>
> Key: CALCITE-4259
> URL: https://issues.apache.org/jira/browse/CALCITE-4259
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> JDK 15 is now GA (released 2020/09/15). We should support it (and OpenJDK 15) 
> as a build- and run-time platform.
> Also:
> * Upgrade to latest Guava (guava-29.0-jre, released on 2020/04/13);
> * Upgrade log4j2 to latest (2.13.3, release 2020/05/10);
> * Remove log4j (we're on the last release, 1.2.17, which was 2012/05/26, but 
> I assume the project is now unsupported and dead).
> If there are any other important upgrades to be done before 1.26, please add 
> them here.
> Must fix in next release (1.26) or justify why not in this thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4262) In SqlToRelConverter, use RelBuilder for creating all relational expressions

2020-09-17 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-4262:
-
Description: 
In {{SqlToRelConverter}}, use {{RelBuilder}} for creating all relational 
expressions. In CALCITE-4220 we switched to using {{RelBuilder}} for creating 
{{Aggregate}} (in some cases) but there are still calls to:
* {{LogicalFilter.create}} and {{FilterFactory.createFilter}}
* {{LogicalUnion.create}}
* {{LogicalValues.createOneRow}}
* {{MatchFactory.createMatch}}
* {{LogicalTableScan.create}}
* {{LogicalTableFunctionScan.create}}
* {{LogicalCorrelate.create}}
* {{LogicalUnion.create}}
* {{LogicalIntersect.create}}
* {{LogicalMinus.create}}
* {{LogicalTableModify.create}}
* {{JoinFactory.createJoin}}
* {{AggregateCall.create}}
* {{LogicalProject.create}}
* {{LogicalSort.create}}

The equivalent {{RelBuilder}} methods all exist. The biggest challenge is that 
the {{RelBuilder}} methods will do some optimizations (almost all improvements) 
that will cause plan changes.

> In SqlToRelConverter, use RelBuilder for creating all relational expressions
> 
>
> Key: CALCITE-4262
> URL: https://issues.apache.org/jira/browse/CALCITE-4262
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Priority: Major
>
> In {{SqlToRelConverter}}, use {{RelBuilder}} for creating all relational 
> expressions. In CALCITE-4220 we switched to using {{RelBuilder}} for creating 
> {{Aggregate}} (in some cases) but there are still calls to:
> * {{LogicalFilter.create}} and {{FilterFactory.createFilter}}
> * {{LogicalUnion.create}}
> * {{LogicalValues.createOneRow}}
> * {{MatchFactory.createMatch}}
> * {{LogicalTableScan.create}}
> * {{LogicalTableFunctionScan.create}}
> * {{LogicalCorrelate.create}}
> * {{LogicalUnion.create}}
> * {{LogicalIntersect.create}}
> * {{LogicalMinus.create}}
> * {{LogicalTableModify.create}}
> * {{JoinFactory.createJoin}}
> * {{AggregateCall.create}}
> * {{LogicalProject.create}}
> * {{LogicalSort.create}}
> The equivalent {{RelBuilder}} methods all exist. The biggest challenge is 
> that the {{RelBuilder}} methods will do some optimizations (almost all 
> improvements) that will cause plan changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4262) In SqlToRelConverter, use RelBuilder for creating all relational expressions

2020-09-17 Thread Julian Hyde (Jira)
Julian Hyde created CALCITE-4262:


 Summary: In SqlToRelConverter, use RelBuilder for creating all 
relational expressions
 Key: CALCITE-4262
 URL: https://issues.apache.org/jira/browse/CALCITE-4262
 Project: Calcite
  Issue Type: Bug
Reporter: Julian Hyde






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4210) Sub Query is 'On' clause does not correctly expand in SqlToRelConverter

2020-09-17 Thread James Starr (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197925#comment-17197925
 ] 

James Starr commented on CALCITE-4210:
--

[~julianhyde] I reworked createJoinUsing and createJoinNatural to return 
conditions instead of the whole rel node, and now createJoinOnCondition returns 
a pair of the right node and the condition.  I also made the inner class 
static.   

> Sub Query is 'On' clause does not correctly expand in SqlToRelConverter
> ---
>
> Key: CALCITE-4210
> URL: https://issues.apache.org/jira/browse/CALCITE-4210
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: James Starr
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Aggregation is dropped during decorreleation.
> {code:java}
> SELECT "employee"."department_id", "employee"."salary"
> FROM "department"
> LEFT JOIN "employee" ON "employee"."salary" = (
>   SELECT max("employee"."salary")
>   FROM "employee"
>   WHERE  "employee"."department_id" = "department"."department_id"
> )
> {code}
> {code:java}
> LogicalProject(department_id=[$9], salary=[$13])
>   LogicalJoin(condition=[=($13, $0)], joinType=[left])
> LogicalTableScan(table=[[foodmart, department]])
> LogicalTableScan(table=[[foodmart, employee]]){code}
> This may be related to CALCITE-4206.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4242) Wrong plan for nested NOT EXISTS subqueries

2020-09-17 Thread James Starr (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197921#comment-17197921
 ] 

James Starr commented on CALCITE-4242:
--

Sorry, about not validating the previous query on postgres.  The null 
conditional tripped me up.  The correct query for postgres would be:

{code:sql}
// Some comments here
public String getFoo()
 WITH
q_present AS (SELECT TRUE as present FROM q),
r_present_given_z AS (SELECT TRUE as present, z FROM r GROUP BY z),
q_present_given_r_present_given_z AS (
SELECT distinct TRUE AS present, r_present_given_z.present AS r_present, 
r_present_given_z.z AS r_z
FROM q_present
LEFT JOIN r_present_given_z ON TRUE
)
SELECT my_p.x
FROM P my_p
LEFT JOIN q_present_given_r_present_given_z ON 
q_present_given_r_present_given_z.r_z = my_p.x
WHERE NOT(
  (q_present_given_r_present_given_z.present IS NOT NULL)
  AND NOT (q_present_given_r_present_given_z.r_present IS NOT NULL)
);
{code}

{code:sql}
CREATE TABLE P(x INTEGER);
CREATE TABLE Q(y INTEGER);
CREATE TABLE R(z INTEGER);
INSERT INTO P VALUES (1);
INSERT INTO Q VALUES (1);

 x 
---
 1
(1 row)

DELETE FROM P;
DELETE FROM Q;
DELETE FROM R;
INSERT INTO P VALUES (1);
INSERT INTO P VALUES (1);
INSERT INTO P VALUES (0);
INSERT INTO P VALUES (0);
INSERT INTO Q VALUES (1);
INSERT INTO Q VALUES (0);
INSERT INTO R VALUES (1);
INSERT INTO R VALUES (0);

 x 
---
 1
 1
 0
 0
(4 rows)

DELETE FROM P;
DELETE FROM Q;
DELETE FROM R;
INSERT INTO P VALUES (1);

 x 
---
 1
(1 row)

{code}

The semi left join/distinct/aggregation handles the multiplicity.  All of the 
conditional need to hoisted to the top where clause.  If the joins are kept 
right-associative, then only top most query joined, in the example 
q_present_given_r_present_given_z, needs to have an aggregation.  However, I 
think this could result in cross product joins if there are 2 nested queries in 
a query that is already nested.  However, most likely the optimization rules 
should be able to rearrange the join such that it be right or left associative 
is not a question of performance, simply of correctness, assuming that the 
rewrite is in such a way that it does not block the rule. 

This query could be very expensive if it was right associative and the planner 
did not rearrange the joins.  
{code:sql}
CREATE TABLE t_1(t1_id INTEGER);
CREATE TABLE t_2(y INTEGER);
CREATE TABLE t_2_1(t1_id INTEGER);
CREATE TABLE t_2_2(t1_id INTEGER);

SELECT * FROM t_1
WHERE EXISTS (
  SELECT * FROM t_2
  WHERE EXISTS(SELECT * FROM t_2_1 WHERE t1.t1_id = t2_1.t1_id)
  AND EXISTS(SELECT * FROM t_2_2 WHERE t1.t1_id = t2_2.t1_id)
);
{code}

So I believe the algorithm works out to something like this:
{code:java}
//De-morgan the expression such that you have a normalized AND expression
List demorganSubExpression = demorganToAnd(subExpression);
List resultExpression = new ArrayList<>();
for(RexNode subExpresion:  demorganSubExpression){
  if(hasSubQuery(subExpression)) {
bb.hoiste(subExpression);
  } else {
resultExpression.add(subExpression);
  }
  return buildAndExpression(resultExpression);
{code}

With bb.hoiste traversing up the tree until until all elements are in scope.


> Wrong plan for nested NOT EXISTS subqueries
> ---
>
> Key: CALCITE-4242
> URL: https://issues.apache.org/jira/browse/CALCITE-4242
> Project: Calcite
>  Issue Type: Bug
>Reporter: Martin Raszyk
>Priority: Major
>
> Suppose we initialize an empty database as follows.
>  
> {code:java}
> CREATE TABLE P(x INTEGER);
> CREATE TABLE Q(y INTEGER);
> CREATE TABLE R(z INTEGER);
> INSERT INTO P VALUES (1);
> INSERT INTO Q VALUES (1);{code}
>  
> The following query is supposed to yield an empty table as the result.
>  
> {code:java}
> SELECT x FROM P
> WHERE NOT EXISTS (
>   SELECT y FROM Q
>   WHERE NOT EXISTS (
> SELECT z FROM R
> WHERE x = z
>   )
> ){code}
>  
> However, the query is parsed and converted to the following plan
> {code:java}
> LogicalProject(X=[$0])
>   LogicalFilter(condition=[IS NULL($2)])
> LogicalJoin(condition=[=($0, $1)], joinType=[left])
>   LogicalTableScan(table=[[Bug, P]])
>   LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
> LogicalProject(Z=[$1], $f0=[true])
>   LogicalFilter(condition=[IS NULL($2)])
> LogicalJoin(condition=[true], joinType=[left])
>   LogicalTableScan(table=[[Bug, Q]])
>   LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
> LogicalProject(Z=[$0], $f0=[true])
>   LogicalTableScan(table=[[Bug, R]])
> {code}
> that corresponds to the following SQL query
> {code:java}
> SELECT P.X
> FROM Bug.P
> LEFT JOIN (SELECT t0.Z, MIN(TRUE) AS $f1
> FROM Bug.Q
> LEFT JOIN (SELECT Z, MIN(TRUE) AS $f1
> FROM Bug.R
> GROUP BY Z) AS t0 ON TRUE
> WHERE t0.$f1 IS NULL
> GROUP BY 

[jira] [Commented] (CALCITE-4261) Join with three tables causes IllegalArgumentException in EnumerableBatchNestedLoopJoinRule

2020-09-17 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197765#comment-17197765
 ] 

Ruben Q L commented on CALCITE-4261:


The problem seems to be that, the second time 
{{EnumerableBatchNestedLoopJoinRule}} gets applied, it mixes the correlated 
variables and applies a visitor transformation on the wrong correlated 
variable. 

> Join with three tables causes IllegalArgumentException in 
> EnumerableBatchNestedLoopJoinRule
> ---
>
> Key: CALCITE-4261
> URL: https://issues.apache.org/jira/browse/CALCITE-4261
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
> Fix For: 1.26.0
>
>
> Issue can be reproduced with the following test (to be added in 
> {{EnumerableBatchNestedLoopJoinTest}}):
> {code}
>   @Test void doubleInnerBatchJoinTestSQL() {
> tester(false, new JdbcTest.HrSchema())
> .query("select e.name, d.name as dept, l.name as location "
> + "from emps e join depts d on d.deptno <> e.salary "
> + "join locations l on e.empid <> l.empid and d.deptno = l.empid")
> .withHook(Hook.PLANNER, (Consumer) planner -> {
>   planner.removeRule(EnumerableRules.ENUMERABLE_CORRELATE_RULE);
>   
> planner.addRule(EnumerableRules.ENUMERABLE_BATCH_NESTED_LOOP_JOIN_RULE);
> })
> .explainContains("EnumerableBatchNestedLoopJoin")
> .returnsUnordered("name=Bill; dept=Sales; location=San Francisco",
> "name=Eric; dept=Sales; location=San Francisco",
> "name=Sebastian; dept=Sales; location=San Francisco",
> "name=Theodore; dept=Sales; location=San Francisco");
>   }
> {code}
> Which causes:
> {code}
> Error while executing SQL "explain plan for select e.name, d.name as dept, 
> l.name as location from emps e join depts d on d.deptno <> e.salary join 
> locations l on e.empid <> l.empid and d.deptno = l.empid"
> ...
> Caused by: java.lang.RuntimeException: Error while applying rule 
> EnumerableBatchNestedLoopJoinRule,
> ...
> Caused by: java.lang.IllegalArgumentException: Field #0: empid JavaType(int) 
> does not exist for expression $cor1801
>   at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
> ...
>   at 
> org.apache.calcite.adapter.enumerable.EnumerableBatchNestedLoopJoinRule.onMatch(EnumerableBatchNestedLoopJoinRule.java:127)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
> ...
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-4261) Join with three tables causes IllegalArgumentException in EnumerableBatchNestedLoopJoinRule

2020-09-17 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-4261:
--

 Summary: Join with three tables causes IllegalArgumentException in 
EnumerableBatchNestedLoopJoinRule
 Key: CALCITE-4261
 URL: https://issues.apache.org/jira/browse/CALCITE-4261
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.25.0
Reporter: Ruben Q L
 Fix For: 1.26.0


Issue can be reproduced with the following test (to be added in 
{{EnumerableBatchNestedLoopJoinTest}}):
{code}
  @Test void doubleInnerBatchJoinTestSQL() {
tester(false, new JdbcTest.HrSchema())
.query("select e.name, d.name as dept, l.name as location "
+ "from emps e join depts d on d.deptno <> e.salary "
+ "join locations l on e.empid <> l.empid and d.deptno = l.empid")
.withHook(Hook.PLANNER, (Consumer) planner -> {
  planner.removeRule(EnumerableRules.ENUMERABLE_CORRELATE_RULE);
  
planner.addRule(EnumerableRules.ENUMERABLE_BATCH_NESTED_LOOP_JOIN_RULE);
})
.explainContains("EnumerableBatchNestedLoopJoin")
.returnsUnordered("name=Bill; dept=Sales; location=San Francisco",
"name=Eric; dept=Sales; location=San Francisco",
"name=Sebastian; dept=Sales; location=San Francisco",
"name=Theodore; dept=Sales; location=San Francisco");
  }
{code}

Which causes:
{code}
Error while executing SQL "explain plan for select e.name, d.name as dept, 
l.name as location from emps e join depts d on d.deptno <> e.salary join 
locations l on e.empid <> l.empid and d.deptno = l.empid"
...
Caused by: java.lang.RuntimeException: Error while applying rule 
EnumerableBatchNestedLoopJoinRule,
...
Caused by: java.lang.IllegalArgumentException: Field #0: empid JavaType(int) 
does not exist for expression $cor1801
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
...
at 
org.apache.calcite.adapter.enumerable.EnumerableBatchNestedLoopJoinRule.onMatch(EnumerableBatchNestedLoopJoinRule.java:127)
at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
...
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (CALCITE-4261) Join with three tables causes IllegalArgumentException in EnumerableBatchNestedLoopJoinRule

2020-09-17 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L reassigned CALCITE-4261:
--

Assignee: Ruben Q L

> Join with three tables causes IllegalArgumentException in 
> EnumerableBatchNestedLoopJoinRule
> ---
>
> Key: CALCITE-4261
> URL: https://issues.apache.org/jira/browse/CALCITE-4261
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.25.0
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
> Fix For: 1.26.0
>
>
> Issue can be reproduced with the following test (to be added in 
> {{EnumerableBatchNestedLoopJoinTest}}):
> {code}
>   @Test void doubleInnerBatchJoinTestSQL() {
> tester(false, new JdbcTest.HrSchema())
> .query("select e.name, d.name as dept, l.name as location "
> + "from emps e join depts d on d.deptno <> e.salary "
> + "join locations l on e.empid <> l.empid and d.deptno = l.empid")
> .withHook(Hook.PLANNER, (Consumer) planner -> {
>   planner.removeRule(EnumerableRules.ENUMERABLE_CORRELATE_RULE);
>   
> planner.addRule(EnumerableRules.ENUMERABLE_BATCH_NESTED_LOOP_JOIN_RULE);
> })
> .explainContains("EnumerableBatchNestedLoopJoin")
> .returnsUnordered("name=Bill; dept=Sales; location=San Francisco",
> "name=Eric; dept=Sales; location=San Francisco",
> "name=Sebastian; dept=Sales; location=San Francisco",
> "name=Theodore; dept=Sales; location=San Francisco");
>   }
> {code}
> Which causes:
> {code}
> Error while executing SQL "explain plan for select e.name, d.name as dept, 
> l.name as location from emps e join depts d on d.deptno <> e.salary join 
> locations l on e.empid <> l.empid and d.deptno = l.empid"
> ...
> Caused by: java.lang.RuntimeException: Error while applying rule 
> EnumerableBatchNestedLoopJoinRule,
> ...
> Caused by: java.lang.IllegalArgumentException: Field #0: empid JavaType(int) 
> does not exist for expression $cor1801
>   at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
> ...
>   at 
> org.apache.calcite.adapter.enumerable.EnumerableBatchNestedLoopJoinRule.onMatch(EnumerableBatchNestedLoopJoinRule.java:127)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
> ...
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CALCITE-4258) SqlToRelConverter: SELECT 1 IS [NOT] DISTINCT FROM NULL fails with AssertionError

2020-09-17 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L resolved CALCITE-4258.

Resolution: Fixed

Fixed via 
https://github.com/apache/calcite/commit/ea2b32e11dc6a0744fc93b8d33d7dfa1da70e674

> SqlToRelConverter: SELECT 1 IS [NOT] DISTINCT FROM NULL fails with 
> AssertionError
> -
>
> Key: CALCITE-4258
> URL: https://issues.apache.org/jira/browse/CALCITE-4258
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.26.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Problem can be reproduced with the following tests (in 
> core\src\test\resources\sql\misc.iq):
> {code}
> # [CALCITE-4258]
> SELECT 1 IS DISTINCT FROM NULL;
> ++
> | EXPR$0 |
> ++
> | true   |
> ++
> (1 row)
> !ok
> # [CALCITE-4258]
> SELECT 1 IS NOT DISTINCT FROM NULL;
> ++
> | EXPR$0 |
> ++
> | false  |
> ++
> (1 row)
> !ok
> {code}
> These tests fail with:
> {code}
> > java.lang.AssertionError: Conversion to relational algebra failed to 
> > preserve datatypes:
> > validated type:
> > RecordType(BOOLEAN NOT NULL EXPR$0) NOT NULL
> > converted type:
> > RecordType(BOOLEAN EXPR$0) NOT NULL
> > rel:
> > LogicalProject(EXPR$0=[null:BOOLEAN])
> >   LogicalValues(tuples=[[{ 0 }]])
> 44a47,109
> > at 
> > org.apache.calcite.sql2rel.SqlToRelConverter.checkConvertedType(SqlToRelConverter.java:466)
> > at 
> > org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:581)
> > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:242)
> > at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:208)
> ...
> {code}
> These queries used to work fine until recently, so this regression must have 
> occurred not so long ago.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-4223) Introducing column statistics to RelOptTable

2020-09-17 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197526#comment-17197526
 ] 

Julian Hyde commented on CALCITE-4223:
--

I see in the PR you have created {{interface ColStatistics}} and added a method 
to {{RelOptTable}} to get it.

I said above, and still think, that this is not the right approach. It does not 
easily allow people to add new kinds of metadata, and it does not accommodate 
differences in data structures that may have more information (e.g. a system 
that has a histogram that returns not just number of distinct values, but the 
number of distinct values between 100 and 1000).

I introduced {{interface Statistics}} to make the simple case easy. It is not a 
template that we should try to extend.

Suppose you could query any metadata interface on a {{RelOptTable}} using 
{{unwrap}}. Then you can easily implement metadata. For example, in 
{{RelMdSize}}:

{code}
  public Double averageRowSize(TableScan scan, RelMetadataQuery mq) {
final RelOptTable table = scan.getTable();
final BuiltInMetadata.Size size =
table.unwrap(BuiltInMetadata.Size.class);
if (size != null) {
  return size.averageRowSize();
}
return null;
  }
{code}

I think that is much more elegant and straightforward.

Of course the implementor of the particular type of table will have to 
implement the necessary interfaces, but I don't think that will be hard.

> Introducing column statistics to RelOptTable
> 
>
> Key: CALCITE-4223
> URL: https://issues.apache.org/jira/browse/CALCITE-4223
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Chunwei Lei
>Assignee: Chunwei Lei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Many systems depend on column statistics to compute more accurate stats, such 
> as NDV, average column size, and so on. It would be nice if Calcite can 
> provide such an interface.
> Column statistics might include NDV, average/max column length, number of 
> nulls, number of trues, number of falses and so on. 
> What do you think?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CALCITE-4223) Introducing column statistics to RelOptTable

2020-09-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-4223:

Labels: pull-request-available  (was: )

> Introducing column statistics to RelOptTable
> 
>
> Key: CALCITE-4223
> URL: https://issues.apache.org/jira/browse/CALCITE-4223
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Chunwei Lei
>Assignee: Chunwei Lei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Many systems depend on column statistics to compute more accurate stats, such 
> as NDV, average column size, and so on. It would be nice if Calcite can 
> provide such an interface.
> Column statistics might include NDV, average/max column length, number of 
> nulls, number of trues, number of falses and so on. 
> What do you think?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CALCITE-4242) Wrong plan for nested NOT EXISTS subqueries

2020-09-17 Thread Martin Raszyk (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197422#comment-17197422
 ] 

Martin Raszyk edited comment on CALCITE-4242 at 9/17/20, 6:45 AM:
--

I tried to use `LEFT SEMI JOIN` in PostgreSQL version `psql (PostgreSQL) 12.4 
(Ubuntu 12.4-0ubuntu0.20.04.1)`, but I didn't succeed.

The query

 
{code:java}
SELECT x FROM P LEFT SEMI JOIN Q ON TRUE;
{code}
leads to syntax error.

 

The query
{code:java}
SELECT x FROM P SEMI JOIN Q ON TRUE;
{code}
is evaluated, but does not yield the proper multiplicities.

Finally, the query
{code:java}
SELECT my_p.x FROM P AS my_p SEMI JOIN Q ON TRUE;
{code}
leads to syntax error, too.

Moreover, I said that "one could keep the LEFT JOIN right-associative for 
nested NOT EXISTS subqueries *without data dependencies crossing two nesting 
levels*". In my very original query
{code:java}
SELECT x FROM P
WHERE NOT EXISTS (
  SELECT y FROM Q
  WHERE NOT EXISTS (
SELECT z FROM R
WHERE x = z
  )
)
{code}
this is not the case and your most recent query which I fixed as follows so 
that it evaluates
{code:java}
WITH
q_present AS (SELECT TRUE as present FROM q),
r_present_given_z AS (SELECT TRUE as present, z FROM r GROUP BY z),
q_present_given_r_present_given_z AS (
SELECT TRUE AS present, r_present_given_z.present AS r_present, 
r_present_given_z.z AS r_z
FROM q_present
LEFT JOIN r_present_given_z ON TRUE
)
SELECT my_p.x
FROM P my_p
LEFT JOIN q_present_given_r_present_given_z ON 
q_present_given_r_present_given_z.r_z = my_p.x
WHERE NOT(
  q_present_given_r_present_given_z.present
  AND NOT q_present_given_r_present_given_z.r_present
);
{code}
is not equivalent to my very original query with the same counter-example as 
your very first suggestion:
{code:java}
CREATE TABLE P(x INTEGER);
CREATE TABLE Q(y INTEGER);
CREATE TABLE R(z INTEGER);
INSERT INTO P VALUES (1);
{code}


was (Author: mraszyk):
I tried to use `LEFT SEMI JOIN` in PostgreSQL version `psql (PostgreSQL) 12.4 
(Ubuntu 12.4-0ubuntu0.20.04.1)`, but I didn't succeed.

The query

 
{code:java}
SELECT x FROM P LEFT SEMI JOIN Q ON TRUE;
{code}
leads to syntax error.

 

The query
{code:java}
SELECT x FROM P SEMI JOIN Q ON TRUE;
{code}
is evaluated, but does not yield the proper multiplicities.

Finally, the query
{code:java}
SELECT my_p.x FROM P AS my_p SEMI JOIN Q ON TRUE;
{code}
leads to syntax error, too.

Moreover, I said that "one could keep the LEFT JOIN right-associative for 
nested NOT EXISTS subqueries *without data dependencies crossing two nesting 
levels*". In my very original query
{code:java}
SELECT x FROM P
WHERE NOT EXISTS (
  SELECT y FROM Q
  WHERE NOT EXISTS (
SELECT z FROM R
WHERE x = z
  )
)
{code}
this is not the case and your most recent query which I fixed as follows so 
that it evaluates
{code:java}
WITH
q_present AS (SELECT TRUE as present FROM q),
r_present_given_z AS (SELECT TRUE as present, z FROM r GROUP BY z),
q_present_given_r_present_given_z AS (
SELECT TRUE AS present, r_present_given_z.present AS r_present, 
r_present_given_z.z AS r_z
FROM q_present
LEFT JOIN r_present_given_z ON TRUE
)
SELECT my_p.x
FROM P my_p
LEFT JOIN q_present_given_r_present_given_z ON 
q_present_given_r_present_given_z.r_z = my_p.x
WHERE NOT(
  q_present_given_r_present_given_z.present
  AND NOT q_present_given_r_present_given_z.r_present
);
{code}
is not equivalent to my very original query with the same counter-example as 
you very first suggestion:
{code:java}
CREATE TABLE P(x INTEGER);
CREATE TABLE Q(y INTEGER);
CREATE TABLE R(z INTEGER);
INSERT INTO P VALUES (1);
{code}

> Wrong plan for nested NOT EXISTS subqueries
> ---
>
> Key: CALCITE-4242
> URL: https://issues.apache.org/jira/browse/CALCITE-4242
> Project: Calcite
>  Issue Type: Bug
>Reporter: Martin Raszyk
>Priority: Major
>
> Suppose we initialize an empty database as follows.
>  
> {code:java}
> CREATE TABLE P(x INTEGER);
> CREATE TABLE Q(y INTEGER);
> CREATE TABLE R(z INTEGER);
> INSERT INTO P VALUES (1);
> INSERT INTO Q VALUES (1);{code}
>  
> The following query is supposed to yield an empty table as the result.
>  
> {code:java}
> SELECT x FROM P
> WHERE NOT EXISTS (
>   SELECT y FROM Q
>   WHERE NOT EXISTS (
> SELECT z FROM R
> WHERE x = z
>   )
> ){code}
>  
> However, the query is parsed and converted to the following plan
> {code:java}
> LogicalProject(X=[$0])
>   LogicalFilter(condition=[IS NULL($2)])
> LogicalJoin(condition=[=($0, $1)], joinType=[left])
>   LogicalTableScan(table=[[Bug, P]])
>   LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
> LogicalProject(Z=[$1], $f0=[true])
>   LogicalFilter(condition=[IS NULL($2)])
> LogicalJoin(condition=[true], joinType=[left])
>   

[jira] [Commented] (CALCITE-4242) Wrong plan for nested NOT EXISTS subqueries

2020-09-17 Thread Martin Raszyk (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197422#comment-17197422
 ] 

Martin Raszyk commented on CALCITE-4242:


I tried to use `LEFT SEMI JOIN` in PostgreSQL version `psql (PostgreSQL) 12.4 
(Ubuntu 12.4-0ubuntu0.20.04.1)`, but I didn't succeed.

The query

 
{code:java}
SELECT x FROM P LEFT SEMI JOIN Q ON TRUE;
{code}
leads to syntax error.

 

The query
{code:java}
SELECT x FROM P SEMI JOIN Q ON TRUE;
{code}
is evaluated, but does not yield the proper multiplicities.

Finally, the query
{code:java}
SELECT my_p.x FROM P AS my_p SEMI JOIN Q ON TRUE;
{code}
leads to syntax error, too.

Moreover, I said that "one could keep the LEFT JOIN right-associative for 
nested NOT EXISTS subqueries *without data dependencies crossing two nesting 
levels*". In my very original query
{code:java}
SELECT x FROM P
WHERE NOT EXISTS (
  SELECT y FROM Q
  WHERE NOT EXISTS (
SELECT z FROM R
WHERE x = z
  )
)
{code}
this is not the case and your most recent query which I fixed as follows so 
that it evaluates
{code:java}
WITH
q_present AS (SELECT TRUE as present FROM q),
r_present_given_z AS (SELECT TRUE as present, z FROM r GROUP BY z),
q_present_given_r_present_given_z AS (
SELECT TRUE AS present, r_present_given_z.present AS r_present, 
r_present_given_z.z AS r_z
FROM q_present
LEFT JOIN r_present_given_z ON TRUE
)
SELECT my_p.x
FROM P my_p
LEFT JOIN q_present_given_r_present_given_z ON 
q_present_given_r_present_given_z.r_z = my_p.x
WHERE NOT(
  q_present_given_r_present_given_z.present
  AND NOT q_present_given_r_present_given_z.r_present
);
{code}
is not equivalent to my very original query with the same counter-example as 
you very first suggestion:
{code:java}
CREATE TABLE P(x INTEGER);
CREATE TABLE Q(y INTEGER);
CREATE TABLE R(z INTEGER);
INSERT INTO P VALUES (1);
{code}

> Wrong plan for nested NOT EXISTS subqueries
> ---
>
> Key: CALCITE-4242
> URL: https://issues.apache.org/jira/browse/CALCITE-4242
> Project: Calcite
>  Issue Type: Bug
>Reporter: Martin Raszyk
>Priority: Major
>
> Suppose we initialize an empty database as follows.
>  
> {code:java}
> CREATE TABLE P(x INTEGER);
> CREATE TABLE Q(y INTEGER);
> CREATE TABLE R(z INTEGER);
> INSERT INTO P VALUES (1);
> INSERT INTO Q VALUES (1);{code}
>  
> The following query is supposed to yield an empty table as the result.
>  
> {code:java}
> SELECT x FROM P
> WHERE NOT EXISTS (
>   SELECT y FROM Q
>   WHERE NOT EXISTS (
> SELECT z FROM R
> WHERE x = z
>   )
> ){code}
>  
> However, the query is parsed and converted to the following plan
> {code:java}
> LogicalProject(X=[$0])
>   LogicalFilter(condition=[IS NULL($2)])
> LogicalJoin(condition=[=($0, $1)], joinType=[left])
>   LogicalTableScan(table=[[Bug, P]])
>   LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
> LogicalProject(Z=[$1], $f0=[true])
>   LogicalFilter(condition=[IS NULL($2)])
> LogicalJoin(condition=[true], joinType=[left])
>   LogicalTableScan(table=[[Bug, Q]])
>   LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
> LogicalProject(Z=[$0], $f0=[true])
>   LogicalTableScan(table=[[Bug, R]])
> {code}
> that corresponds to the following SQL query
> {code:java}
> SELECT P.X
> FROM Bug.P
> LEFT JOIN (SELECT t0.Z, MIN(TRUE) AS $f1
> FROM Bug.Q
> LEFT JOIN (SELECT Z, MIN(TRUE) AS $f1
> FROM Bug.R
> GROUP BY Z) AS t0 ON TRUE
> WHERE t0.$f1 IS NULL
> GROUP BY t0.Z) AS t3 ON P.X = t3.Z
> WHERE t3.$f1 IS NULL
> {code}
> which yields the (non-empty) table P as the result.
> Hence, the parsed and converted query is not equivalent to the input query.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)