[jira] [Updated] (CALCITE-3954) Always compare types using equals

2020-04-23 Thread Danny Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chen updated CALCITE-3954:

Description: As discussed in CALCITE-3932, interning of data types cannot be 
guaranteed, so we must always compare data types using equals.

> Always compare types using equals
> -
>
> Key: CALCITE-3954
> URL: https://issues.apache.org/jira/browse/CALCITE-3954
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Danny Chen
>Assignee: Danny Chen
>Priority: Major
> Fix For: 1.23.0
>
>
> As discussed in CALCITE-3932, interning of data types cannot be guaranteed, 
> so we must always compare data types using equals.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3954) Always compare types using equals

2020-04-23 Thread Danny Chen (Jira)
Danny Chen created CALCITE-3954:
---

 Summary: Always compare types using equals
 Key: CALCITE-3954
 URL: https://issues.apache.org/jira/browse/CALCITE-3954
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.22.0
Reporter: Danny Chen
Assignee: Danny Chen
 Fix For: 1.23.0








[jira] [Updated] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan updated CALCITE-3932:
---
Fix Version/s: (was: 1.23.0)

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The data type cache is global and thread-safe, and types can be evicted 
> from it. Caching types globally seems unnecessary, because most of them are 
> RelRecordType instances, which are query-dependent and cannot be shared 
> between different queries.





[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091078#comment-17091078
 ] 

Haisheng Yuan commented on CALCITE-3932:


Yes, that is feasible. But what if downstream projects already use {{==}} for 
data type comparison? I don't know whether any do.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The data type cache is global and thread-safe, and types can be evicted 
> from it. Caching types globally seems unnecessary, because most of them are 
> RelRecordType instances, which are query-dependent and cannot be shared 
> between different queries.





[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091057#comment-17091057
 ] 

Julian Hyde commented on CALCITE-3932:
--

I agree, won't-fix is the right remedy here.

I wonder whether we should reconsider (as a different Jira case) our interning 
of types. Change from MUST intern to SHOULD intern, and always compare types 
using {{equals}}.

We clearly want to do *some* interning, especially within a query, so that 
there aren't hundreds of copies of the same record type all over the place. But 
if people don't intern, or intern in different query-specific caches, then the 
logic will still work.

If {{equals}} is written using the standard template
{code:java}
return this == o
  || o instanceof TheType && field1 == o.field1 && field2 == o.field2 {code}
(that is, avoiding deep comparison if possible) then the performance will be 
pretty much the same.
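
To make the template above concrete, here is a minimal sketch. {{SimpleType}} and its fields are hypothetical stand-ins for illustration, not Calcite classes. The identity check comes first, so interned instances never reach the field comparison, while non-interned but equal instances still compare as equal.

```java
import java.util.Objects;

/** Hypothetical stand-in for a type class; not Calcite's RelDataType. */
final class SimpleType {
  private final String name;
  private final int precision;

  SimpleType(String name, int precision) {
    this.name = name;
    this.precision = precision;
  }

  @Override public boolean equals(Object o) {
    // Fast path: interned instances are the same object and return here.
    if (this == o) {
      return true;
    }
    if (!(o instanceof SimpleType)) {
      return false;
    }
    // Slow path: only reached for distinct instances.
    SimpleType that = (SimpleType) o;
    return precision == that.precision && name.equals(that.name);
  }

  @Override public int hashCode() {
    // Must agree with equals so the type works as a hash key.
    return Objects.hash(name, precision);
  }
}
```

With this shape, code that calls {{equals}} keeps working whether or not both operands came from the same cache, whereas code that uses {{==}} silently depends on interning.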

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The data type cache is global and thread-safe, and types can be evicted 
> from it. Caching types globally seems unnecessary, because most of them are 
> RelRecordType instances, which are query-dependent and cannot be shared 
> between different queries.





[jira] [Comment Edited] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091057#comment-17091057
 ] 

Julian Hyde edited comment on CALCITE-3932 at 4/24/20, 12:39 AM:
-

I agree, won't-fix is the right remedy here.

I wonder whether we should reconsider (as a different Jira case) our interning 
of types. Change from MUST intern to SHOULD intern, and always compare types 
using {{equals}}.

We clearly want to do *some* interning, especially within a query, so that 
there aren't hundreds of copies of the same record type all over the place. But 
if people don't intern, or intern in different query-specific caches, then the 
logic will still work.

If {{equals}} is written using the standard template
{code:java}
return this == o
  || o instanceof TheType && field1 == o.field1 && field2 == o.field2 {code}
(that is, avoiding deep comparison if possible) then the performance will be 
pretty much the same.


was (Author: julianhyde):
I agree, won't-fix is the right remedy here.

I wonder whether we should reconsider (as a different Jira case) our interning 
of types. Change from MUST intern to SHOULD intern, and always compare types 
using {{equals}}.

We clearly want to do *some* interning, especially within a query, so that 
there aren't hundreds of copies of the same record type all over the place. But 
if people don't intern, or intern in different query-specific caches, then the 
logic will still work.

If {{equals}} is written using the standard template
{code:java}
return this == o
  || o instanceof TheType && field1 == o.field1 and field2 == o.field2 {code}
(that is, avoiding deep comparison if possible) then the performance will be 
pretty much the same.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The data type cache is global and thread-safe, and types can be evicted 
> from it. Caching types globally seems unnecessary, because most of them are 
> RelRecordType instances, which are query-dependent and cannot be shared 
> between different queries.





[jira] [Comment Edited] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091027#comment-17091027
 ] 

Julian Hyde edited comment on CALCITE-3952 at 4/24/20, 12:24 AM:
-

[~vgarg] {{select count(*) as c from foo order by c limit 100 offset 10}};

It should return an empty result, because the offset skips the single row 
that count(*) produces.


was (Author: hyuan):
[~vgarg] select count(*) as c from foo order by c limit 100 offset 10;

It should return empty.

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, it is safe to remove 
> the Sort (along with the limit).
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the result is logically equivalent, this can greatly benefit 
> physical plans by removing an extra operator and avoiding unnecessary data 
> transfer.





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091050#comment-17091050
 ] 

Julian Hyde commented on CALCITE-3952:
--

The SQL {{select * from (values 1) as t(x) order by x fetch 1 offset 10}} 
should return zero rows. After you remove the Sort it will return 1 row.
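
The effect can be sketched with plain Java streams. This is only an analogy, not Calcite code: {{sortOffsetFetch}} and {{canRemoveSort}} are hypothetical helpers, and the guard simply combines the conditions mentioned in this thread (at most one input row, no offset, fetch of at least 1).

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class SortOffsetSketch {
  /** Emulates ORDER BY x OFFSET offset FETCH fetch over an in-memory input. */
  static List<Integer> sortOffsetFetch(List<Integer> input, long offset, long fetch) {
    return input.stream()
        .sorted()
        .skip(offset)
        .limit(fetch)
        .collect(Collectors.toList());
  }

  /**
   * Hypothetical guard: dropping the Sort preserves the result only when the
   * input has at most one row, there is no offset, and fetch keeps that row.
   */
  static boolean canRemoveSort(double maxRowCount, long offset, long fetch) {
    return maxRowCount <= 1 && offset == 0 && fetch >= 1;
  }

  public static void main(String[] args) {
    List<Integer> values = Collections.singletonList(1); // like (values 1) as t(x)
    System.out.println(sortOffsetFetch(values, 10, 1));  // prints []
    System.out.println(values);                          // prints [1], the input without the Sort
    System.out.println(canRemoveSort(1, 10, 1));         // prints false
    System.out.println(canRemoveSort(1, 0, 100));        // prints true
  }
}
```

The first two lines of output differ, which is why a row-count check alone is not enough to justify removing the Sort.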

I don't think we need a boolean flag and two rule instances. One rule that 
checks both conditions.

Yes, I think it's worth doing the RelBuilder thing as well. 3 lines in 
RelBuilder and a simple test in RelBuilderTest. (I know it's a bit 
belt-and-braces.)

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, it is safe to remove 
> the Sort (along with the limit).
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the result is logically equivalent, this can greatly benefit 
> physical plans by removing an extra operator and avoiding unnecessary data 
> transfer.





[jira] [Comment Edited] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091050#comment-17091050
 ] 

Julian Hyde edited comment on CALCITE-3952 at 4/24/20, 12:23 AM:
-

The SQL {{select * from (values 1) as t(x) order by x fetch 1 offset 10}} 
should return zero rows. After you remove the Sort it will return 1 row.

I don't think we need a boolean flag and two rule instances. One rule that 
checks both conditions.

Yes, I think it's worth doing the RelBuilder thing as well. 3 lines in 
RelBuilder and a simple test in RelBuilderTest. (I know it's a bit 
belt-and-braces.)


was (Author: julianhyde):
The SQL {{select * from (values 1) as t(x) order by x fetch 1 offset 10}} 
should return zero rows. After you remove the Sort it will return 1 row.

I don't think we need a boolean flag and two rule instances. One rule that 
checks both conditions.

Yes, I think it's worth doing the RelBuilder thing as well. 3 lines in 
RelBuilder and a simple test in RelBuilderTest. (I know it's a bit 
belt-and-braces.)

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, it is safe to remove 
> the Sort (along with the limit).
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the result is logically equivalent, this can greatly benefit 
> physical plans by removing an extra operator and avoiding unnecessary data 
> transfer.





[jira] [Updated] (CALCITE-3946) Add parser support for MULTISET/SET and VOLATILE modifiers in CREATE TABLE statements

2020-04-23 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-3946:
-
Description: 
Add support to Calcite's Babel parser for {{MULTISET}}/{{SET}} and {{VOLATILE}} 
modifiers in {{CREATE TABLE}} statements.

The syntax for these statements is:
{code:sql}
 CREATE TABLE [SET|MULTISET] [VOLATILE] 
   [IF NOT EXISTS] ( , ...);
{code}

  was:
Add support to Calcite's Babel parser for MULTISET/SET and VOLATILE modifiers 
in CREATE TABLE statements.

The syntax for these statements is:
CREATE TABLE [SET|MULTISET] [VOLATILE]  [IF NOT EXISTS] 
( , ...);


> Add parser support for MULTISET/SET and VOLATILE modifiers in CREATE TABLE 
> statements
> -
>
> Key: CALCITE-3946
> URL: https://issues.apache.org/jira/browse/CALCITE-3946
> Project: Calcite
>  Issue Type: Improvement
>  Components: babel
>Affects Versions: 1.22.0
>Reporter: Drew Schmitt
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add support to Calcite's Babel parser for {{MULTISET}}/{{SET}} and 
> {{VOLATILE}} modifiers in {{CREATE TABLE}} statements.
> The syntax for these statements is:
> {code:sql}
>  CREATE TABLE [SET|MULTISET] [VOLATILE] 
>[IF NOT EXISTS] ( , ...);
> {code}





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091029#comment-17091029
 ] 

Haisheng Yuan commented on CALCITE-3952:


[~vgarg] Why do we need a configuration to turn it on/off? Isn't it always good?

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, it is safe to remove 
> the Sort (along with the limit).
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the result is logically equivalent, this can greatly benefit 
> physical plans by removing an extra operator and avoiding unnecessary data 
> transfer.





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091027#comment-17091027
 ] 

Haisheng Yuan commented on CALCITE-3952:


[~vgarg] select count(*) as c from foo order by c limit 100 offset 10;

It should return an empty result, because the offset skips the single row 
that count(*) produces.

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, it is safe to remove 
> the Sort (along with the limit).
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the result is logically equivalent, this can greatly benefit 
> physical plans by removing an extra operator and avoiding unnecessary data 
> transfer.





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Vineet Garg (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090994#comment-17090994
 ] 

Vineet Garg commented on CALCITE-3952:
--

bq. Given a rel which emits at most 1 row (RelMetadataQuery.getRowCount<=1), 
should RelMetadataQuery.collations(rel) match the Sort order?
[~jinxing6...@126.com] I don't think the sort order needs to match in this 
case. If the input produces at most 1 row, there is no need to sort.


> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, it is safe to remove 
> the Sort (along with the limit).
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the result is logically equivalent, this can greatly benefit 
> physical plans by removing an extra operator and avoiding unnecessary data 
> transfer.





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Vineet Garg (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090992#comment-17090992
 ] 

Vineet Garg commented on CALCITE-3952:
--

Thanks for the feedback [~julianhyde]. I have a few follow-up questions.
bq. When we generate the baseline xml file we insert tests in approximately 
alphabetical order. Reduces merge conflicts. Better than editing manually.
How do I regenerate these files automatically? Unlike mvn, running the test 
with gradle doesn't seem to generate an xml file under target/surefire.

bq. be sure to check for OFFSET. If there is an offset you can’t safely remove 
the Sort
Can you provide an example where it will not be safe? The patch (the pull 
request is open) currently checks whether the input has at most a single row 
and removes the sort only if LIMIT >= 1. I am not sure why an offset would 
change things here.

bq.  I noticed that RelBuilder skips Aggregate if getMaxRowCount <= 1. Maybe it 
could do the same for Sort. And then maybe you wouldn’t need to modify 
SortRemoveRule.
The pull request I have opened makes changes to SortRemoveRule. Maybe it is 
worthwhile having this logic in RelBuilder as well? Let me know your thoughts 
on this.

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, it is safe to remove 
> the Sort (along with the limit).
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the result is logically equivalent, this can greatly benefit 
> physical plans by removing an extra operator and avoiding unnecessary data 
> transfer.





[jira] [Resolved] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3932.

Resolution: Won't Fix

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The data type cache is global and thread-safe, and types can be evicted 
> from it. Caching types globally seems unnecessary, because most of them are 
> RelRecordType instances, which are query-dependent and cannot be shared 
> between different queries.





[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Haisheng Yuan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090927#comment-17090927
 ] 

Haisheng Yuan commented on CALCITE-3932:


For a thread-local cache, even if the thread is long-lived, we are fine as 
long as the cache is evictable. But the case of materialized views shared 
among connections is indeed a problem. I will close this as Won't Fix.

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The data type cache is global and thread-safe, and types can be evicted 
> from it. Caching types globally seems unnecessary, because most of them are 
> RelRecordType instances, which are query-dependent and cannot be shared 
> between different queries.





[jira] [Commented] (CALCITE-3932) Make data type cache thread local, non-evictable

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090839#comment-17090839
 ] 

Julian Hyde commented on CALCITE-3932:
--

It's possible that one thread will create a type and another thread will later 
use it. Especially in the scenario of materialized views, which are shared 
among connections and statements.

So a thread-local cache doesn't seem to be a good fit.

Also, threads might be long-lived and fairly numerous (because people use 
thread pools) so the cache clutter may build up.

A global (static) cache with a WeakInterner sounds more promising.
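
For reference, Guava offers {{Interners.newWeakInterner()}} for this purpose. The JDK-only sketch below shows the general idea under simplifying assumptions; {{WeakInternerSketch}} is a hypothetical name, and a coarse {{synchronized}} method stands in for whatever concurrency control a real cache would use.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical weak interner: entries vanish once nothing else uses them. */
final class WeakInternerSketch<T> {
  // Keys are held weakly; the value must also be weak, otherwise it would
  // keep its own key strongly reachable and the entry would never be freed.
  private final Map<T, WeakReference<T>> pool = new WeakHashMap<>();

  /** Returns the canonical instance equal to {@code value}. */
  synchronized T intern(T value) {
    WeakReference<T> ref = pool.get(value);
    T canonical = ref == null ? null : ref.get();
    if (canonical == null) {
      // First time we see this value (or the old one was collected).
      pool.put(value, new WeakReference<>(value));
      canonical = value;
    }
    return canonical;
  }
}
```

Because the pool never holds a strong reference, a type created for one query can be collected when that query finishes, which addresses the clutter concern without tying the cache to a thread.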

> Make data type cache thread local, non-evictable
> 
>
> Key: CALCITE-3932
> URL: https://issues.apache.org/jira/browse/CALCITE-3932
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Data type can be evicted out of cache, which is global, thread-safe. 
> It seems not necessary to cache them globally, because most of them are 
> RelRecordType, which is query dependent, not sharable between different 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3953) SqlToRelConverter creates char literal with coercibility IMPLICIT

2020-04-23 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090726#comment-17090726
 ] 

Ruben Q L commented on CALCITE-3953:


Relevant code is in {{SqlLiteral#createSqlType}}:
{code}
...
case CHAR:
  NlsString string = (NlsString) value;
  Charset charset = string.getCharset();
  if (null == charset) {
    charset = typeFactory.getDefaultCharset();
  }
  SqlCollation collation = string.getCollation();
  if (null == collation) {
    collation = SqlCollation.COERCIBLE;
  }
  RelDataType type =
      typeFactory.createSqlType(
          SqlTypeName.CHAR,
          string.getValue().length());
  type =
      typeFactory.createTypeWithCharsetAndCollation(
          type,
          charset,
          collation);
  return type;
{code}

In the test from the description, that code does NOT return a type {{CHAR(3)}} 
with {{SqlCollation.COERCIBLE}}; in fact it returns a {{CHAR(3)}} with 
{{SqlCollation.IMPLICIT}}. The root cause is 
{{typeFactory.createTypeWithCharsetAndCollation}}, which internally does a 
{{return canonize(newType);}}. This canonization "breaks" the type's 
SqlCollation, because the type's digest (BasicSqlType#generateTypeString) does 
not (always) include the SqlCollation. In addition, SqlCollation's name does 
not include coercibility info.
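
The mechanism can be illustrated with a small self-contained sketch. Everything in it is hypothetical ({{CharType}}, {{digest}}, and this {{canonize}} are simplified stand-ins, not Calcite's implementation); it only assumes that canonization looks types up by a digest string that leaves out coercibility.

```java
import java.util.HashMap;
import java.util.Map;

class DigestCanonizationSketch {
  enum Coercibility { IMPLICIT, COERCIBLE }

  /** Hypothetical stand-in for a character type; not Calcite's BasicSqlType. */
  static final class CharType {
    final int length;
    final Coercibility coercibility;

    CharType(int length, Coercibility coercibility) {
      this.length = length;
      this.coercibility = coercibility;
    }

    /** The digest omits coercibility (assumed here for illustration). */
    String digest() {
      return "CHAR(" + length + ")";
    }
  }

  private final Map<String, CharType> cache = new HashMap<>();

  /** Returns the first type that was registered under the same digest. */
  CharType canonize(CharType type) {
    return cache.computeIfAbsent(type.digest(), k -> type);
  }

  public static void main(String[] args) {
    DigestCanonizationSketch factory = new DigestCanonizationSketch();
    // Some earlier code registers CHAR(3) with IMPLICIT coercibility.
    factory.canonize(new CharType(3, Coercibility.IMPLICIT));
    // The literal asks for COERCIBLE but gets the cached instance back.
    CharType literal = factory.canonize(new CharType(3, Coercibility.COERCIBLE));
    System.out.println(literal.coercibility); // prints IMPLICIT
  }
}
```

Any attribute that is missing from the lookup key is effectively decided by whichever instance reached the cache first, which matches the behavior observed in the test.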

> SqlToRelConverter creates char literal with coercibility IMPLICIT
> -
>
> Key: CALCITE-3953
> URL: https://issues.apache.org/jira/browse/CALCITE-3953
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Ruben Q L
>Priority: Major
>
> The problem can be reproduced with the following test (to be added in 
> SqlToRelConverterTest):
> {code:java}
>   @Test void testLiteralCoercibility() {
>     final String sql = "select * from dept where name = 'abc'";
>     final RelNode rel = tester.convertSqlToRel(sql).rel;
>     final List<LogicalFilter> filters = new ArrayList<>();
>     final RelShuttleImpl visitor = new RelShuttleImpl() {
>       @Override public RelNode visit(LogicalFilter filter) {
>         filters.add(filter);
>         return super.visit(filter);
>       }
>     };
>     visitor.visit(rel);
>     assertThat(filters.size(), is(1));
>     assertThat(filters.get(0).getCondition(), instanceOf(RexCall.class));
>     final RexCall call = (RexCall) filters.get(0).getCondition();
>     final RexNode literal = call.getOperands().stream()
>         .filter(RexLiteral.class::isInstance).findFirst().orElse(null);
>     assertThat(literal, notNullValue());
>     assertThat(literal.getType().getCollation(), notNullValue());
>     assertThat(literal.getType().getCollation().getCoercibility(),
>         is(SqlCollation.Coercibility.COERCIBLE));
>   }
> {code}
> The test fails with the message:
> {code:java}
> java.lang.AssertionError: 
> Expected: is <COERCIBLE>
>      but: was <IMPLICIT>
> {code}
> According to {{SqlCollation.Coercibility}} javadoc:
>  _A character value expression consisting of a value other than a column 
> (e.g., a host variable or a literal) has the coercibility characteristic 
> Coercible, with the default collation for its character repertoire._





[jira] [Updated] (CALCITE-3953) SqlToRelConverter creates char literal with coercibility IMPLICIT

2020-04-23 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-3953:
---
Description: 
The problem can be reproduced with the following test (to be added in 
SqlToRelConverterTest):
{code:java}
  @Test void testLiteralCoercibility() {
    final String sql = "select * from dept where name = 'abc'";
    final RelNode rel = tester.convertSqlToRel(sql).rel;
    final List<LogicalFilter> filters = new ArrayList<>();
    final RelShuttleImpl visitor = new RelShuttleImpl() {
      @Override public RelNode visit(LogicalFilter filter) {
        filters.add(filter);
        return super.visit(filter);
      }
    };
    visitor.visit(rel);
    assertThat(filters.size(), is(1));
    assertThat(filters.get(0).getCondition(), instanceOf(RexCall.class));
    final RexCall call = (RexCall) filters.get(0).getCondition();
    final RexNode literal = call.getOperands().stream()
        .filter(RexLiteral.class::isInstance).findFirst().orElse(null);
    assertThat(literal, notNullValue());
    assertThat(literal.getType().getCollation(), notNullValue());
    assertThat(literal.getType().getCollation().getCoercibility(),
        is(SqlCollation.Coercibility.COERCIBLE));
  }
{code}
The test fails with the message:
{code:java}
java.lang.AssertionError: 
Expected: is <COERCIBLE>
     but: was <IMPLICIT>
{code}
According to {{SqlCollation.Coercibility}} javadoc:
 _A character value expression consisting of a value other than a column (e.g., 
a host variable or a literal) has the coercibility characteristic Coercible, 
with the default collation for its character repertoire._

  was:
The problem can be reproduced with the following test (to be added in 
SqlToRelConverterTest):
{code:java}
  @Test void testLiteralCoercibility() {
final String sql = "select * from dept where name = 'abc'";
final RelNode rel = tester.convertSqlToRel(sql).rel;
final List filters = new ArrayList<>();
final RelShuttleImpl visitor = new RelShuttleImpl() {
  @Override public RelNode visit(LogicalFilter filter) {
filters.add(filter);
return super.visit(filter);
  }
};
visitor.visit(rel);
assertThat(filters.size(), is(1));
assertThat(filters.get(0).getCondition(), instanceOf(RexCall.class));
RexCall call = (RexCall) filters.get(0).getCondition();
RexNode literal = 
call.getOperands().stream().filter(RexLiteral.class::isInstance).findFirst().orElse(null);
assertThat (literal, notNullValue());
assertThat (literal.getType().getCollation(), notNullValue());
assertThat (literal.getType().getCollation().getCoercibility(), 
is(SqlCollation.Coercibility.COERCIBLE));
  }
{code}
Which fails with the message:
{code:java}
java.lang.AssertionError: 
Expected: is 
 but: was 
{code}
According to {{SqlCollation.Coercibility}} javadoc:
 _A character value expression consisting of a value other than a column (e.g., 
a host variable or a literal) has the coercibility characteristic Coercible, 
with the default collation for its character repertoire._


> SqlToRelConverter creates char literal with coercibility IMPLICIT
> -
>
> Key: CALCITE-3953
> URL: https://issues.apache.org/jira/browse/CALCITE-3953
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.22.0
>Reporter: Ruben Q L
>Priority: Major
>
> The problem can be reproduced with the following test (to be added in 
> SqlToRelConverterTest):
> {code:java}
>   @Test void testLiteralCoercibility() {
>     final String sql = "select * from dept where name = 'abc'";
>     final RelNode rel = tester.convertSqlToRel(sql).rel;
>     final List<LogicalFilter> filters = new ArrayList<>();
>     final RelShuttleImpl visitor = new RelShuttleImpl() {
>       @Override public RelNode visit(LogicalFilter filter) {
>         filters.add(filter);
>         return super.visit(filter);
>       }
>     };
>     visitor.visit(rel);
>     assertThat(filters.size(), is(1));
>     assertThat(filters.get(0).getCondition(), instanceOf(RexCall.class));
>     final RexCall call = (RexCall) filters.get(0).getCondition();
>     final RexNode literal = call.getOperands().stream()
>         .filter(RexLiteral.class::isInstance).findFirst().orElse(null);
>     assertThat(literal, notNullValue());
>     assertThat(literal.getType().getCollation(), notNullValue());
>     assertThat(literal.getType().getCollation().getCoercibility(),
>         is(SqlCollation.Coercibility.COERCIBLE));
>   }
> {code}
> The test fails with the message:
> {code:java}
> java.lang.AssertionError: 
> Expected: is <COERCIBLE>
>      but: was <IMPLICIT>
> {code}
> According to {{SqlCollation.Coercibility}} javadoc:
>  _A character value expression consisting of a value other than a column 
> (e.g., a host variable or a literal) has the coercibility characteristic 
> Coercible, with the default collation 

[jira] [Created] (CALCITE-3953) SqlToRelConverter creates char literal with coercibility IMPLICIT

2020-04-23 Thread Ruben Q L (Jira)
Ruben Q L created CALCITE-3953:
--

 Summary: SqlToRelConverter creates char literal with coercibility 
IMPLICIT
 Key: CALCITE-3953
 URL: https://issues.apache.org/jira/browse/CALCITE-3953
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.22.0
Reporter: Ruben Q L


The problem can be reproduced with the following test (to be added in 
SqlToRelConverterTest):
{code:java}
  @Test void testLiteralCoercibility() {
    final String sql = "select * from dept where name = 'abc'";
    final RelNode rel = tester.convertSqlToRel(sql).rel;
    // Collect every LogicalFilter found while walking the plan.
    final List<LogicalFilter> filters = new ArrayList<>();
    final RelShuttleImpl visitor = new RelShuttleImpl() {
      @Override public RelNode visit(LogicalFilter filter) {
        filters.add(filter);
        return super.visit(filter);
      }
    };
    visitor.visit(rel);
    assertThat(filters.size(), is(1));
    assertThat(filters.get(0).getCondition(), instanceOf(RexCall.class));
    RexCall call = (RexCall) filters.get(0).getCondition();
    // The literal operand of the comparison, i.e. 'abc'.
    RexNode literal = call.getOperands().stream()
        .filter(RexLiteral.class::isInstance)
        .findFirst()
        .orElse(null);
    assertThat(literal, notNullValue());
    assertThat(literal.getType().getCollation(), notNullValue());
    assertThat(literal.getType().getCollation().getCoercibility(),
        is(SqlCollation.Coercibility.COERCIBLE));
  }
{code}
Which fails with the message:
{code:java}
java.lang.AssertionError: 
Expected: is <COERCIBLE>
     but: was <IMPLICIT>
{code}
According to {{SqlCollation.Coercibility}} javadoc:
 _A character value expression consisting of a value other than a column (e.g., 
a host variable or a literal) has the coercibility characteristic Coercible, 
with the default collation for its character repertoire._
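The rule quoted above can be sketched in plain Java. This is only an illustration, not Calcite code: the class `CoercibilitySketch`, the enum `ExprKind` and the method `coercibilityOf` are hypothetical names, and the nested `Coercibility` enum merely mirrors the constants of {{SqlCollation.Coercibility}}. The column and COLLATE cases are filled in from the SQL standard rules as I understand them.

```java
/** Illustrative sketch only; all names here are hypothetical, not Calcite API. */
final class CoercibilitySketch {
  /** Mirrors the constants of SqlCollation.Coercibility. */
  enum Coercibility { EXPLICIT, IMPLICIT, COERCIBLE, NONE }

  /** Simplified kinds of character value expression (assumption). */
  enum ExprKind { COLUMN_REF, LITERAL, HOST_VARIABLE, EXPLICIT_COLLATE }

  /** Returns the coercibility expected for the given kind of expression. */
  static Coercibility coercibilityOf(ExprKind kind) {
    switch (kind) {
    case EXPLICIT_COLLATE:
      // An explicit COLLATE clause wins over everything else.
      return Coercibility.EXPLICIT;
    case COLUMN_REF:
      // A column carries the collation it was declared with.
      return Coercibility.IMPLICIT;
    case LITERAL:
    case HOST_VARIABLE:
      // Values other than a column get the default collation.
      return Coercibility.COERCIBLE;
    default:
      throw new AssertionError(kind);
    }
  }

  public static void main(String[] args) {
    System.out.println(coercibilityOf(ExprKind.LITERAL));
    System.out.println(coercibilityOf(ExprKind.COLUMN_REF));
  }
}
```

Under this reading, the literal 'abc' in the test should report COERCIBLE, while the column it is compared with would report IMPLICIT.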





[jira] [Assigned] (CALCITE-3950) Doc of SqlGroupingFunction contradicts its behavior

2020-04-23 Thread Jin Xing (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Xing reassigned CALCITE-3950:
-

Assignee: Jin Xing

> Doc of SqlGroupingFunction contradicts its behavior
> ---
>
> Key: CALCITE-3950
> URL: https://issues.apache.org/jira/browse/CALCITE-3950
> Project: Calcite
>  Issue Type: Bug
>Reporter: Jin Xing
>Assignee: Jin Xing
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the doc of SqlGroupingFunction says:
> {code:java}
> /**
>  * The {@code GROUPING} function.
>  *
>  * Accepts 1 or more arguments.
>  * Example: {@code GROUPING(deptno, gender)} returns
>  * 3 if both deptno and gender are being grouped,
>  * 2 if only deptno is being grouped,
>  * 1 if only gender is being grouped,
>  * 0 if neither deptno nor gender are being grouped.
> {code}
> But its behavior in agg.iq is as below:
> {code:java}
> # GROUPING in SELECT clause of CUBE query
> select deptno, job, count(*) as c, grouping(deptno) as d,
>   grouping(job) j, grouping(deptno, job) as x
> from "scott".emp
> group by cube(deptno, job);
> +--------+-----------+----+---+---+---+
> | DEPTNO | JOB       | C  | D | J | X |
> +--------+-----------+----+---+---+---+
> |     10 | CLERK     |  1 | 0 | 0 | 0 |
> |     10 | MANAGER   |  1 | 0 | 0 | 0 |
> |     10 | PRESIDENT |  1 | 0 | 0 | 0 |
> |     10 |           |  3 | 0 | 1 | 1 |
> |     20 | ANALYST   |  2 | 0 | 0 | 0 |
> |     20 | CLERK     |  2 | 0 | 0 | 0 |
> |     20 | MANAGER   |  1 | 0 | 0 | 0 |
> |     20 |           |  5 | 0 | 1 | 1 |
> |     30 | CLERK     |  1 | 0 | 0 | 0 |
> |     30 | MANAGER   |  1 | 0 | 0 | 0 |
> |     30 | SALESMAN  |  4 | 0 | 0 | 0 |
> |     30 |           |  6 | 0 | 1 | 1 |
> |        | ANALYST   |  2 | 1 | 0 | 2 |
> |        | CLERK     |  4 | 1 | 0 | 2 |
> |        | MANAGER   |  3 | 1 | 0 | 2 |
> |        | PRESIDENT |  1 | 1 | 0 | 2 |
> |        | SALESMAN  |  4 | 1 | 0 | 2 |
> |        |           | 14 | 1 | 1 | 3 |
> +--------+-----------+----+---+---+---+
> (18 rows)
> {code}
>  
> The doc needs to be corrected so that it is consistent with the query result and 
> with the behavior of Hive [1] and PostgreSQL [2].
>  [1] 
> [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction]
>  [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] 
>  
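The behavior shown in the agg.iq output can be reproduced with a few lines of plain Java. This is a sketch for illustration, not the Calcite implementation: `GroupingSketch` and `grouping` are hypothetical names. It assumes that each argument contributes one bit, leftmost argument most significant, and that the bit is 1 when the column is not part of the current grouping set (that is, it has been rolled up).

```java
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; names are hypothetical, not Calcite API. */
final class GroupingSketch {
  /**
   * Computes the GROUPING value for the given argument columns.
   * A bit is set when the column is NOT in the current grouping set.
   */
  static long grouping(List<String> args, Set<String> groupingSet) {
    long result = 0;
    for (String arg : args) {
      result <<= 1;                     // earlier arguments are more significant
      if (!groupingSet.contains(arg)) {
        result |= 1;                    // column is rolled up in this row
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> cols = List.of("deptno", "job");
    System.out.println(grouping(cols, Set.of("deptno", "job"))); // 0
    System.out.println(grouping(cols, Set.of("deptno")));        // 1
    System.out.println(grouping(cols, Set.of("job")));           // 2
    System.out.println(grouping(cols, Set.of()));                // 3
  }
}
```

These four values match column X of the result above: 0 for rows where both DEPTNO and JOB are present, 1 where only JOB is blank, 2 where only DEPTNO is blank, and 3 for the grand-total row.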





[jira] [Updated] (CALCITE-3950) Doc of SqlGroupingFunction contradicts its behavior

2020-04-23 Thread Jin Xing (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Xing updated CALCITE-3950:
--
Summary: Doc of SqlGroupingFunction contradicts its behavior  (was: Doc of 
SqlGroupingFunction contradicts with its behavior)

> Doc of SqlGroupingFunction contradicts its behavior
> ---
>
> Key: CALCITE-3950
> URL: https://issues.apache.org/jira/browse/CALCITE-3950
> Project: Calcite
>  Issue Type: Bug
>Reporter: Jin Xing
>Priority: Major
>
> Currently the doc of SqlGroupingFunction says:
> {code:java}
> /**
>  * The {@code GROUPING} function.
>  *
>  * Accepts 1 or more arguments.
>  * Example: {@code GROUPING(deptno, gender)} returns
>  * 3 if both deptno and gender are being grouped,
>  * 2 if only deptno is being grouped,
>  * 1 if only gender is being grouped,
>  * 0 if neither deptno nor gender are being grouped.
> {code}
> But its behavior in agg.iq is as below:
> {code:java}
> # GROUPING in SELECT clause of CUBE query
> select deptno, job, count(*) as c, grouping(deptno) as d,
>   grouping(job) j, grouping(deptno, job) as x
> from "scott".emp
> group by cube(deptno, job);
> +--------+-----------+----+---+---+---+
> | DEPTNO | JOB       | C  | D | J | X |
> +--------+-----------+----+---+---+---+
> |     10 | CLERK     |  1 | 0 | 0 | 0 |
> |     10 | MANAGER   |  1 | 0 | 0 | 0 |
> |     10 | PRESIDENT |  1 | 0 | 0 | 0 |
> |     10 |           |  3 | 0 | 1 | 1 |
> |     20 | ANALYST   |  2 | 0 | 0 | 0 |
> |     20 | CLERK     |  2 | 0 | 0 | 0 |
> |     20 | MANAGER   |  1 | 0 | 0 | 0 |
> |     20 |           |  5 | 0 | 1 | 1 |
> |     30 | CLERK     |  1 | 0 | 0 | 0 |
> |     30 | MANAGER   |  1 | 0 | 0 | 0 |
> |     30 | SALESMAN  |  4 | 0 | 0 | 0 |
> |     30 |           |  6 | 0 | 1 | 1 |
> |        | ANALYST   |  2 | 1 | 0 | 2 |
> |        | CLERK     |  4 | 1 | 0 | 2 |
> |        | MANAGER   |  3 | 1 | 0 | 2 |
> |        | PRESIDENT |  1 | 1 | 0 | 2 |
> |        | SALESMAN  |  4 | 1 | 0 | 2 |
> |        |           | 14 | 1 | 1 | 3 |
> +--------+-----------+----+---+---+---+
> (18 rows)
> {code}
>  
> The doc needs to be corrected so that it is consistent with the query result and 
> with the behavior of Hive [1] and PostgreSQL [2].
>  [1] 
> [https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup?spm=ata.13261165.0.0.528c6dfcXalQFy#EnhancedAggregation,Cube,GroupingandRollup-Groupingfunction]
>  [2] [https://www.postgresql.org/docs/9.5/functions-aggregate.html] 
>  





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Jin Xing (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090431#comment-17090431
 ] 

Jin Xing commented on CALCITE-3952:
---

Yes, [~julianhyde], it should be getMaxRowCount :D

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> If a query is guaranteed to produce at most one row, it is safe to remove the 
> Sort (along with the limit). 
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the plans are logically equivalent, this can greatly benefit physical 
> plans by removing an extra operator and avoiding unnecessary data transfer.





[jira] [Commented] (CALCITE-3878) Make ArrayList creation with initial capacity when size is fixed

2020-04-23 Thread neoremind (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090376#comment-17090376
 ] 

neoremind commented on CALCITE-3878:


Thanks Haisheng for moving this forward :)

> Make ArrayList creation with initial capacity when size is fixed
> 
>
> Key: CALCITE-3878
> URL: https://issues.apache.org/jira/browse/CALCITE-3878
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.22.0
>Reporter: neoremind
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I found many places in Calcite where _new ArrayList<>()_ is used. If the list 
> is expected to be immutable or never resized, it is good practice to 
> create it with an initial capacity, which is better for memory usage and performance.
> I searched all occurrences, focusing on the core module. To be safe, I only 
> updated local variables that have a fixed size and are not used in recursive methods. 
> If the local variable reference goes out of scope and resizing is needed, 
> things will still work normally, so there is no side effect; but for the "escaping" 
> case, I am very conservative and do not change them.
>  
>  
>  
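A minimal sketch of the pattern described above, using only the JDK. The class and method names (`PresizedListSketch`, `upperCaseAll`) are hypothetical and not taken from the Calcite patch; the point is simply that a local list whose final size is known up front can be given that size as its initial capacity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative sketch only; names are hypothetical. */
final class PresizedListSketch {
  /** Builds a result list whose size is known before the loop starts. */
  static List<String> upperCaseAll(List<String> names) {
    // Exactly names.size() elements will be added, so the backing array
    // is allocated once and never has to grow.
    final List<String> result = new ArrayList<>(names.size());
    for (String name : names) {
      result.add(name.toUpperCase(Locale.ROOT));
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(upperCaseAll(List.of("deptno", "job")));
  }
}
```

The initial capacity is only a hint: if more elements are added later, the list still grows as usual, which is why the change has no functional side effect.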





[jira] [Commented] (CALCITE-2223) ProjectMergeRule is infinitely matched when is applied after ProjectReduceExpressionsRule

2020-04-23 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090354#comment-17090354
 ] 

Ruben Q L commented on CALCITE-2223:


Thanks [~hyuan], I agree.

[~volodymyr], I have tried the test provided in the description of the current 
ticket, and it runs successfully (no OOM). I guess we can consider the current 
ticket as "resolved".

> ProjectMergeRule is infinitely matched when is applied after 
> ProjectReduceExpressionsRule
> -
>
> Key: CALCITE-2223
> URL: https://issues.apache.org/jira/browse/CALCITE-2223
> Project: Calcite
>  Issue Type: Bug
>Reporter: Vova Vysotskyi
>Priority: Critical
>  Labels: pull-request-available
> Attachments: 
> TestLimitWithExchanges_testPushLimitPastUnionExchange.png, heap_overview.png, 
> provenance_contents.png
>
>
> For queries like this:
> {code:sql}
> select t1.f from (select cast(f as int) f, f from (select cast(f as int) f 
> from (values('1')) t(f))) as t1
> {code}
> OOM is thrown when {{ProjectMergeRule}} is applied before applying 
> {{ProjectReduceExpressionsRule}} in VolcanoPlanner.
>  A simple test to reproduce this issue (in {{RelOptRulesTest}}):
> {code:java}
>   @Test public void testOomProjectMergeRule() {
>     RelBuilder relBuilder =
>         RelBuilder.create(RelBuilderTest.config().build());
>     RelNode relNode = relBuilder
>         .values(new String[]{"f"}, "1")
>         .project(
>             relBuilder.alias(
>                 relBuilder.cast(relBuilder.field(0), SqlTypeName.INTEGER),
>                 "f"))
>         .project(
>             relBuilder.alias(
>                 relBuilder.cast(relBuilder.field(0), SqlTypeName.INTEGER),
>                 "f0"),
>             relBuilder.alias(relBuilder.field(0), "f"))
>         .project(
>             relBuilder.alias(relBuilder.field(0), "f"))
>         .build();
>     RelOptPlanner planner = relNode.getCluster().getPlanner();
>     RuleSet ruleSet =
>         RuleSets.ofList(
>             ReduceExpressionsRule.PROJECT_INSTANCE,
>             new ProjectMergeRuleWithLongerName(),
>             EnumerableRules.ENUMERABLE_PROJECT_RULE,
>             EnumerableRules.ENUMERABLE_VALUES_RULE);
>     Program program = Programs.of(ruleSet);
>     RelTraitSet toTraits =
>         relNode.getCluster().traitSet()
>             .replace(0, EnumerableConvention.INSTANCE);
>     RelNode output = program.run(planner, relNode, toTraits,
>         ImmutableList.of(), ImmutableList.of());
>     // check for output
>   }
>
>   /**
>    * ProjectMergeRule inheritor which has
>    * class name greater than ProjectReduceExpressionsRule class name
>    * (String.compareTo()).
>    *
>    * It is needed for RuleQueue.popMatch() method
>    * to apply this rule before ProjectReduceExpressionsRule.
>    */
>   private static class ProjectMergeRuleWithLongerName
>       extends ProjectMergeRule {
>     public ProjectMergeRuleWithLongerName() {
>       super(true, RelFactories.LOGICAL_BUILDER);
>     }
>   }
> {code}





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090339#comment-17090339
 ] 

Julian Hyde commented on CALCITE-3952:
--

[~jinxing6...@126.com] Do not use getRowCount for these purposes. It is 
approximate. Use getMaxRowCount.

[~vgarg] I noticed that RelBuilder skips Aggregate if getMaxRowCount <= 1. 
Maybe it could do the same for Sort. And then maybe you wouldn’t need to modify 
SortRemoveRule.
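To make the distinction concrete, here is a small stand-alone sketch of the check. It is not Calcite code: `SortRemovalSketch` and `sortIsRedundant` are hypothetical names, and the nullable `Double` parameter is an assumption standing in for whatever value a max-row-count metadata call would hand back, with null meaning "no bound known".

```java
/** Illustrative sketch only; names are hypothetical, not Calcite API. */
final class SortRemovalSketch {
  /**
   * Decides whether sorting is a no-op, based on a guaranteed upper
   * bound on the number of rows rather than on an estimate.
   *
   * @param maxRowCount guaranteed upper bound, or null if unknown (assumption)
   */
  static boolean sortIsRedundant(Double maxRowCount) {
    // An estimate of "about one row" is not enough; only a hard bound
    // of at most one row makes the ordering trivially satisfied.
    return maxRowCount != null && maxRowCount <= 1d;
  }

  public static void main(String[] args) {
    System.out.println(sortIsRedundant(1d));
    System.out.println(sortIsRedundant(Double.POSITIVE_INFINITY));
    System.out.println(sortIsRedundant(null));
  }
}
```

An aggregate without GROUP BY, as in the example query, is the typical case where such a bound of 1 is available.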

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> If a query is guaranteed to produce at most one row, it is safe to remove the 
> Sort (along with the limit). 
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the plans are logically equivalent, this can greatly benefit physical 
> plans by removing an extra operator and avoiding unnecessary data transfer.





[jira] [Updated] (CALCITE-3951) Support different string comparison based on SqlCollation

2020-04-23 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-3951:
---
Summary: Support different string comparison based on SqlCollation  (was: 
Support different comparison based on collation)

> Support different string comparison based on SqlCollation
> -
>
> Key: CALCITE-3951
> URL: https://issues.apache.org/jira/browse/CALCITE-3951
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Ruben Q L
>Assignee: Ruben Q L
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently SqlCollation defines concepts like Coercibility, Charset, Locale, 
> etc. However, we cannot specify on a certain collation that e.g. a string 
> field should use case insensitive comparison. The goal of this ticket is to 
> evolve SqlCollation to support that, and adapt the corresponding classes to 
> use that (optional) "non-standard" comparison.
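As a rough illustration of what a collation-specific comparison could look like, the JDK's `java.text.Collator` already supports case-insensitive comparison through its strength setting. The sketch below is an assumption about one possible approach, not the design adopted in this ticket; `CollationComparisonSketch` and `comparatorFor` are hypothetical names.

```java
import java.text.Collator;
import java.util.Comparator;
import java.util.Locale;

/** Illustrative sketch only; names are hypothetical, not Calcite API. */
final class CollationComparisonSketch {
  /** Returns a string comparator for the locale, optionally ignoring case. */
  static Comparator<String> comparatorFor(Locale locale, boolean caseInsensitive) {
    final Collator collator = Collator.getInstance(locale);
    // SECONDARY strength ignores case differences but still distinguishes
    // accents; TERTIARY also distinguishes upper case from lower case.
    collator.setStrength(caseInsensitive ? Collator.SECONDARY : Collator.TERTIARY);
    return (a, b) -> collator.compare(a, b);
  }

  public static void main(String[] args) {
    Comparator<String> ci = comparatorFor(Locale.ENGLISH, true);
    Comparator<String> cs = comparatorFor(Locale.ENGLISH, false);
    System.out.println(ci.compare("abc", "ABC") == 0);
    System.out.println(cs.compare("abc", "ABC") == 0);
  }
}
```

A collation that carried such a comparator could then be consulted wherever two string values of that collation are compared or sorted.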





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090331#comment-17090331
 ] 

Julian Hyde commented on CALCITE-3952:
--

When we generate the baseline xml file we insert tests in approximately 
alphabetical order. Reduces merge conflicts. Better than editing manually.

LIMIT 0 should already be handled by PruneEmptyRules.

[~vgarg] be sure to check for OFFSET. If there is an offset you can’t safely 
remove the Sort. (But maybe PruneEmptyRules could handle this case.)
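The OFFSET caveat can be seen with a tiny simulation in plain Java. This is only a sketch: `OffsetSketch` and `sortOffsetFetch` are hypothetical names, and the method imitates what a Sort with OFFSET and FETCH would return for an in-memory list of rows.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; names are hypothetical, not Calcite API. */
final class OffsetSketch {
  /** Sorts the rows, skips {@code offset} of them, then keeps at most {@code fetch}. */
  static List<Integer> sortOffsetFetch(List<Integer> rows, int offset, int fetch) {
    final List<Integer> sorted = new ArrayList<>(rows);
    sorted.sort(Comparator.naturalOrder());
    final int from = Math.min(offset, sorted.size());
    final int to = Math.min(from + fetch, sorted.size());
    return new ArrayList<>(sorted.subList(from, to));
  }

  public static void main(String[] args) {
    List<Integer> oneRow = List.of(42);
    // No offset: the result equals the input, so the Sort adds nothing.
    System.out.println(sortOffsetFetch(oneRow, 0, 100));
    // Offset 1: the single row is skipped, so the Sort cannot simply be dropped.
    System.out.println(sortOffsetFetch(oneRow, 1, 100));
  }
}
```

With a one-row input, removing the Sort preserves the result only when the offset is zero; with any positive offset the correct result is empty.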

 

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> If a query is guaranteed to produce at most one row, it is safe to remove the 
> Sort (along with the limit). 
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the plans are logically equivalent, this can greatly benefit physical 
> plans by removing an extra operator and avoiding unnecessary data transfer.





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Jin Xing (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090322#comment-17090322
 ] 

Jin Xing commented on CALCITE-3952:
---

Given a rel which emits at most 1 row (RelMetadataQuery.getRowCount<=1), should 
RelMetadataQuery.collations(rel) match the Sort order? If so, the Sort 
operator can be removed automatically.

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> If a query is guaranteed to produce at most one row, it is safe to remove the 
> Sort (along with the limit). 
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the plans are logically equivalent, this can greatly benefit physical 
> plans by removing an extra operator and avoiding unnecessary data transfer.





[jira] [Issue Comment Deleted] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Jin Xing (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jin Xing updated CALCITE-3952:
--
Comment: was deleted

(was: Given a rel which emits at most 1 row (RelMetadataQuery.getRowCount(rel) 
<=1), besides the optimization on Sort, it seems we can also optimize other 
operators like Aggregate.)

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> If a query is guaranteed to produce at most one row, it is safe to remove the 
> Sort (along with the limit). 
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the plans are logically equivalent, this can greatly benefit physical 
> plans by removing an extra operator and avoiding unnecessary data transfer.





[jira] [Commented] (CALCITE-3952) Improve SortRemoveRule to remove Sort based on rowcount

2020-04-23 Thread Jin Xing (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090309#comment-17090309
 ] 

Jin Xing commented on CALCITE-3952:
---

Given a rel which emits at most 1 row (RelMetadataQuery.getRowCount(rel) <=1), 
besides the optimization on Sort, it seems we can also optimize other operators 
like Aggregate.

> Improve SortRemoveRule to remove Sort based on rowcount
> ---
>
> Key: CALCITE-3952
> URL: https://issues.apache.org/jira/browse/CALCITE-3952
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> If a query is guaranteed to produce at most one row, it is safe to remove the 
> Sort (along with the limit). 
> Example:
> {code:sql}
> select count(*) cs from store_sales where ss_ext_sales_price > 100.00 order 
> by cs limit 100
> {code}
> Although the plans are logically equivalent, this can greatly benefit physical 
> plans by removing an extra operator and avoiding unnecessary data transfer.


