[jira] [Comment Edited] (CALCITE-2965) Implement string functions: REPEAT, SPACE, SOUNDEX, DIFFERENCE

2019-04-02 Thread Chunwei Lei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808387#comment-16808387
 ] 

Chunwei Lei edited comment on CALCITE-2965 at 4/3/19 5:51 AM:
--

[~julianhyde], I found that actually not all of these functions are Oracle 
functions as follows.
 * MySql[1]:soundex/space/repeat
 * PostgreSQL[2][3]:soundex/difference/repeat
 * Oracle[4]: soundex
 * SQL Server[5]:soundex/space/replicate(similar to repeat)

It looks like we can treat soundex as standard function. What do you think 
about it?

[1] https://dev.mysql.com/doc/refman/8.0/en/string-functions.html
 [2] https://www.postgre[5]sql.org/docs/9.1/fuzzystrmatch.html
[3] https://www.postgresql.org/docs/9.1/functions-string.html
 [4] https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions155.htm
 [5] 
https://docs.microsoft.com/en-us/sql/t-sql/functions/string-functions-transact-sql?view=sql-server-2017

 


was (Author: chunwei lei):
[~julianhyde], I found that actually not all of these functions are Oracle 
functions as follows.
 * MySql[1]:soundex/space/repeat
 * PostgreSQL[2][3]:soundex/difference/repeat
 * Oracle[4]: soundex
 * SQL Server[5]:soundex/space/replicate(similar to repeat)

It looks like we can treat soundex as standard function. What do you think 
about it?

[1]https://dev.mysql.com/doc/refman/8.0/en/string-functions.html
[2]https://www.postgresql.org/docs/9.1/fuzzystrmatch.html
[3]https://www.postgresql.org/docs/9.1/functions-string.html
[4]https://docs.oracle.com/cd/B28359_01/server.111/b28286/functions155.htm
[5]https://docs.microsoft.com/en-us/sql/t-sql/functions/string-functions-transact-sql?view=sql-server-2017

 

> Implement string functions: REPEAT, SPACE, SOUNDEX, DIFFERENCE
> --
>
> Key: CALCITE-2965
> URL: https://issues.apache.org/jira/browse/CALCITE-2965
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Chunwei Lei
>Assignee: Chunwei Lei
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Some string functions including REPEAT, SPACE, SOUNDEX, DIFFERENCE are not 
> implemented now. It would be great if these functions can be implemented.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CALCITE-2976) Improve materialized view rewriting coverage with disjunctive predicates

2019-04-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-2976:

Labels: pull-request-available  (was: )

> Improve materialized view rewriting coverage with disjunctive predicates
> 
>
> Key: CALCITE-2976
> URL: https://issues.apache.org/jira/browse/CALCITE-2976
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>
> For instance, in the following case:
> {code}
> @Test public void testJoinAggregateMaterializationAggregateFuncs14() {
>   checkMaterialize(
>   "select \"empid\", \"emps\".\"name\", \"emps\".\"deptno\", 
> \"depts\".\"name\", "
>   + "count(*) as c, sum(\"empid\") as s\n"
>   + "from \"emps\" join \"depts\" using (\"deptno\")\n"
>   + "where (\"depts\".\"name\" is not null and \"emps\".\"name\" = 
> 'a') or "
>   + "(\"depts\".\"name\" is not null and \"emps\".\"name\" = 'b')\n"
>   + "group by \"empid\", \"emps\".\"name\", \"depts\".\"name\", 
> \"emps\".\"deptno\"",
>   "select \"depts\".\"deptno\", sum(\"empid\") as s\n"
>   + "from \"emps\" join \"depts\" using (\"deptno\")\n"
>   + "where \"depts\".\"name\" is not null and \"emps\".\"name\" = 
> 'a'\n"
>   + "group by \"depts\".\"deptno\"",
>   HR_FKUK_MODEL,
>   CONTAINS_M0);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2976) Improve materialized view rewriting coverage with disjunctive predicates

2019-04-02 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808331#comment-16808331
 ] 

Jesus Camacho Rodriguez commented on CALCITE-2976:
--

PR also enables materialization tests that were previously ignored.

> Improve materialized view rewriting coverage with disjunctive predicates
> 
>
> Key: CALCITE-2976
> URL: https://issues.apache.org/jira/browse/CALCITE-2976
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> For instance, in the following case:
> {code}
> @Test public void testJoinAggregateMaterializationAggregateFuncs14() {
>   checkMaterialize(
>   "select \"empid\", \"emps\".\"name\", \"emps\".\"deptno\", 
> \"depts\".\"name\", "
>   + "count(*) as c, sum(\"empid\") as s\n"
>   + "from \"emps\" join \"depts\" using (\"deptno\")\n"
>   + "where (\"depts\".\"name\" is not null and \"emps\".\"name\" = 
> 'a') or "
>   + "(\"depts\".\"name\" is not null and \"emps\".\"name\" = 'b')\n"
>   + "group by \"empid\", \"emps\".\"name\", \"depts\".\"name\", 
> \"emps\".\"deptno\"",
>   "select \"depts\".\"deptno\", sum(\"empid\") as s\n"
>   + "from \"emps\" join \"depts\" using (\"deptno\")\n"
>   + "where \"depts\".\"name\" is not null and \"emps\".\"name\" = 
> 'a'\n"
>   + "group by \"depts\".\"deptno\"",
>   HR_FKUK_MODEL,
>   CONTAINS_M0);
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CALCITE-2976) Improve materialized view rewriting coverage with disjunctive predicates

2019-04-02 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created CALCITE-2976:


 Summary: Improve materialized view rewriting coverage with 
disjunctive predicates
 Key: CALCITE-2976
 URL: https://issues.apache.org/jira/browse/CALCITE-2976
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


For instance, in the following case:

{code}
@Test public void testJoinAggregateMaterializationAggregateFuncs14() {
  checkMaterialize(
  "select \"empid\", \"emps\".\"name\", \"emps\".\"deptno\", 
\"depts\".\"name\", "
  + "count(*) as c, sum(\"empid\") as s\n"
  + "from \"emps\" join \"depts\" using (\"deptno\")\n"
  + "where (\"depts\".\"name\" is not null and \"emps\".\"name\" = 'a') 
or "
  + "(\"depts\".\"name\" is not null and \"emps\".\"name\" = 'b')\n"
  + "group by \"empid\", \"emps\".\"name\", \"depts\".\"name\", 
\"emps\".\"deptno\"",
  "select \"depts\".\"deptno\", sum(\"empid\") as s\n"
  + "from \"emps\" join \"depts\" using (\"deptno\")\n"
  + "where \"depts\".\"name\" is not null and \"emps\".\"name\" = 'a'\n"
  + "group by \"depts\".\"deptno\"",
  HR_FKUK_MODEL,
  CONTAINS_M0);
}
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CALCITE-2973) Allow theta joins that has equi keys to be executed using a merge join or hash join algorithm

2019-04-02 Thread Lai Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lai Zhou updated CALCITE-2973:
--
Summary: Allow theta joins that has equi keys to be executed using a merge 
join or hash join algorithm  (was: Allow theta joins to be executed using a 
merge join algorithm)

> Allow theta joins that has equi keys to be executed using a merge join or 
> hash join algorithm
> -
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2973) Allow theta joins to be executed using a merge join algorithm

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808324#comment-16808324
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/3/19 3:51 AM:
---

[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators 
after re-desgined:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition*|EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
 EnumerableFilter
 or
 EnumerableMergeJoin
 (changed)+
 EnumerableFilter
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin
 (changed)
 or
 EnumerableHashJoin
 (new)|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 


was (Author: hhlai1990):
[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition*|EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
 EnumerableFilter
 or
 EnumerableMergeJoin
 (changed)+
EnumerableFilter
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin
 (changed)
 or
 EnumerableHashJoin
 (new)|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 

> Allow theta joins to be executed using a merge join algorithm
> -
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2973) Allow theta joins to be executed using a merge join algorithm

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808324#comment-16808324
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/3/19 3:51 AM:
---

[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators 
after re-desgined:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition*|EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
 EnumerableFilter
 or
 EnumerableMergeJoin
 (changed)+
 EnumerableFilter
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin
 (changed)
 or
 EnumerableHashJoin
 (new)|

If a join is non-inner and has  equi and non-equi  conditions meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 


was (Author: hhlai1990):
[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators 
after re-desgined:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition*|EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
 EnumerableFilter
 or
 EnumerableMergeJoin
 (changed)+
 EnumerableFilter
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin
 (changed)
 or
 EnumerableHashJoin
 (new)|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 

> Allow theta joins to be executed using a merge join algorithm
> -
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2973) Allow theta joins to be executed using a merge join algorithm

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808324#comment-16808324
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/3/19 3:46 AM:
---

[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition*|EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
 EnumerableFilter
 or
 EnumerableMergeJoin
 (changed)+
EnumerableFilter
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin
 (changed)
 or
 EnumerableHashJoin
 (new)|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 


was (Author: hhlai1990):
[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition* **  **  |EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
 EnumerableFilter
 or
 EnumerableMergeJoin
(changed)
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin
(changed)
 or
 EnumerableHashJoin
(new)|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 

> Allow theta joins to be executed using a merge join algorithm
> -
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2973) Allow theta joins to be executed using a merge join algorithm

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808324#comment-16808324
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/3/19 3:45 AM:
---

[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition** ** * |EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
EnumerableFilter
 or
 EnumerableMergeJoin(changed)
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin(changed)
 or
 EnumerableHashJoin(new)|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 


was (Author: hhlai1990):
[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators:

 
|| ||inner||non-inner||
|*only equi condition*|EnumerableJoin|EnumerableJoin|
|*only*  *non-equi  condition*** ** |EnumerableJoin|EnumerableThetaJoin|
|*mixed equi and non-equi  condition*|EnumerableJoin+EnumerableFilter
or
EnumerableMergeJoin(changed)
 
|EnumerableThetaJoin
or
 EnumerableMergeJoin
or
EnumerableHashJoin|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 

> Allow theta joins to be executed using a merge join algorithm
> -
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2973) Allow theta joins to be executed using a merge join algorithm

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808324#comment-16808324
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/3/19 3:45 AM:
---

[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition* **  **  |EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
 EnumerableFilter
 or
 EnumerableMergeJoin
(changed)
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin
(changed)
 or
 EnumerableHashJoin
(new)|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 


was (Author: hhlai1990):
[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators:

 
|| ||inner||non-inner ||
|*only equi condition*|EnumerableJoin|EnumerableJoin |
|*only*  *non-equi  condition** ** * |EnumerableJoin|EnumerableThetaJoin |
|*mixed equi and non-equi  condition*|EnumerableJoin+
EnumerableFilter
 or
 EnumerableMergeJoin(changed)
  |EnumerableThetaJoin
 or
 EnumerableMergeJoin(changed)
 or
 EnumerableHashJoin(new)|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 

> Allow theta joins to be executed using a merge join algorithm
> -
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2973) Allow theta joins to be executed using a merge join algorithm

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808324#comment-16808324
 ] 

Lai Zhou commented on CALCITE-2973:
---

[~julianhyde],[~zabetak] , good idea.

I just create a new rule for my application, to avoid changing the  
calcite-core.

I'll make a PR later to  allow theta joins to be executed using a merge join or 
hash join.

I draw a table to describe the relationship of join types and join operators:

 
|| ||inner||non-inner||
|*only equi condition*|EnumerableJoin|EnumerableJoin|
|*only*  *non-equi  condition*** ** |EnumerableJoin|EnumerableThetaJoin|
|*mixed equi and non-equi  condition*|EnumerableJoin+EnumerableFilter
or
EnumerableMergeJoin(changed)
 
|EnumerableThetaJoin
or
 EnumerableMergeJoin
or
EnumerableHashJoin|

If a join is non-inner and has  ** equi and non-equi  condition meanwhile, we 
have 3 choice to plan it.

Now  EnumerableThetaJoin  and EnumerableMergeJoin have a corresponding rule 
respectively, 

What do you think if I introduce a  new rule( EnumerableThetaHashJoinRule) to 
allow theta joins  to be executed using a hash join?

 

 

> Allow theta joins to be executed using a merge join algorithm
> -
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CALCITE-2892) Add the JSON_KEYS function

2019-04-02 Thread Hongze Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongze Zhang resolved CALCITE-2892.
---
Resolution: Fixed

Fixed in 
[80dc5bcb3efa9c72573c4cc5b02f512d34c20d13|https://gitbox.apache.org/repos/asf?p=calcite.git;a=commit;h=80dc5bcb3efa9c72573c4cc5b02f512d34c20d13].
 Thanks, [~x1q1j1]!

> Add the JSON_KEYS function
> --
>
> Key: CALCITE-2892
> URL: https://issues.apache.org/jira/browse/CALCITE-2892
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Forward Xu
>Assignee: Forward Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> [{{JSON_KEYS(_{{json_doc}}_[, 
> _{{path}}_])}}|https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-keys]
> Returns the keys from the top-level value of a JSON object as a JSON array, 
> or, if a _{{path}}_ argument is given, the top-level keys from the selected 
> path. Returns {{NULL}} if any argument is {{NULL}}, the _{{json_doc}}_ 
> argument is not an object, or _{{path}}_, if given, does not locate an 
> object. An error occurs if the _{{json_doc}}_argument is not a valid JSON 
> document or the _{{path}}_ argument is not a valid path expression or 
> contains a {{*}} or {{**}} wildcard.
> The result array is empty if the selected object is empty. If the top-level 
> value has nested subobjects, the return value does not include keys from 
> those subobjects.
> Example SQL:
> {code:java}
> // code placeholder
> SELECT JSON_KEYS(v) AS c1
> ,JSON_KEYS(v, 'lax $.a') AS c2
> ,JSON_KEYS(v, '$.b') AS c3
> ,JSON_KEYS(v, 'strict $.a[0]') AS c4
> ,JSON_KEYS(v, 'strict $.a[1]') AS c5
> FROM (VALUES ('{"a": [10, true],"b": {"c": 30}}')) AS t(v)
> LIMIT 10;
> {code}
> Result:
> ||c1||c2||c3||c4||c5||
> |["a", "b"]|[NULL]|["c"]|[NULL]|[NULL]|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CALCITE-2975) Add the JSON_REMOVE function

2019-04-02 Thread Forward Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forward Xu updated CALCITE-2975:

Issue Type: Sub-task  (was: New Feature)
Parent: CALCITE-2867

> Add the JSON_REMOVE function
> 
>
> Key: CALCITE-2975
> URL: https://issues.apache.org/jira/browse/CALCITE-2975
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Forward Xu
>Assignee: Forward Xu
>Priority: Major
>
> JSON_REMOVE(json_doc, path[, path] ...)
> Removes data from a JSON document and returns the result. Returns NULL if any 
> argument is NULL. An error occurs if the json_doc argument is not a valid 
> JSON document or any path argument is not a valid path expression or is $ or 
> contains a * or ** wildcard.
> The path arguments are evaluated left to right. The document produced by 
> evaluating one path becomes the new value against which the next path is 
> evaluated.
> It is not an error if the element to be removed does not exist in the 
> document; in that case, the path does not affect the document.
> JSON_REMOVE SQL:
> {code:java}
> SELECT JSON_REMOVE(v, '$[1]') AS c1
>  FROM (VALUES ('["a", ["b", "c"], "d"]')) AS t(v);
> {code}
> RESULT:
> ||c1||
> |["a", "d"]|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CALCITE-2975) Add the JSON_REMOVE function

2019-04-02 Thread Forward Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forward Xu reassigned CALCITE-2975:
---

Assignee: Forward Xu

> Add the JSON_REMOVE function
> 
>
> Key: CALCITE-2975
> URL: https://issues.apache.org/jira/browse/CALCITE-2975
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Forward Xu
>Assignee: Forward Xu
>Priority: Major
>
> JSON_REMOVE(json_doc, path[, path] ...)
> Removes data from a JSON document and returns the result. Returns NULL if any 
> argument is NULL. An error occurs if the json_doc argument is not a valid 
> JSON document or any path argument is not a valid path expression or is $ or 
> contains a * or ** wildcard.
> The path arguments are evaluated left to right. The document produced by 
> evaluating one path becomes the new value against which the next path is 
> evaluated.
> It is not an error if the element to be removed does not exist in the 
> document; in that case, the path does not affect the document.
> JSON_REMOVE SQL:
> {code:java}
> SELECT JSON_REMOVE(v, '$[1]') AS c1
>  FROM (VALUES ('["a", ["b", "c"], "d"]')) AS t(v);
> {code}
> RESULT:
> ||c1||
> |["a", "d"]|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CALCITE-2975) Add the JSON_REMOVE function

2019-04-02 Thread Forward Xu (JIRA)
Forward Xu created CALCITE-2975:
---

 Summary: Add the JSON_REMOVE function
 Key: CALCITE-2975
 URL: https://issues.apache.org/jira/browse/CALCITE-2975
 Project: Calcite
  Issue Type: New Feature
Reporter: Forward Xu


JSON_REMOVE(json_doc, path[, path] ...)

Removes data from a JSON document and returns the result. Returns NULL if any 
argument is NULL. An error occurs if the json_doc argument is not a valid JSON 
document or any path argument is not a valid path expression or is $ or 
contains a * or ** wildcard.

The path arguments are evaluated left to right. The document produced by 
evaluating one path becomes the new value against which the next path is 
evaluated.

It is not an error if the element to be removed does not exist in the document; 
in that case, the path does not affect the document.
JSON_REMOVE SQL:

{code:java}
SELECT JSON_REMOVE(v, '$[1]') AS c1
 FROM (VALUES ('["a", ["b", "c"], "d"]')) AS t(v);
{code}

RESULT:

||c1||
|["a", "d"]|






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2892) Add the JSON_KEYS function

2019-04-02 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808291#comment-16808291
 ] 

Hongze Zhang commented on CALCITE-2892:
---

Thank you for taking this, [~x1q1j1]!

So far the PR looks good to me, and I've added some minor changes, please take 
a look at the 
[branch|https://github.com/zhztheplayer/calcite/commit/939cd9417bab48ec1b571caefda4c15df31d43b8].

The minor changes list:
* reformatted reserved and non-reserved keyword list a bit;
* simplified some unparse method on JSON operators;
* simplified first several code lines from method SqlFunctions#jsonKeys.

> Add the JSON_KEYS function
> --
>
> Key: CALCITE-2892
> URL: https://issues.apache.org/jira/browse/CALCITE-2892
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Forward Xu
>Assignee: Forward Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> [{{JSON_KEYS(_{{json_doc}}_[, 
> _{{path}}_])}}|https://dev.mysql.com/doc/refman/5.7/en/json-search-functions.html#function_json-keys]
> Returns the keys from the top-level value of a JSON object as a JSON array, 
> or, if a _{{path}}_ argument is given, the top-level keys from the selected 
> path. Returns {{NULL}} if any argument is {{NULL}}, the _{{json_doc}}_ 
> argument is not an object, or _{{path}}_, if given, does not locate an 
> object. An error occurs if the _{{json_doc}}_argument is not a valid JSON 
> document or the _{{path}}_ argument is not a valid path expression or 
> contains a {{*}} or {{**}} wildcard.
> The result array is empty if the selected object is empty. If the top-level 
> value has nested subobjects, the return value does not include keys from 
> those subobjects.
> Example SQL:
> {code:java}
> // code placeholder
> SELECT JSON_KEYS(v) AS c1
> ,JSON_KEYS(v, 'lax $.a') AS c2
> ,JSON_KEYS(v, '$.b') AS c3
> ,JSON_KEYS(v, 'strict $.a[0]') AS c4
> ,JSON_KEYS(v, 'strict $.a[1]') AS c5
> FROM (VALUES ('{"a": [10, true],"b": {"c": 30}}')) AS t(v)
> LIMIT 10;
> {code}
> Result:
> ||c1||c2||c3||c4||c5||
> |["a", "b"]|[NULL]|["c"]|[NULL]|[NULL]|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2966) Problem with Code Generation

2019-04-02 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808289#comment-16808289
 ] 

Danny Chan commented on CALCITE-2966:
-

The commit is not that complete, i only see 2 files changed and the 

EnumerableRules.ENUMERABLE_MATCH_RECOGNIZE_RULE is gone. It seems that you 
really did much modifications, can you merge the commits into one ?

> Problem with Code Generation
> 
>
> Key: CALCITE-2966
> URL: https://issues.apache.org/jira/browse/CALCITE-2966
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Danny Chan
>Assignee: Danny Chan
>Priority: Major
> Fix For: 1.20.0
>
>
> From the mailing list:
> Hi all,
> I have some problems with the code generation from Linq4j which I'm unable to 
> resolve myself.
> Basically, I want to translate a condition from Rex to a Linq4j expression to 
> use it in generated code.
> In my example the Condition is from Match Recognize and in SQL is: 
> `up."commission" > prev(up."commission")`.
>  
> ```
> RexBuilder rexBuilder = new RexBuilder(implementor.getTypeFactory());
> RexProgramBuilder rexProgramBuilder = new 
> RexProgramBuilder(physType.getRowType(), rexBuilder);
>  
> rexProgramBuilder.addCondition(entry.getValue());
>  
> final Expression condition = 
> RexToLixTranslator.translateCondition(rexProgramBuilder.getProgram(),
>           (JavaTypeFactory) getCluster().getTypeFactory(),
>           builder2,
>           inputGetter1,
>           implementor.allCorrelateVariables,
>           implementor.getConformance());
>  
> builder2.add(Expressions.return_(null, condition));
> ```
>  
> Here, the condition seems okay, it is: ">(PREV(UP.$4, 0), PREV(UP.$4, 1))",  
> so it should be a comparison of two variables (I rewrite the PREV with a 
> custom Input Getter".
> But, the generated code (for Janino) is:
>  
> ```
> Object p1 = row_.get($L4J$C$0_1);
> org.apache.calcite.test.JdbcTest.Employee p0 = 
> (org.apache.calcite.test.JdbcTest.Employee) p1;
> Object p3 = row_.get($L4J$C$1_1);
> org.apache.calcite.test.JdbcTest.Employee p2 = 
> (org.apache.calcite.test.JdbcTest.Employee) p3;
> Object p5 = row_.get($L4J$C$0_1);
> org.apache.calcite.test.JdbcTest.Employee p4 = 
> (org.apache.calcite.test.JdbcTest.Employee) p5;
> Object p7 = row_.get($L4J$C$1_1);
> org.apache.calcite.test.JdbcTest.Employee p6 = 
> (org.apache.calcite.test.JdbcTest.Employee) p7;
> return p0.commission && p2.commission && p4.commission > p6.commission;
> ```
>  
> This confuses me a lot as I do not know where the check for p0.commission and 
> p2.commission comes from.
> It seems that Linq4j adds them as it expects these variables to be nullable, 
> but I have no idea on how to avoid this.
> These fields are Numeric so I always get a compilation exception.
>  
> Can someone help me with this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CALCITE-2973) Allow theta joins to be executed using a merge join algorithm

2019-04-02 Thread Lai Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lai Zhou updated CALCITE-2973:
--
Summary: Allow theta joins to be executed using a merge join algorithm  
(was: Make EnumerableMergeJoinRule to support a theta join)

> Allow theta joins to be executed using a merge join algorithm
> -
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2972) Upgrade jetty to 9.4.15.v20190215

2019-04-02 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808145#comment-16808145
 ] 

Kevin Risden commented on CALCITE-2972:
---

Well removing the deprecations is kicking my butt. Put a bunch of notes on the 
first PR.

I created a second PR that upgrades but just suppresses the deprecations - 
https://github.com/apache/calcite-avatica/pull/93

> Upgrade jetty to 9.4.15.v20190215
> -
>
> Key: CALCITE-2972
> URL: https://issues.apache.org/jira/browse/CALCITE-2972
> Project: Calcite
>  Issue Type: Improvement
>  Components: avatica
>Reporter: Kevin Risden
>Assignee: Kevin Risden
>Priority: Major
>  Labels: pull-request-available
> Fix For: avatica-1.14.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Avatica should be upgraded to the latest Jetty 9.4.15.v20190215



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CALCITE-2930) IllegalStateException when FilterCorrelateRule matches a SEMI or ANTI Correlate

2019-04-02 Thread Stamatis Zampetakis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved CALCITE-2930.
--
Resolution: Fixed

Fixed in 
[43ba147937dc53dfb86ee8723d45562e48b14d20|https://github.com/apache/calcite/commit/43ba147937dc53dfb86ee8723d45562e48b14d20].
 Thanks for the PR [~rubenql]!

> IllegalStateException when FilterCorrelateRule matches a SEMI or ANTI 
> Correlate
> ---
>
> Key: CALCITE-2930
> URL: https://issues.apache.org/jira/browse/CALCITE-2930
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Ruben Quesada Lopez
>Assignee: Ruben Quesada Lopez
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> FilterCorrelateRule#onMatch method executes the following:
> {code:java}
> public void onMatch(RelOptRuleCall call) {
> final Filter filter = call.rel(0);
> final Correlate corr = call.rel(1);
> ...
> RelOptUtil.classifyFilters(
> corr,
> aboveFilters,
> JoinRelType.INNER,
> false,
> !corr.getJoinType().toJoinType().generatesNullsOnLeft(), // *** HERE!
> !corr.getJoinType().toJoinType().generatesNullsOnRight(), // *** HERE!
> aboveFilters,
> leftFilters,
> rightFilters);
> ...
> {code}
> If the Correlate object has SemiJoinType SEMI (or ANTI), the 
> {{corr.getJoinType().toJoinType()}} will throw a IllegalStateException:
> {code}
> Caused by: java.lang.IllegalStateException: Unable to convert SEMI to 
> JoinRelType
>   at org.apache.calcite.sql.SemiJoinType.toJoinType(SemiJoinType.java:83)
>   at 
> org.apache.calcite.rel.rules.FilterCorrelateRule.onMatch(FilterCorrelateRule.java:89)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:205)
> {code} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CALCITE-2930) IllegalStateException when FilterCorrelateRule matches a SEMI or ANTI Correlate

2019-04-02 Thread Stamatis Zampetakis (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated CALCITE-2930:
-
Summary: IllegalStateException when FilterCorrelateRule matches a SEMI or 
ANTI Correlate  (was: FilterCorrelateRule on a Correlate with SemiJoinType SEMI 
(or ANTI) throws IllegalStateException)

> IllegalStateException when FilterCorrelateRule matches a SEMI or ANTI 
> Correlate
> ---
>
> Key: CALCITE-2930
> URL: https://issues.apache.org/jira/browse/CALCITE-2930
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.18.0, 1.19.0
>Reporter: Ruben Quesada Lopez
>Assignee: Ruben Quesada Lopez
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> FilterCorrelateRule#onMatch method executes the following:
> {code:java}
> public void onMatch(RelOptRuleCall call) {
> final Filter filter = call.rel(0);
> final Correlate corr = call.rel(1);
> ...
> RelOptUtil.classifyFilters(
> corr,
> aboveFilters,
> JoinRelType.INNER,
> false,
> !corr.getJoinType().toJoinType().generatesNullsOnLeft(), // *** HERE!
> !corr.getJoinType().toJoinType().generatesNullsOnRight(), // *** HERE!
> aboveFilters,
> leftFilters,
> rightFilters);
> ...
> {code}
> If the Correlate object has SemiJoinType SEMI (or ANTI), the 
> {{corr.getJoinType().toJoinType()}} will throw a IllegalStateException:
> {code}
> Caused by: java.lang.IllegalStateException: Unable to convert SEMI to 
> JoinRelType
>   at org.apache.calcite.sql.SemiJoinType.toJoinType(SemiJoinType.java:83)
>   at 
> org.apache.calcite.rel.rules.FilterCorrelateRule.onMatch(FilterCorrelateRule.java:89)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:205)
> {code} 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2844) Babel parser should parse PostgreSQL table functions

2019-04-02 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807878#comment-16807878
 ] 

Hongze Zhang edited comment on CALCITE-2844 at 4/2/19 3:50 PM:
---

I've run the two tests and reproduced the failures.

The first one {{testOuterApplyFunctionFails}} fails because the table function 
"ramp(deptno)" is now get successfully parsed, as it was previously not a legal 
syntax. 

Besides, the SQL used in the case is wired since it ends with two right 
brackets "{{))}}":

{code:sql}
select * from dept outer apply ramp(deptno^)^)
{code}

In general it should be 

{code:sql}
select * from dept outer apply ramp(deptno^)^
{code}

The second bracket looks like a typo. I think you can just remove the bracket 
and error anchors (the "^"). Then let the test case pass.

The case {{testTableExtend}} fails because of bad lookahead.

I see your LOOKAHEAD hint is set as 

{code}
LOOKAHEAD({ (getToken(1).kind == IDENTIFIER && getToken(2).kind == LPAREN) || 
(getToken(1).kind == SPECIFIC && getToken(2).kind == IDENTIFIER && 
getToken(3).kind == LPAREN) })
{code}

The hint is conflict with "extend clause" (see CALCITE-493). There should be 
multiple solutions to solve the conflict: using syntactic lookahead, or 
changing the expansion match order, etc. Regarding the parsing performance I 
would prefer to change the match order.



was (Author: zhztheplayer):
I've run the two tests and reproduced the failures.

The first one {{testOuterApplyFunctionFails}} fails because the table function 
"ramp(deptno)" is now get successfully parsed, as it was previously not a legal 
syntax. 

Besides, the SQL used in the case is wired since it ends with two right 
brackets "{{))}}":

{code:sql}
select * from dept outer apply ramp(deptno^)^)
{code}

In general it should be 

{code:sql}
select * from dept outer apply ramp(deptno^)^
{code}

The second bracket looks like a typo. I think you can just remove the bracket 
and error anchors (the "^"). Then let the test case pass.

The case {{testTableExtend}} fails because of bad lookahead.

I see your LOOKAHEAD hint is set as 

{code}
LOOKAHEAD({ (getToken(1).kind == IDENTIFIER && getToken(2).kind == LPAREN) || 
(getToken(1).kind == SPECIFIC && getToken(2).kind == IDENTIFIER && 
getToken(3).kind == LPAREN) })
{code}

The hint is conflict with "extend clause" (see CALCITE-493). There should be 
multiple solutions to solve the conflict (using syntactic lookahead, or 
changing the expansion match order, etc.)


> Babel parser should parse PostgreSQL table functions
> 
>
> Key: CALCITE-2844
> URL: https://issues.apache.org/jira/browse/CALCITE-2844
> Project: Calcite
>  Issue Type: Bug
>  Components: babel
>Affects Versions: 1.18.0
>Reporter: Muhammad Gelbana
>Priority: Major
>
> *Code to reproduce the problem*
> {code:java}
> public static void main(String[] args) throws Exception {
> Config parserConfig = 
> configBuilder().setConformance(SqlConformanceEnum.BABEL).setParserFactory(SqlBabelParserImpl.FACTORY).build();
> FrameworkConfig frameworkConfig = 
> Frameworks.newConfigBuilder().parserConfig(parserConfig).build();
> Planner planner = Frameworks.getPlanner(frameworkConfig);
> String pg = "select * from generate_series(1, 5)";
> planner.parse(pg);
> }
> {code}
>  
> *Thrown exception*
> {noformat}
> Exception in thread "main" org.apache.calcite.sql.parser.SqlParseException: 
> Encountered "( 1" at line 1, column 30.
> Was expecting one of:
>     
>     "ORDER" ...
>     "LIMIT" ...
>     "OFFSET" ...
>     "FETCH" ...
>     "NATURAL" ...
>     "JOIN" ...
>     "INNER" ...
>     "LEFT" ...
>     "RIGHT" ...
>     "FULL" ...
>     "CROSS" ...
>     "," ...
>     "OUTER" ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "MATCH_RECOGNIZE" ...
>     "." ...
>     "(" ...
>      ...
>      ...
>      ...
>      ...
>      ...
>     "TABLESAMPLE" ...
>     "WHERE" ...
>     "GROUP" ...
>     "HAVING" ...
>     "WINDOW" ...
>     "UNION" ...
>     "INTERSECT" ...
>     "EXCEPT" ...
>     "MINUS" ...
>     
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.convertException(SqlBabelParserImpl.java:354)
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.normalizeException(SqlBabelParserImpl.java:142)
>     at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:156)
>     at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:181)
>     at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.java:174)
>     at org.apache.calcite.tools.Planner.parse(Planner.java:50)
>     at com.lab.calcite.App2.main(App2.java:24)
> Caused by: org.apache.calcite.sql.parser.babel.ParseException: Encountered "( 
> 1" at line 1, column 30.
> 

[jira] [Closed] (CALCITE-2960) CalciteCatalogReader use a hard-coding config to get functions

2019-04-02 Thread Haisheng Yuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan closed CALCITE-2960.
--
Resolution: Duplicate

> CalciteCatalogReader use a hard-coding config to get functions
> --
>
> Key: CALCITE-2960
> URL: https://issues.apache.org/jira/browse/CALCITE-2960
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.18.0
>Reporter: Lai Zhou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CalciteCatalogReader use a hard-coding config to get functions
> {code:java}
> functions2.addAll(schema.getFunctions(name, true));
> {code}
> the right way is:
> {code:java}
> functions2.addAll(schema.getFunctions(name, config.caseSensitive()));
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2844) Babel parser should parse PostgreSQL table functions

2019-04-02 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807878#comment-16807878
 ] 

Hongze Zhang commented on CALCITE-2844:
---

I've run the two tests and reproduced the failures.

The first one {{testOuterApplyFunctionFails}} fails because the table function 
"ramp(deptno)" is now get successfully parsed, as it was previously not a legal 
syntax. 

Besides, the SQL used in the case is wired since it ends with two right 
brackets "{{))}}":

{code:sql}
select * from dept outer apply ramp(deptno^)^)
{code}

In general it should be 

{code:sql}
select * from dept outer apply ramp(deptno^)^
{code}

The second bracket looks like a typo. I think you can just remove the bracket 
and error anchors (the "^"). Then let the test case pass.

The case {{testTableExtend}} fails because of bad lookahead.

I see your LOOKAHEAD hint is set as 

{code}
LOOKAHEAD({ (getToken(1).kind == IDENTIFIER && getToken(2).kind == LPAREN) || 
(getToken(1).kind == SPECIFIC && getToken(2).kind == IDENTIFIER && 
getToken(3).kind == LPAREN) })
{code}

The hint is conflict with "extend clause" (see CALCITE-493). There should be 
multiple solutions to solve the conflict (using syntactic lookahead, or 
changing the expansion match order, etc. Maybe the later is better?)


> Babel parser should parse PostgreSQL table functions
> 
>
> Key: CALCITE-2844
> URL: https://issues.apache.org/jira/browse/CALCITE-2844
> Project: Calcite
>  Issue Type: Bug
>  Components: babel
>Affects Versions: 1.18.0
>Reporter: Muhammad Gelbana
>Priority: Major
>
> *Code to reproduce the problem*
> {code:java}
> public static void main(String[] args) throws Exception {
> Config parserConfig = 
> configBuilder().setConformance(SqlConformanceEnum.BABEL).setParserFactory(SqlBabelParserImpl.FACTORY).build();
> FrameworkConfig frameworkConfig = 
> Frameworks.newConfigBuilder().parserConfig(parserConfig).build();
> Planner planner = Frameworks.getPlanner(frameworkConfig);
> String pg = "select * from generate_series(1, 5)";
> planner.parse(pg);
> }
> {code}
>  
> *Thrown exception*
> {noformat}
> Exception in thread "main" org.apache.calcite.sql.parser.SqlParseException: 
> Encountered "( 1" at line 1, column 30.
> Was expecting one of:
>     
>     "ORDER" ...
>     "LIMIT" ...
>     "OFFSET" ...
>     "FETCH" ...
>     "NATURAL" ...
>     "JOIN" ...
>     "INNER" ...
>     "LEFT" ...
>     "RIGHT" ...
>     "FULL" ...
>     "CROSS" ...
>     "," ...
>     "OUTER" ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "MATCH_RECOGNIZE" ...
>     "." ...
>     "(" ...
>      ...
>      ...
>      ...
>      ...
>      ...
>     "TABLESAMPLE" ...
>     "WHERE" ...
>     "GROUP" ...
>     "HAVING" ...
>     "WINDOW" ...
>     "UNION" ...
>     "INTERSECT" ...
>     "EXCEPT" ...
>     "MINUS" ...
>     
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.convertException(SqlBabelParserImpl.java:354)
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.normalizeException(SqlBabelParserImpl.java:142)
>     at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:156)
>     at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:181)
>     at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.java:174)
>     at org.apache.calcite.tools.Planner.parse(Planner.java:50)
>     at com.lab.calcite.App2.main(App2.java:24)
> Caused by: org.apache.calcite.sql.parser.babel.ParseException: Encountered "( 
> 1" at line 1, column 30.
> Was expecting one of:
>     
>     "ORDER" ...
>     "LIMIT" ...
>     "OFFSET" ...
>     "FETCH" ...
>     "NATURAL" ...
>     "JOIN" ...
>     "INNER" ...
>     "LEFT" ...
>     "RIGHT" ...
>     "FULL" ...
>     "CROSS" ...
>     "," ...
>     "OUTER" ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "MATCH_RECOGNIZE" ...
>     "." ...
>     "(" ...
>      ...
>      ...
>      ...
>      ...
>      ...
>     "TABLESAMPLE" ...
>     "WHERE" ...
>     "GROUP" ...
>     "HAVING" ...
>     "WINDOW" ...
>     "UNION" ...
>     "INTERSECT" ...
>     "EXCEPT" ...
>     "MINUS" ...
>     
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.generateParseException(SqlBabelParserImpl.java:31191)
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.jj_consume_token(SqlBabelParserImpl.java:31008)
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.SqlStmtEof(SqlBabelParserImpl.java:877)
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.parseSqlStmtEof(SqlBabelParserImpl.java:198)
>     at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:148)
>     ... 4 more
> {noformat}



--
This message was sent by 

[jira] [Comment Edited] (CALCITE-2844) Babel parser should parse PostgreSQL table functions

2019-04-02 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807878#comment-16807878
 ] 

Hongze Zhang edited comment on CALCITE-2844 at 4/2/19 3:48 PM:
---

I've run the two tests and reproduced the failures.

The first one {{testOuterApplyFunctionFails}} fails because the table function 
"ramp(deptno)" is now get successfully parsed, as it was previously not a legal 
syntax. 

Besides, the SQL used in the case is wired since it ends with two right 
brackets "{{))}}":

{code:sql}
select * from dept outer apply ramp(deptno^)^)
{code}

In general it should be 

{code:sql}
select * from dept outer apply ramp(deptno^)^
{code}

The second bracket looks like a typo. I think you can just remove the bracket 
and error anchors (the "^"). Then let the test case pass.

The case {{testTableExtend}} fails because of bad lookahead.

I see your LOOKAHEAD hint is set as 

{code}
LOOKAHEAD({ (getToken(1).kind == IDENTIFIER && getToken(2).kind == LPAREN) || 
(getToken(1).kind == SPECIFIC && getToken(2).kind == IDENTIFIER && 
getToken(3).kind == LPAREN) })
{code}

The hint is conflict with "extend clause" (see CALCITE-493). There should be 
multiple solutions to solve the conflict (using syntactic lookahead, or 
changing the expansion match order, etc.)



was (Author: zhztheplayer):
I've run the two tests and reproduced the failures.

The first one {{testOuterApplyFunctionFails}} fails because the table function 
"ramp(deptno)" is now get successfully parsed, as it was previously not a legal 
syntax. 

Besides, the SQL used in the case is wired since it ends with two right 
brackets "{{))}}":

{code:sql}
select * from dept outer apply ramp(deptno^)^)
{code}

In general it should be 

{code:sql}
select * from dept outer apply ramp(deptno^)^
{code}

The second bracket looks like a typo. I think you can just remove the bracket 
and error anchors (the "^"). Then let the test case pass.

The case {{testTableExtend}} fails because of bad lookahead.

I see your LOOKAHEAD hint is set as 

{code}
LOOKAHEAD({ (getToken(1).kind == IDENTIFIER && getToken(2).kind == LPAREN) || 
(getToken(1).kind == SPECIFIC && getToken(2).kind == IDENTIFIER && 
getToken(3).kind == LPAREN) })
{code}

The hint is conflict with "extend clause" (see CALCITE-493). There should be 
multiple solutions to solve the conflict (using syntactic lookahead, or 
changing the expansion match order, etc. Maybe the later is better?)


> Babel parser should parse PostgreSQL table functions
> 
>
> Key: CALCITE-2844
> URL: https://issues.apache.org/jira/browse/CALCITE-2844
> Project: Calcite
>  Issue Type: Bug
>  Components: babel
>Affects Versions: 1.18.0
>Reporter: Muhammad Gelbana
>Priority: Major
>
> *Code to reproduce the problem*
> {code:java}
> public static void main(String[] args) throws Exception {
> Config parserConfig = 
> configBuilder().setConformance(SqlConformanceEnum.BABEL).setParserFactory(SqlBabelParserImpl.FACTORY).build();
> FrameworkConfig frameworkConfig = 
> Frameworks.newConfigBuilder().parserConfig(parserConfig).build();
> Planner planner = Frameworks.getPlanner(frameworkConfig);
> String pg = "select * from generate_series(1, 5)";
> planner.parse(pg);
> }
> {code}
>  
> *Thrown exception*
> {noformat}
> Exception in thread "main" org.apache.calcite.sql.parser.SqlParseException: 
> Encountered "( 1" at line 1, column 30.
> Was expecting one of:
>     
>     "ORDER" ...
>     "LIMIT" ...
>     "OFFSET" ...
>     "FETCH" ...
>     "NATURAL" ...
>     "JOIN" ...
>     "INNER" ...
>     "LEFT" ...
>     "RIGHT" ...
>     "FULL" ...
>     "CROSS" ...
>     "," ...
>     "OUTER" ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "("  ...
>     "MATCH_RECOGNIZE" ...
>     "." ...
>     "(" ...
>      ...
>      ...
>      ...
>      ...
>      ...
>     "TABLESAMPLE" ...
>     "WHERE" ...
>     "GROUP" ...
>     "HAVING" ...
>     "WINDOW" ...
>     "UNION" ...
>     "INTERSECT" ...
>     "EXCEPT" ...
>     "MINUS" ...
>     
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.convertException(SqlBabelParserImpl.java:354)
>     at 
> org.apache.calcite.sql.parser.babel.SqlBabelParserImpl.normalizeException(SqlBabelParserImpl.java:142)
>     at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:156)
>     at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:181)
>     at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.java:174)
>     at org.apache.calcite.tools.Planner.parse(Planner.java:50)
>     at com.lab.calcite.App2.main(App2.java:24)
> Caused by: org.apache.calcite.sql.parser.babel.ParseException: Encountered "( 
> 1" at line 1, column 30.
> Was expecting one of:
>     
>     "ORDER" ...
>   

[jira] [Resolved] (CALCITE-2947) Type mismatch assertion error when converting NOT IN subquery

2019-04-02 Thread Haisheng Yuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-2947.

   Resolution: Fixed
Fix Version/s: 1.20.0

Resolved by patch in https://issues.apache.org/jira/browse/CALCITE-2004

> Type mismatch assertion error when converting NOT IN subquery
> -
>
> Key: CALCITE-2947
> URL: https://issues.apache.org/jira/browse/CALCITE-2947
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>  Labels: sub-query
> Fix For: 1.20.0
>
>
> Repro:
> Add the following test to SqlToRelConverterTest.java.
> {code:java}
> @Test public void testSubQueryNotIN() {
> final String sql = "select deptno\n"
> + "from EMP e\n"
> + "where deptno not in (select deptno\n"
> + "from EMP where empno=e.empno)";
> sql(sql).ok();
>   }
> {code}
> Error:
> {code:java}
> java.lang.AssertionError: type mismatch:
> ref:
> INTEGER NOT NULL
> input:
> INTEGER
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1832)
>   at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125)
>   at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:57)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:140)
>   at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:57)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:191)
>   at org.apache.calcite.rel.core.Filter.isValid(Filter.java:120)
>   at 
> org.apache.calcite.rel.logical.LogicalFilter.(LogicalFilter.java:70)
>   at 
> org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:114)
>   at 
> org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:101)
>   at 
> org.apache.calcite.rel.core.RelFactories$FilterFactoryImpl.createFilter(RelFactories.java:300)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.createJoin(SqlToRelConverter.java:2433)
> {code}
> If we change not in subquery to in subquery, it can run without error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CALCITE-2893) Type mismatch assertion error when creating join condition

2019-04-02 Thread Haisheng Yuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-2893.

   Resolution: Fixed
Fix Version/s: 1.20.0

Resolved by patch in https://issues.apache.org/jira/browse/CALCITE-2004

> Type mismatch assertion error when creating join condition
> --
>
> Key: CALCITE-2893
> URL: https://issues.apache.org/jira/browse/CALCITE-2893
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
> Fix For: 1.20.0
>
>
> Repro:
> in TableFunctionTest.java, add the following test case:
> {code:java}
> @Test public void testValuesLateralJoin() {
> final String q1 = "select *\n"
> + "from (values 4) as t (c)\n"
> + "left join lateral table(\"s\".\"fibonacci2\"(c)) as R(n) on c=n";
>   with()
>   .with(CalciteConnectionProperty.CONFORMANCE,
>   SqlConformanceEnum.LENIENT)
>   .query(q1)
>   .returnsUnordered("C=4; N=null");
>   }
> {code}
> Error message:
> {code:java}
> java.lang.AssertionError: type mismatch:
> ref:
> BIGINT NOT NULL
> input:
> BIGINT
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1832)
>   at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:125)
>   at org.apache.calcite.rex.RexChecker.visitInputRef(RexChecker.java:57)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:140)
>   at org.apache.calcite.rex.RexChecker.visitCall(RexChecker.java:57)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:190)
>   at org.apache.calcite.rel.core.Filter.isValid(Filter.java:120)
>   at 
> org.apache.calcite.rel.logical.LogicalFilter.(LogicalFilter.java:70)
>   at 
> org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:114)
>   at 
> org.apache.calcite.rel.logical.LogicalFilter.create(LogicalFilter.java:101)
>   at 
> org.apache.calcite.rel.core.RelFactories$FilterFactoryImpl.createFilter(RelFactories.java:295)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.createJoin(SqlToRelConverter.java:2413)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertFrom(SqlToRelConverter.java:2065)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:644)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:625)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:3080)
>   at 
> org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:561)
> {code}
> Even literal is definitely not null, we should still be able to compare the 
> two.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CALCITE-2004) The TableFunction left outer join was improperly translated

2019-04-02 Thread Haisheng Yuan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-2004.

   Resolution: Fixed
Fix Version/s: 1.20.0

> The TableFunction left outer join was improperly translated
> ---
>
> Key: CALCITE-2004
> URL: https://issues.apache.org/jira/browse/CALCITE-2004
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xingcan Cui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently the TableFunction left outer join is translated to a 
> {{LogicalCorrelate}} followed by a {{LogicalFilter}}. For instance, given a 
> simple table {{WordCount(word:String, frequency:Int)}}, a table function 
> {{split: word:String => (letter:String, length:String)}}, and a query 
> {{SELECT word, letter, length FROM WordCount LEFT JOIN LATERAL 
> TABLE(split(word)) AS T (letter, length) ON frequency = length OR length < 
> 5"}} the query will be translated to the logical plan below.
> {code:java}
> LogicalProject(word=[$0], name=[$2], length=[$3])
>   LogicalFilter(condition=[OR(=($1, CAST($3):BIGINT), <($3, 5))])
> LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[WordCount]])
>   LogicalTableFunctionScan(invocation=[split($cor0.word)], 
> rowType=[RecordType(VARCHAR(65536) _1, INTEGER _2)], elementType=[class 
> [Ljava.lang.Object;])
> {code}
> With this plan, an outer row might be completely filtered out (instead of 
> being padded with null) if the join predicate filter filters out all join 
> results produced by a correlate join for the outer row. Maybe we should 
> attach the filter to the correlate operation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2004) The TableFunction left outer join was improperly translated

2019-04-02 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807871#comment-16807871
 ] 

Haisheng Yuan commented on CALCITE-2004:


Pushed to master, thanks for the review, [~julianhyde].

> The TableFunction left outer join was improperly translated
> ---
>
> Key: CALCITE-2004
> URL: https://issues.apache.org/jira/browse/CALCITE-2004
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xingcan Cui
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently the TableFunction left outer join is translated to a 
> {{LogicalCorrelate}} followed by a {{LogicalFilter}}. For instance, given a 
> simple table {{WordCount(word:String, frequency:Int)}}, a table function 
> {{split: word:String => (letter:String, length:String)}}, and a query 
> {{SELECT word, letter, length FROM WordCount LEFT JOIN LATERAL 
> TABLE(split(word)) AS T (letter, length) ON frequency = length OR length < 
> 5"}} the query will be translated to the logical plan below.
> {code:java}
> LogicalProject(word=[$0], name=[$2], length=[$3])
>   LogicalFilter(condition=[OR(=($1, CAST($3):BIGINT), <($3, 5))])
> LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[WordCount]])
>   LogicalTableFunctionScan(invocation=[split($cor0.word)], 
> rowType=[RecordType(VARCHAR(65536) _1, INTEGER _2)], elementType=[class 
> [Ljava.lang.Object;])
> {code}
> With this plan, an outer row might be completely filtered out (instead of 
> being padded with null) if the join predicate filter filters out all join 
> results produced by a correlate join for the outer row. Maybe we should 
> attach the filter to the correlate operation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CALCITE-2884) Add the JSON_INSERT,JSON_REPLACE,JSON_SET function

2019-04-02 Thread Forward Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forward Xu updated CALCITE-2884:

Description: 
{{JSON_INSERT(jsondoc}}, _{{path}}_, _{{val}}_[, path, val] )

{{JSON_REPLACE(jsondoc}}, _{{path}}_, _{{val}}_[, path, val] )

{{JSON_SET(jsondoc}}, _{{path}}_, _{{val}}_[, path, val] )

Inserts data into a JSON document and returns the result. Returns {{NULL}} if 
any argument is {{NULL}}. An error occurs if the _{{json_doc}}_ argument is not 
a valid JSON document or any _{{path}}_ argument is not a valid path expression 
or contains a  *or {{**}} wildcard.

The path-value pairs are evaluated left to right. The document produced by 
evaluating one pair becomes the new value against which the next pair is 
evaluated.

A path-value pair for an existing path in the document is ignored and does not 
overwrite the existing document value. A path-value pair for a nonexisting path 
in the document adds the value to the document if the path identifies one of 
these types of values:
 * A member not present in an existing object. The member is added to the 
object and associated with the new value.

 * A position past the end of an existing array. The array is extended with the 
new value. If the existing value is not an array, it is autowrapped as an 
array, then extended with the new value.

Otherwise, a path-value pair for a nonexisting path in the document is ignored 
and has no effect.

For a comparison of 
[{{JSON_INSERT()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-insert],
 
[{{JSON_REPLACE()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-replace],
 and 
[{{JSON_SET()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-set],
 see the discussion of 
[{{JSON_SET()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-set].

JSON_INSERT SQL:
{code:java}
SELECT JSON_INSERT(v, '$.a', 10, '$.c', '[true, false]') AS c1
 FROM (VALUES ('{ "a": 1, "b": [2, 3]}')) AS t(v);{code}
Result:
||c1||
|{"a": 1, "b": [2, 3], "c": "[true, false]"}|

JSON_REPLACE SQL:
{code:java}
SELECT JSON_REPLACE(v, '$.a', 10, '$.c', '[true, false]') AS c1
 FROM (VALUES ('{ "a": 1, "b": [2, 3]}')) AS t(v);{code}
Result:
||c1||
|{"a": 10, "b": [2, 3],}|

JSON_SET SQL:
{code:java}
SELECT JSON_INSERT(v, '$.a', 10, '$.c', '[true, false]') AS c1
 FROM (VALUES ('{ "a": 1, "b": [2, 3]}')) AS t(v);{code}
Result:
||c1||
|{"a": 10, "b": [2, 3], "c": "[true, false]"}|

 

  was:
{{JSON_INSERT(jsondoc}}, _{{path}}_, _{{val}}_[, path, val] )

{{JSON_REPLACE(jsondoc}}, _{{path}}_, _{{val}}_[, path, val] )

{{JSON_SET(jsondoc}}, _{{path}}_, _{{val}}_[, path, val] )

Inserts data into a JSON document and returns the result. Returns {{NULL}} if 
any argument is {{NULL}}. An error occurs if the _{{json_doc}}_ argument is not 
a valid JSON document or any _{{path}}_ argument is not a valid path expression 
or contains a  *or {{**}} wildcard.

The path-value pairs are evaluated left to right. The document produced by 
evaluating one pair becomes the new value against which the next pair is 
evaluated.

A path-value pair for an existing path in the document is ignored and does not 
overwrite the existing document value. A path-value pair for a nonexisting path 
in the document adds the value to the document if the path identifies one of 
these types of values:
 * A member not present in an existing object. The member is added to the 
object and associated with the new value.

 * A position past the end of an existing array. The array is extended with the 
new value. If the existing value is not an array, it is autowrapped as an 
array, then extended with the new value.

Otherwise, a path-value pair for a nonexisting path in the document is ignored 
and has no effect.

For a comparison of 
[{{JSON_INSERT()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-insert],
 
[{{JSON_REPLACE()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-replace],
 and 
[{{JSON_SET()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-set],
 see the discussion of 
[{{JSON_SET()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-set].

Example SQL:
{code:java}
SELECT JSON_INSERT(v, '$.a', 10, '$.c', '[true, false]') AS c1
 FROM (VALUES ('{ "a": 1, "b": [2, 3]}')) AS t(v);{code}
Result:
||c1||
|{"a": 1, "b": [2, 3], "c": "[true, false]"}|

 


> Add the JSON_INSERT,JSON_REPLACE,JSON_SET function
> --
>
> Key: CALCITE-2884
> URL: https://issues.apache.org/jira/browse/CALCITE-2884
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Forward Xu

[jira] [Updated] (CALCITE-2884) Add the JSON_INSERT,JSON_REPLACE,JSON_SET function

2019-04-02 Thread Forward Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forward Xu updated CALCITE-2884:

Summary: Add the JSON_INSERT,JSON_REPLACE,JSON_SET function  (was: Add the 
JSON_INSERT function)

> Add the JSON_INSERT,JSON_REPLACE,JSON_SET function
> --
>
> Key: CALCITE-2884
> URL: https://issues.apache.org/jira/browse/CALCITE-2884
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Forward Xu
>Assignee: Forward Xu
>Priority: Major
>
> {{JSON_INSERT(jsondoc}}, _{{path}}_, _{{val}}_[, path, val] )
> Inserts data into a JSON document and returns the result. Returns {{NULL}} if 
> any argument is {{NULL}}. An error occurs if the _{{json_doc}}_ argument is 
> not a valid JSON document or any _{{path}}_ argument is not a valid path 
> expression or contains a  *or {{**}} wildcard.
> The path-value pairs are evaluated left to right. The document produced by 
> evaluating one pair becomes the new value against which the next pair is 
> evaluated.
> A path-value pair for an existing path in the document is ignored and does 
> not overwrite the existing document value. A path-value pair for a 
> nonexisting path in the document adds the value to the document if the path 
> identifies one of these types of values:
>  * A member not present in an existing object. The member is added to the 
> object and associated with the new value.
>  * A position past the end of an existing array. The array is extended with 
> the new value. If the existing value is not an array, it is autowrapped as an 
> array, then extended with the new value.
> Otherwise, a path-value pair for a nonexisting path in the document is 
> ignored and has no effect.
> For a comparison of 
> [{{JSON_INSERT()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-insert],
>  
> [{{JSON_REPLACE()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-replace],
>  and 
> [{{JSON_SET()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-set],
>  see the discussion of 
> [{{JSON_SET()}}|https://dev.mysql.com/doc/refman/5.7/en/json-modification-functions.html#function_json-set].
> Example SQL:
> {code:java}
> SELECT JSON_INSERT(v, '$.a', 10, '$.c', '[true, false]') AS c1
>  FROM (VALUES ('{ "a": 1, "b": [2, 3]}')) AS t(v);{code}
> Result:
>  
> ||c1||
> |{"a": 1, "b": [2, 3], "c": "[true, false]"}|
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2966) Problem with Code Generation

2019-04-02 Thread Julian Feinauer (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807828#comment-16807828
 ] 

Julian Feinauer commented on CALCITE-2966:
--

Hey [~danny0405]... the full code is in my fork for this commit: 
https://github.com/JulianFeinauer/calcite/tree/ae8704ab44632ae4bbe51cafef2251920955b183
The bug appears when the "testMatch" Method in the JdbcTest class is run.
This is the fastest and easiest way to reproduce it, is this okay for you?

> Problem with Code Generation
> 
>
> Key: CALCITE-2966
> URL: https://issues.apache.org/jira/browse/CALCITE-2966
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Danny Chan
>Assignee: Danny Chan
>Priority: Major
> Fix For: 1.20.0
>
>
> From the mailing list:
> Hi all,
> I have some problems with the code generation from Linq4j which I'm unable to 
> resolve myself.
> Basically, I want to translate a condition from Rex to a Linq4j expression to 
> use it in generated code.
> In my example the Condition is from Match Recognize and in SQL is: 
> `up."commission" > prev(up."commission")`.
>  
> ```
> RexBuilder rexBuilder = new RexBuilder(implementor.getTypeFactory());
> RexProgramBuilder rexProgramBuilder = new 
> RexProgramBuilder(physType.getRowType(), rexBuilder);
>  
> rexProgramBuilder.addCondition(entry.getValue());
>  
> final Expression condition = 
> RexToLixTranslator.translateCondition(rexProgramBuilder.getProgram(),
>           (JavaTypeFactory) getCluster().getTypeFactory(),
>           builder2,
>           inputGetter1,
>           implementor.allCorrelateVariables,
>           implementor.getConformance());
>  
> builder2.add(Expressions.return_(null, condition));
> ```
>  
> Here, the condition seems okay, it is: ">(PREV(UP.$4, 0), PREV(UP.$4, 1))",  
> so it should be a comparison of two variables (I rewrite the PREV with a 
> custom Input Getter".
> But, the generated code (for Janino) is:
>  
> ```
> Object p1 = row_.get($L4J$C$0_1);
> org.apache.calcite.test.JdbcTest.Employee p0 = 
> (org.apache.calcite.test.JdbcTest.Employee) p1;
> Object p3 = row_.get($L4J$C$1_1);
> org.apache.calcite.test.JdbcTest.Employee p2 = 
> (org.apache.calcite.test.JdbcTest.Employee) p3;
> Object p5 = row_.get($L4J$C$0_1);
> org.apache.calcite.test.JdbcTest.Employee p4 = 
> (org.apache.calcite.test.JdbcTest.Employee) p5;
> Object p7 = row_.get($L4J$C$1_1);
> org.apache.calcite.test.JdbcTest.Employee p6 = 
> (org.apache.calcite.test.JdbcTest.Employee) p7;
> return p0.commission && p2.commission && p4.commission > p6.commission;
> ```
>  
> This confuses me a lot as I do not know where the check for p0.commission and 
> p2.commission comes from.
> It seems that Linq4j adds them as it expects these variables to be nullable, 
> but I have no idea on how to avoid this.
> These fields are Numeric so I always get a compilation exception.
>  
> Can someone help me with this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2973) Make EnumerableMergeJoinRule to support a theta join

2019-04-02 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807826#comment-16807826
 ] 

Stamatis Zampetakis commented on CALCITE-2973:
--

Thanks for working on this [~hhlai1990] ! It seems that it is not only a rule 
issue but you also plan to change the merge join algorithm implementation. I 
think the summary is a bit misleading. How about changing it to "Allow theta 
joins to be executed using a merge join algorithm" or something similar?

Apart from that, I think it would be better to generalize EnumerableMergeJoin 
operator than create a new one.

> Make EnumerableMergeJoinRule to support a theta join
> 
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2967) New rule to remove SemiJoin

2019-04-02 Thread Ruben Quesada Lopez (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807728#comment-16807728
 ] 

Ruben Quesada Lopez commented on CALCITE-2967:
--

In my case, the SemiJoin node can be created from a Join via SemiJoinRule, or 
directly by using RelBuilder#semiJoin method.

> New rule to remove SemiJoin
> ---
>
> Key: CALCITE-2967
> URL: https://issues.apache.org/jira/browse/CALCITE-2967
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Ruben Quesada Lopez
>Assignee: Ruben Quesada Lopez
>Priority: Major
>
> (As far as I know, there is no rule to achieve what I am about to describe, 
> if there exists already a way to do it, please let me know).
> In some specific situations, a SemiJoin can be completely removed and 
> replaced by its left child with an appropriate transformed filter.
> Let us say we want to retrieve all employees whose department satisfy a 
> certain condition, i.e.:
> {code}
> SELECT * from Employee e
> WHERE e.deptno IN
>   (SELECT d.deptno FROM Department d
> WHERE )
> {code}
> Which would translate to something like:
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter()
> Scan (table=Department as d)
> {code}
> In a "normal" scenario, e.g. "all employees from Sales department", the plan 
> could no be simplified:
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter(d.name="Sales")
> Scan (table=Department as d)
> {code}
> But with a specific condition, based on deptno, e.g. "all employees whose 
> deptno is greater than 10":
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter(d.deptno>10)
> Scan (table=Department as d)
> {code}
> The plan could be simplified: the SemiJoin is not actually needed, we can 
> perform the query with a single scan and a converted filter:
> {code}
> Filter(e.deptno>10)
> Scan (table=Employee as e)
> {code}
> The goal would be to provide a new rule to achieve that (since there is 
> already a SemiJoinRemoveRule, we could name this new rule as e.g. 
> SemiJoinSimplifyRule?).
> I know that, ideally, this rule should not be needed because the plan could 
> be directly written without the SemiJoin, but let's say that we are in a 
> situation were the plan is systematically generated with the same pattern, 
> and there is no way to know in advanced the filter condition that will be 
> used within.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CALCITE-2967) New rule to remove SemiJoin

2019-04-02 Thread Danny Chan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chan reassigned CALCITE-2967:
---

Assignee: Ruben Quesada Lopez  (was: Danny Chan)

> New rule to remove SemiJoin
> ---
>
> Key: CALCITE-2967
> URL: https://issues.apache.org/jira/browse/CALCITE-2967
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Ruben Quesada Lopez
>Assignee: Ruben Quesada Lopez
>Priority: Major
>
> (As far as I know, there is no rule to achieve what I am about to describe, 
> if there exists already a way to do it, please let me know).
> In some specific situations, a SemiJoin can be completely removed and 
> replaced by its left child with an appropriate transformed filter.
> Let us say we want to retrieve all employees whose department satisfy a 
> certain condition, i.e.:
> {code}
> SELECT * from Employee e
> WHERE e.deptno IN
>   (SELECT d.deptno FROM Department d
> WHERE )
> {code}
> Which would translate to something like:
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter()
> Scan (table=Department as d)
> {code}
> In a "normal" scenario, e.g. "all employees from Sales department", the plan 
> could no be simplified:
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter(d.name="Sales")
> Scan (table=Department as d)
> {code}
> But with a specific condition, based on deptno, e.g. "all employees whose 
> deptno is greater than 10":
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter(d.deptno>10)
> Scan (table=Department as d)
> {code}
> The plan could be simplified: the SemiJoin is not actually needed, we can 
> perform the query with a single scan and a converted filter:
> {code}
> Filter(e.deptno>10)
> Scan (table=Employee as e)
> {code}
> The goal would be to provide a new rule to achieve that (since there is 
> already a SemiJoinRemoveRule, we could name this new rule as e.g. 
> SemiJoinSimplifyRule?).
> I know that, ideally, this rule should not be needed because the plan could 
> be directly written without the SemiJoin, but let's say that we are in a 
> situation were the plan is systematically generated with the same pattern, 
> and there is no way to know in advanced the filter condition that will be 
> used within.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CALCITE-2967) New rule to remove SemiJoin

2019-04-02 Thread Danny Chan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chan reassigned CALCITE-2967:
---

Assignee: Danny Chan  (was: Ruben Quesada Lopez)

> New rule to remove SemiJoin
> ---
>
> Key: CALCITE-2967
> URL: https://issues.apache.org/jira/browse/CALCITE-2967
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Ruben Quesada Lopez
>Assignee: Danny Chan
>Priority: Major
>
> (As far as I know, there is no rule to achieve what I am about to describe, 
> if there exists already a way to do it, please let me know).
> In some specific situations, a SemiJoin can be completely removed and 
> replaced by its left child with an appropriate transformed filter.
> Let us say we want to retrieve all employees whose department satisfy a 
> certain condition, i.e.:
> {code}
> SELECT * from Employee e
> WHERE e.deptno IN
>   (SELECT d.deptno FROM Department d
> WHERE )
> {code}
> Which would translate to something like:
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter()
> Scan (table=Department as d)
> {code}
> In a "normal" scenario, e.g. "all employees from Sales department", the plan 
> could no be simplified:
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter(d.name="Sales")
> Scan (table=Department as d)
> {code}
> But with a specific condition, based on deptno, e.g. "all employees whose 
> deptno is greater than 10":
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter(d.deptno>10)
> Scan (table=Department as d)
> {code}
> The plan could be simplified: the SemiJoin is not actually needed, we can 
> perform the query with a single scan and a converted filter:
> {code}
> Filter(e.deptno>10)
> Scan (table=Employee as e)
> {code}
> The goal would be to provide a new rule to achieve that (since there is 
> already a SemiJoinRemoveRule, we could name this new rule as e.g. 
> SemiJoinSimplifyRule?).
> I know that, ideally, this rule should not be needed because the plan could 
> be directly written without the SemiJoin, but let's say that we are in a 
> situation were the plan is systematically generated with the same pattern, 
> and there is no way to know in advanced the filter condition that will be 
> used within.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2967) New rule to remove SemiJoin

2019-04-02 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807684#comment-16807684
 ] 

Danny Chan commented on CALCITE-2967:
-

What i'm curious about is after decorrelation, an in clause will be re-written 
to a inner-join but not a semi-join,

so how was your SemiJoin node created ?

> New rule to remove SemiJoin
> ---
>
> Key: CALCITE-2967
> URL: https://issues.apache.org/jira/browse/CALCITE-2967
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Ruben Quesada Lopez
>Assignee: Ruben Quesada Lopez
>Priority: Major
>
> (As far as I know, there is no rule to achieve what I am about to describe, 
> if there exists already a way to do it, please let me know).
> In some specific situations, a SemiJoin can be completely removed and 
> replaced by its left child with an appropriate transformed filter.
> Let us say we want to retrieve all employees whose department satisfy a 
> certain condition, i.e.:
> {code}
> SELECT * from Employee e
> WHERE e.deptno IN
>   (SELECT d.deptno FROM Department d
> WHERE )
> {code}
> Which would translate to something like:
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter()
> Scan (table=Department as d)
> {code}
> In a "normal" scenario, e.g. "all employees from Sales department", the plan 
> could no be simplified:
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter(d.name="Sales")
> Scan (table=Department as d)
> {code}
> But with a specific condition, based on deptno, e.g. "all employees whose 
> deptno is greater than 10":
> {code}
> SemiJoin (e.deptno=d.deptno)
> Scan (table=Employee as e)
> Filter(d.deptno>10)
> Scan (table=Department as d)
> {code}
> The plan could be simplified: the SemiJoin is not actually needed, we can 
> perform the query with a single scan and a converted filter:
> {code}
> Filter(e.deptno>10)
> Scan (table=Employee as e)
> {code}
> The goal would be to provide a new rule to achieve that (since there is 
> already a SemiJoinRemoveRule, we could name this new rule as e.g. 
> SemiJoinSimplifyRule?).
> I know that, ideally, this rule should not be needed because the plan could 
> be directly written without the SemiJoin, but let's say that we are in a 
> situation were the plan is systematically generated with the same pattern, 
> and there is no way to know in advanced the filter condition that will be 
> used within.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2973) Make EnumerableMergeJoinRule to support a theta join

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807393#comment-16807393
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/2/19 9:28 AM:
---

[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will be converted an EnumerableThetaMergeJoin .
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will be applied on the cartesians result  of a merge join.

see 
[https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298]

 I do some changes:
{code:java}
public TResult current() {
  final List list = cartesians.current();
  @SuppressWarnings("unchecked") final TSource left =
  (TSource) list.get(0);
  @SuppressWarnings("unchecked") final TInner right =
  (TInner) list.get(1);
  //apply predicate for the result in cartesians
  boolean isNonEquiPredicateSatisfied=predicate.apply(left, right);
  if (!isNonEquiPredicateSatisfied) {
if (generateNullsOnLeft) {
  return resultSelector.apply(null, right);
}
if (generateNullsOnRight) {
  return resultSelector.apply(left, null);
}
  }
  return resultSelector.apply(left, right);
}
{code}
 

 


was (Author: hhlai1990):
[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will convert the join rel to a 
EnumerableThetaMergeJoin .
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will be applied on the cartesians result  of a merge join.

see 
[https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298]

 I do some changes:
{code:java}
public TResult current() {
  final List list = 

[jira] [Comment Edited] (CALCITE-2973) Make EnumerableMergeJoinRule to support a theta join

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807393#comment-16807393
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/2/19 9:28 AM:
---

[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will be converted to an EnumerableThetaMergeJoin 
.
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will be applied on the cartesians result  of a merge join.

see 
[https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298]

 I do some changes:
{code:java}
public TResult current() {
  final List list = cartesians.current();
  @SuppressWarnings("unchecked") final TSource left =
  (TSource) list.get(0);
  @SuppressWarnings("unchecked") final TInner right =
  (TInner) list.get(1);
  //apply predicate for the result in cartesians
  boolean isNonEquiPredicateSatisfied=predicate.apply(left, right);
  if (!isNonEquiPredicateSatisfied) {
if (generateNullsOnLeft) {
  return resultSelector.apply(null, right);
}
if (generateNullsOnRight) {
  return resultSelector.apply(left, null);
}
  }
  return resultSelector.apply(left, right);
}
{code}
 

 


was (Author: hhlai1990):
[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will be converted an EnumerableThetaMergeJoin .
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will be applied on the cartesians result  of a merge join.

see 
[https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298]

 I do some changes:
{code:java}
public TResult current() {
  final List list = cartesians.current();
 

[jira] [Comment Edited] (CALCITE-2973) Make EnumerableMergeJoinRule to support a theta join

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807393#comment-16807393
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/2/19 9:24 AM:
---

[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will convert the join rel to a 
EnumerableThetaMergeJoin .
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will be applied on the cartesians result  of a merge join.

see 
[https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298]

 
{code:java}
public TResult current() {
  final List list = cartesians.current();
  @SuppressWarnings("unchecked") final TSource left =
  (TSource) list.get(0);
  @SuppressWarnings("unchecked") final TInner right =
  (TInner) list.get(1);
  //apply predicate for the result in cartesians
  boolean isNonEquiPredicateSatisfied=predicate.apply(left, right);
  if (!isNonEquiPredicateSatisfied) {
if (generateNullsOnLeft) {
  return resultSelector.apply(null, right);
}
if (generateNullsOnRight) {
  return resultSelector.apply(left, null);
}
  }
  return resultSelector.apply(left, right);
}
{code}
 

 


was (Author: hhlai1990):
[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will convert the join rel to a 
EnumerableThetaMergeJoin .
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will apply on the cartesians result  of a merge join.

see 

[jira] [Comment Edited] (CALCITE-2973) Make EnumerableMergeJoinRule to support a theta join

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807393#comment-16807393
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/2/19 9:24 AM:
---

[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will convert the join rel to a 
EnumerableThetaMergeJoin .
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will be applied on the cartesians result  of a merge join.

see 
[https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298]

 I do some changes:
{code:java}
public TResult current() {
  final List list = cartesians.current();
  @SuppressWarnings("unchecked") final TSource left =
  (TSource) list.get(0);
  @SuppressWarnings("unchecked") final TInner right =
  (TInner) list.get(1);
  //apply predicate for the result in cartesians
  boolean isNonEquiPredicateSatisfied=predicate.apply(left, right);
  if (!isNonEquiPredicateSatisfied) {
if (generateNullsOnLeft) {
  return resultSelector.apply(null, right);
}
if (generateNullsOnRight) {
  return resultSelector.apply(left, null);
}
  }
  return resultSelector.apply(left, right);
}
{code}
 

 


was (Author: hhlai1990):
[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will convert the join rel to a 
EnumerableThetaMergeJoin .
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will be applied on the cartesians result  of a merge join.

see 
[https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298]

 
{code:java}
public TResult current() {
  final List list = cartesians.current();
 

[jira] [Comment Edited] (CALCITE-2973) Make EnumerableMergeJoinRule to support a theta join

2019-04-02 Thread Lai Zhou (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807393#comment-16807393
 ] 

Lai Zhou edited comment on CALCITE-2973 at 4/2/19 9:22 AM:
---

[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 I have a try to replace the default ENUMERABLE_JOIN_RULE by a customized rule:
{code:java}
final JoinInfo info = JoinInfo.of(left, right, join.getCondition());
if (!info.isEqui() && join.getJoinType() != JoinRelType.INNER) {
  // EnumerableJoinRel only supports equi-join. We can put a filter on top
  // if it is an inner join.
  try {
boolean hasEquiKeys = !info.leftKeys.isEmpty()
&& !info.rightKeys.isEmpty();
if (hasEquiKeys) {
  return convertToThetaMergeJoin(rel);
} else {
  return new EnumerableThetaJoin(cluster, traitSet, left, right,
  join.getCondition(), join.getVariablesSet(), join.getJoinType());
}
  } catch (Exception e) {
EnumerableRules.LOGGER.debug(e.toString());
return null;
  }
}
{code}
 if the join has equi-keys, it will convert the join rel to a 
EnumerableThetaMergeJoin .
{code:java}
new EnumerableThetaMergeJoin(cluster, traits, left, right, 
info.getEquiCondition(left, right, cluster.getRexBuilder()), 
info.getRemaining(cluster.getRexBuilder()), info.leftKeys, info.rightKeys, 
join.getVariablesSet(), join.getJoinType());{code}
I implement the  EnumerableThetaMergeJoin to handle a theta join with equi keys 
.

The key difference of  EnumerableThetaMergeJoin and  EnumerableMergeJoin is 
that:

EnumerableThetaMergeJoin use a predicate generated by the remaining part of the 
JoinInfo,

and the  predicate will apply on the cartesians result  of a merge join.

see 
[https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298|https://github.com/apache/calcite/blob/27d883983e76691f9294e5edd9e264b978dfa7e9/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L3298]

 
{code:java}
public TResult current() {
  final List list = cartesians.current();
  @SuppressWarnings("unchecked") final TSource left =
  (TSource) list.get(0);
  @SuppressWarnings("unchecked") final TInner right =
  (TInner) list.get(1);
  //apply predicate for the result in cartesians
  boolean isNonEquiPredicateSatisfied=predicate.apply(left, right);
  if (!isNonEquiPredicateSatisfied) {
if (generateNullsOnLeft) {
  return resultSelector.apply(null, right);
}
if (generateNullsOnRight) {
  return resultSelector.apply(left, null);
}
  }
  return resultSelector.apply(left, right);
}
{code}
 

 


was (Author: hhlai1990):
[~julianhyde] , consider another query that the join conditions contains an 
equi condition and a non-equi condition meanwhile :

 
{code:java}
SELECT t1.i_item_desc FROM item t1 LEFT OUTER JOIN item_1 t2 ON 
t1.i_item_sk=t2.i_item_sk and t2.i_item_sk <1{code}
 Merge join is  also good for this query. But now it will be converted to a 
nested loop join.

 

 

> Make EnumerableMergeJoinRule to support a theta join
> 
>
> Key: CALCITE-2973
> URL: https://issues.apache.org/jira/browse/CALCITE-2973
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Lai Zhou
>Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports an inner and equi join.
> If users make a theta-join query  for a large dataset (such as 1*1), 
> the nested-loop join process will take dozens of time than the sort-merge 
> join process .
> So if we can apply merge-join or hash-join rule for a theta join, it will 
> improve the performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2966) Problem with Code Generation

2019-04-02 Thread Julian Feinauer (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807566#comment-16807566
 ] 

Julian Feinauer commented on CALCITE-2966:
--

First, thanks [~danny0405]... are we sure that this is a bug or could it be due 
to me doing something wrong with the API?
Can I supply you with more informations about my code?

> Problem with Code Generation
> 
>
> Key: CALCITE-2966
> URL: https://issues.apache.org/jira/browse/CALCITE-2966
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.20.0
>Reporter: Danny Chan
>Assignee: Danny Chan
>Priority: Major
> Fix For: 1.20.0
>
>
> From the mailing list:
> Hi all,
> I have some problems with the code generation from Linq4j which I'm unable to 
> resolve myself.
> Basically, I want to translate a condition from Rex to a Linq4j expression to 
> use it in generated code.
> In my example the Condition is from Match Recognize and in SQL is: 
> `up."commission" > prev(up."commission")`.
>  
> ```
> RexBuilder rexBuilder = new RexBuilder(implementor.getTypeFactory());
> RexProgramBuilder rexProgramBuilder = new 
> RexProgramBuilder(physType.getRowType(), rexBuilder);
>  
> rexProgramBuilder.addCondition(entry.getValue());
>  
> final Expression condition = 
> RexToLixTranslator.translateCondition(rexProgramBuilder.getProgram(),
>           (JavaTypeFactory) getCluster().getTypeFactory(),
>           builder2,
>           inputGetter1,
>           implementor.allCorrelateVariables,
>           implementor.getConformance());
>  
> builder2.add(Expressions.return_(null, condition));
> ```
>  
> Here, the condition seems okay, it is: ">(PREV(UP.$4, 0), PREV(UP.$4, 1))",  
> so it should be a comparison of two variables (I rewrite the PREV with a 
> custom Input Getter".
> But, the generated code (for Janino) is:
>  
> ```
> Object p1 = row_.get($L4J$C$0_1);
> org.apache.calcite.test.JdbcTest.Employee p0 = 
> (org.apache.calcite.test.JdbcTest.Employee) p1;
> Object p3 = row_.get($L4J$C$1_1);
> org.apache.calcite.test.JdbcTest.Employee p2 = 
> (org.apache.calcite.test.JdbcTest.Employee) p3;
> Object p5 = row_.get($L4J$C$0_1);
> org.apache.calcite.test.JdbcTest.Employee p4 = 
> (org.apache.calcite.test.JdbcTest.Employee) p5;
> Object p7 = row_.get($L4J$C$1_1);
> org.apache.calcite.test.JdbcTest.Employee p6 = 
> (org.apache.calcite.test.JdbcTest.Employee) p7;
> return p0.commission && p2.commission && p4.commission > p6.commission;
> ```
>  
> This confuses me a lot as I do not know where the check for p0.commission and 
> p2.commission comes from.
> It seems that Linq4j adds them as it expects these variables to be nullable, 
> but I have no idea on how to avoid this.
> These fields are Numeric so I always get a compilation exception.
>  
> Can someone help me with this issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-1338) JoinProjectTransposeRule makes wrong transformation when the right child of left outer join has a RexLiteral project expression.

2019-04-02 Thread Chunwei Lei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807488#comment-16807488
 ] 

Chunwei Lei commented on CALCITE-1338:
--

I will take a look into this since it will cause data error.

> JoinProjectTransposeRule makes wrong transformation when the right child of 
> left outer join has a RexLiteral project expression.
> 
>
> Key: CALCITE-1338
> URL: https://issues.apache.org/jira/browse/CALCITE-1338
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Yi Xinglu
>Priority: Major
>
> h3. SQL: 
> {code:borderStyle=solid}
> SELECT * 
> FROM dept a 
> LEFT JOIN (
>   SELECT b.name, 1 FROM dept b
> ) AS b 
> ON a.name=b.name
> {code}
> h3. Selected rule set:
> {code:borderStyle=solid}
>   SubQueryRemoveRule.JOIN
>   JoinProjectTransposeRule.RIGHT_PROJECT_INCLUDE_OUTER
> {code}
> h3. Optimized logical plan:
> {code:borderStyle=solid}
> LogicalProject(DEPTNO=[$0], NAME=[$1], NAME0=[$2], EXPR$1=[$3])
>   LogicalProject(DEPTNO=[$0], NAME=[$1], NAME0=[$2], 
> EXPR$1=[CAST($3):INTEGER])
> LogicalProject(DEPTNO=[$0], NAME=[$1], NAME0=[$3], EXPR$1=[1])
>   LogicalJoin(condition=[=($3, $1)], joinType=[left])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> h3. The right logical plan should be
> {code:borderStyle=solid}
> LogicalProject(DEPTNO=[$0], NAME=[$1], NAME0=[$2], EXPR$1=[$3])
>   LogicalJoin(condition=[=($2, $1)], joinType=[left])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> LogicalProject(NAME=[$1], EXPR$1=[1])
>   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> h3. Summary
> The RexLiteral project expression will make logical plan get different 
> results when it's right child or parent node of left outer join.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2004) The TableFunction left outer join was improperly translated

2019-04-02 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807463#comment-16807463
 ] 

Julian Hyde commented on CALCITE-2004:
--

[~hyuan] +1 nice work, well tested. Go ahead and merge to master.

> The TableFunction left outer join was improperly translated
> ---
>
> Key: CALCITE-2004
> URL: https://issues.apache.org/jira/browse/CALCITE-2004
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xingcan Cui
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently the TableFunction left outer join is translated to a 
> {{LogicalCorrelate}} followed by a {{LogicalFilter}}. For instance, given a 
> simple table {{WordCount(word:String, frequency:Int)}}, a table function 
> {{split: word:String => (letter:String, length:String)}}, and a query 
> {{SELECT word, letter, length FROM WordCount LEFT JOIN LATERAL 
> TABLE(split(word)) AS T (letter, length) ON frequency = length OR length < 
> 5"}} the query will be translated to the logical plan below.
> {code:java}
> LogicalProject(word=[$0], name=[$2], length=[$3])
>   LogicalFilter(condition=[OR(=($1, CAST($3):BIGINT), <($3, 5))])
> LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[WordCount]])
>   LogicalTableFunctionScan(invocation=[split($cor0.word)], 
> rowType=[RecordType(VARCHAR(65536) _1, INTEGER _2)], elementType=[class 
> [Ljava.lang.Object;])
> {code}
> With this plan, an outer row might be completely filtered out (instead of 
> being padded with null) if the join predicate filter filters out all join 
> results produced by a correlate join for the outer row. Maybe we should 
> attach the filter to the correlate operation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2754) LISTAGG support

2019-04-02 Thread Chunwei Lei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807421#comment-16807421
 ] 

Chunwei Lei edited comment on CALCITE-2754 at 4/2/19 6:08 AM:
--

I opened [PR#1142|https://github.com/apache/calcite/pull/1142] addressing all 
comments. Please feel free to close it if you still work on it.


was (Author: chunwei lei):
I opened [PR#1142|https://github.com/apache/calcite/pull/1142] addressing all 
comments. Please feel free to close it if you still work on it.

 

 

> LISTAGG support
> ---
>
> Key: CALCITE-2754
> URL: https://issues.apache.org/jira/browse/CALCITE-2754
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Inspired by CALCITE-2224.
> There is an implementation of LISTAGG mentioned in description of CALCITE-2224
> Unfortunately there is no info about LISTAGG in public available SQL:2016 
> draft at [1] while there is some description about LISTAGG as feature T625 
> e.g.at [2]
> [1] http://jtc1sc32.org/doc/N2551-2600/32N2572T-text_for_ballot-DIS_9075-2.pdf
> [2] https://modern-sql.com/feature/listagg



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CALCITE-2754) LISTAGG support

2019-04-02 Thread Chunwei Lei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807421#comment-16807421
 ] 

Chunwei Lei edited comment on CALCITE-2754 at 4/2/19 6:07 AM:
--

I opened [PR#1142|https://github.com/apache/calcite/pull/1142] addressing all 
comments. Please feel free to close it if you still work on it.

 

 


was (Author: chunwei lei):
I opened [PR#1142|[https://github.com/apache/calcite/pull/1142]] addressing all 
comments. Please feel free to close it if you still work on it.

 

 

> LISTAGG support
> ---
>
> Key: CALCITE-2754
> URL: https://issues.apache.org/jira/browse/CALCITE-2754
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Inspired by CALCITE-2224.
> There is an implementation of LISTAGG mentioned in description of CALCITE-2224
> Unfortunately there is no info about LISTAGG in public available SQL:2016 
> draft at [1] while there is some description about LISTAGG as feature T625 
> e.g.at [2]
> [1] http://jtc1sc32.org/doc/N2551-2600/32N2572T-text_for_ballot-DIS_9075-2.pdf
> [2] https://modern-sql.com/feature/listagg



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2754) LISTAGG support

2019-04-02 Thread Chunwei Lei (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807421#comment-16807421
 ] 

Chunwei Lei commented on CALCITE-2754:
--

I opened [PR#1142|[https://github.com/apache/calcite/pull/1142]] addressing all 
comments. Please feel free to close it if you still work on it.

 

 

> LISTAGG support
> ---
>
> Key: CALCITE-2754
> URL: https://issues.apache.org/jira/browse/CALCITE-2754
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Inspired by CALCITE-2224.
> There is an implementation of LISTAGG mentioned in description of CALCITE-2224
> Unfortunately there is no info about LISTAGG in public available SQL:2016 
> draft at [1] while there is some description about LISTAGG as feature T625 
> e.g.at [2]
> [1] http://jtc1sc32.org/doc/N2551-2600/32N2572T-text_for_ballot-DIS_9075-2.pdf
> [2] https://modern-sql.com/feature/listagg



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)