[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-31 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853520#comment-16853520
 ] 

Danny Chan commented on CALCITE-2969:
-

Thanks [~rubenql], I will recheck the doc to see if I missed something.

> Improve design of join-like relational expressions
> --
>
> Key: CALCITE-2969
> URL: https://issues.apache.org/jira/browse/CALCITE-2969
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Stamatis Zampetakis
>Assignee: Danny Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.20.0
>
>  Time Spent: 31h
>  Remaining Estimate: 0h
>
> The existing join-like (Join, SemiJoin, Correlate, etc.) logical and physical 
> relational expressions have a few design issues which make some parts of the 
> codebase complicated and difficult to understand.
> The goal of this ticket is to improve the design of the respective 
> expressions based on the discussion in the dev list (see thread [Join, 
> SemiJoin, 
> Correlate|https://mail-archives.apache.org/mod_mbox/calcite-dev/201903.mbox/%3C8EEA04A0-4A77-4283-BD20-B019E19AE126%40apache.org%3E]).
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-31 Thread Ruben Quesada Lopez (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16853085#comment-16853085
 ] 

Ruben Quesada Lopez commented on CALCITE-2969:
--

I think there's one small detail that we forgot in this ticket: the file 
{{algebra.md}} still references the deprecated {{SemiJoin}} when describing the 
{{semiJoin(expr)}} method:
{code}
...
| `semiJoin(expr)` | Creates a [SemiJoin]({{ site.apiRoot }}/org/apache/calcite/rel/core/SemiJoin.html) of the two most recent relational expressions.
{code}
This is no longer correct, since this method now creates a {{Join}} with 
{{type=SEMI}}.
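
The point above, that a semi-join is now a {{Join}} carrying a join type rather than a separate {{SemiJoin}} node, can be pictured with a small self-contained Java sketch. The names {{JoinSketch}}, {{JoinKind}}, {{Join}} and {{outputFields}} are hypothetical and only illustrate the idea; they are not Calcite's classes or signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration only; NOT Calcite's API.
class JoinSketch {
  /** Join kinds; SEMI and ANTI return only left-hand columns. */
  enum JoinKind {
    INNER, LEFT, SEMI, ANTI;

    /** Whether the output row includes the right input's columns. */
    boolean projectsRight() {
      return this != SEMI && this != ANTI;
    }
  }

  /** A single join node that carries its kind, instead of a SemiJoin subclass. */
  static final class Join {
    final JoinKind kind;
    final List<String> leftFields;
    final List<String> rightFields;

    Join(JoinKind kind, List<String> leftFields, List<String> rightFields) {
      this.kind = kind;
      this.leftFields = leftFields;
      this.rightFields = rightFields;
    }

    /** Output field names: left fields, plus right fields when the kind projects them. */
    List<String> outputFields() {
      List<String> out = new ArrayList<>(leftFields);
      if (kind.projectsRight()) {
        out.addAll(rightFields);
      }
      return out;
    }
  }

  public static void main(String[] args) {
    List<String> emp = Arrays.asList("EMPNO", "DEPTNO");
    List<String> dept = Arrays.asList("DEPTNO0", "NAME");
    System.out.println(new Join(JoinKind.INNER, emp, dept).outputFields());
    System.out.println(new Join(JoinKind.SEMI, emp, dept).outputFields());
  }
}
```

In this sketch the only difference between an inner join and a semi-join is the kind stored on the node, which is why documentation describing a dedicated {{SemiJoin}} class would become stale.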



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-30 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852065#comment-16852065
 ] 

Julian Hyde commented on CALCITE-2969:
--

There are a whole bunch of deprecation warnings. Generally we stop using APIs 
internally when they are deprecated. [~danny0405], can you please fix these 
warnings before 1.20?

{noformat}
core/src/main/java/org/apache/calcite/rel/logical/LogicalCorrelate.java:[29,30] org.apache.calcite.sql.SemiJoinType in org.apache.calcite.sql has been deprecated
core/src/main/java/org/apache/calcite/rel/core/Correlate.java:[30,30] org.apache.calcite.sql.SemiJoinType in org.apache.calcite.sql has been deprecated
core/src/main/java/org/apache/calcite/rel/mutable/MutableRels.java:[35,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdRowCount.java:[30,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableSemiJoinRule.java:[22,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/rules/JoinToCorrelateRule.java:[27,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdNodeTypes.java:[29,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdCollation.java:[23,45] org.apache.calcite.adapter.enumerable.EnumerableSemiJoin in org.apache.calcite.adapter.enumerable has been deprecated
core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableSemiJoin.java:[31,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdPopulationSize.java:[25,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdUniqueKeys.java:[26,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdColumnUniqueness.java:[33,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdUtil.java:[28,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdSize.java:[29,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdDistinctRowCount.java:[27,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/mutable/MutableRels.java:[366,24] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/mutable/MutableRels.java:[367,13] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/mutable/MutableRels.java:[367,34] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdRowCount.java:[191,29] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/tools/Programs.java:[89,26] ENUMERABLE_SEMI_JOIN_RULE in org.apache.calcite.adapter.enumerable.EnumerableRules has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdNodeTypes.java:[117,67] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdNodeTypes.java:[119,30] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdCollation.java:[159,49] org.apache.calcite.adapter.enumerable.EnumerableSemiJoin in org.apache.calcite.adapter.enumerable has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdPopulationSize.java:[88,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdUniqueKeys.java:[209,45] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated
core/src/main/java/org/apache/calcite/rel/metadata/RelMdColumnUniqueness.java:[316,35] org.apache.calcite.rel.core.SemiJoin in org.apache.calcite.rel.core has been deprecated

[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-30 Thread Ruben Quesada Lopez (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851949#comment-16851949
 ] 

Ruben Quesada Lopez commented on CALCITE-2969:
--

Another belated +1 from me. Thanks [~danny0405] for your hard work on this one!



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-30 Thread Michael Mior (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851819#comment-16851819
 ] 

Michael Mior commented on CALCITE-2969:
---

A belated +1 from me. Thanks [~danny0405]!



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-29 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851455#comment-16851455
 ] 

Danny Chan commented on CALCITE-2969:
-

Thanks so much for all your nice review comments [~julianhyde] [~zabetak] 
[~hyuan] [~rubenql]! I will merge this PR today after rebasing onto master :)

Things to do next:
 * I think CALCITE-2968 should stay open, e.g. for the new anti-join Enumerable 
support and for adapting the metadata to anti-join.
 * I think CALCITE-2857 could be renamed to something like "better plan for anti 
join" after we finish CALCITE-3089.
 * +1 for CALCITE-3089 to be in 1.21.
 * +1 for marking CALCITE-3037 'duplicate' when this PR is merged.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-29 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851365#comment-16851365
 ] 

Haisheng Yuan commented on CALCITE-2969:


Regarding [CALCITE-2857|https://issues.apache.org/jira/browse/CALCITE-2857], I 
think we may still need to keep it open, but change its subject to "generate a 
better plan for that kind of subquery", if there is no known duplicate issue.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-29 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851356#comment-16851356
 ] 

Stamatis Zampetakis commented on CALCITE-2969:
--

+1, apart from great work nothing more to say ;) Thanks a lot [~danny0405] and 
of course all the people who pushed this forward.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-29 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851339#comment-16851339
 ] 

Haisheng Yuan commented on CALCITE-2969:


+1, I think it is good to go, except for a small comment update.
I think we can do 
[CALCITE-3089|https://issues.apache.org/jira/browse/CALCITE-3089] next.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-29 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851327#comment-16851327
 ] 

Julian Hyde commented on CALCITE-2969:
--

+1 I think this change is ready to go in now (i.e. before 1.20). It is a 
significant change (92 files, 900 lines added, 700 lines removed) and will rot 
if we do not merge soon. [~danny0405] has put in a lot of work and we owe it to 
him to approve or to tell him what it would take to fix it.

What do you think, [~michaelmior], [~zabetak], [~hyuan], [~rubenql]?

Review comment:
* I noticed that in many cases where {{isSemiJoin}} is called, we would want 
the same behavior for ANTI_JOIN as for SEMI_JOIN. It's not surprising, because 
we haven't really got anti-join working. It's something we could fix in a 
follow-up case.
* EnumerableMergeJoin (still) extends EquiJoin. I guess that we will make it 
extend (just) Join in CALCITE-3089.
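
To illustrate why code that branches on a semi-join check often wants the same treatment for anti-join, here is a minimal, self-contained Java sketch of the two operators over in-memory lists. It is an assumption-laden illustration, not Calcite code: {{SemiAntiSketch}} and {{filterLeft}} are hypothetical names, and equality on integers stands in for the join condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of Calcite.
class SemiAntiSketch {
  /**
   * Returns the left values that have a match on the right (semi-join)
   * or that have no match on the right (anti-join).
   * Either way, only left-hand values are emitted.
   */
  static List<Integer> filterLeft(List<Integer> left, List<Integer> right, boolean anti) {
    List<Integer> out = new ArrayList<>();
    for (Integer l : left) {
      boolean matched = right.contains(l);
      // Semi keeps matched rows; anti keeps unmatched rows.
      if (matched != anti) {
        out.add(l);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> left = Arrays.asList(10, 20, 30, 40);
    List<Integer> right = Arrays.asList(20, 40, 50);
    System.out.println(filterLeft(left, right, false)); // prints [20, 40]
    System.out.println(filterLeft(left, right, true));  // prints [10, 30]
  }
}
```

Both variants emit only left-hand values and never duplicate a left row, so logic that depends on the output shape (rather than on which rows survive) would plausibly apply to both.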

What are the next steps?
* I think CALCITE-2857 and CALCITE-2968 would become 'wontfix'.
* Shall we do CALCITE-3089 next? I guess it would have to be in 1.21.
* It seems that we have covered CALCITE-3037 in this change, so we should mark 
it 'duplicate' when we merge.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-28 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16850114#comment-16850114
 ] 

Julian Hyde commented on CALCITE-2969:
--

Yes. We have until Friday, right? I'm going to do a final review today, but I 
think this change is good enough. Maybe we log some issues to follow up in 1.21.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-27 Thread Michael Mior (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849099#comment-16849099
 ] 

Michael Mior commented on CALCITE-2969:
---

[~julianhyde] Do you still think 1.20.0 should wait on this?



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-17 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842244#comment-16842244
 ] 

Danny Chan commented on CALCITE-2969:
-

Thanks [~julianhyde] and [~zabetak]. I think it is okay to wait until all of us 
agree with the change; after all, it has some breaking changes. I will try to 
respond to all of your nice comments.

I would like to keep the commit messages for now, because the PR has been open 
for a long time and the commit messages are a good record of the review history.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-17 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842224#comment-16842224
 ] 

Julian Hyde commented on CALCITE-2969:
--

I agree with [~zabetak]. I am on vacation until Monday and not able to do 
serious review before then. May I suggest the following. Squash and rebase the 
existing PR into a new branch with a single commit that is a candidate to 
merge. Make sure that the breaking API changes are documented in history.md. 
Then I intend to review it and see whether it feels like an improvement, 
whether it addresses the issues raised during review, and whether the breaking 
API changes are justified.

For the record, I am fine with breaking API changes, if they make things 
significantly simpler. We should especially consider marking APIs deprecated 
for one minor release (i.e. deprecated in 1.20, gone in 1.21).



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-17 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842134#comment-16842134
 ] 

Stamatis Zampetakis commented on CALCITE-2969:
--

Thanks for the reminder [~danny0405]! Indeed the PR is in good shape.

However, I think we should not rush it, since it is a big and important 
refactoring. The 24h window may be a bit short for some people. Moreover, there 
are still open conversations on GitHub which I think should be resolved (by the 
reviewers) before proceeding. [~vlsi] has also made some comments, so I am 
adding him to the conversation.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-05-16 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16841860#comment-16841860
 ] 

Danny Chan commented on CALCITE-2969:
-

Thanks for all of your nice review comments [~julianhyde] [~hyuan] [~zabetak] 
[~rubenql]. I think the PR is fairly stable now (no critical bugs).

So I'm planning to merge this PR in 24 hours if there are no more comments. Feel 
free to add new comments; I'm glad to fix things up in later patches if we 
find fatal defects in the current patch.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-04-29 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829782#comment-16829782
 ] 

Haisheng Yuan commented on CALCITE-2969:


The PR on GitHub is crowded with our comments, so I will move the discussion here.
{quote}
The reason for variablesSet is correlated scalar sub-query in the ON clause. 
Similarly Filter needs variablesSet if there is a correlated scalar sub-query 
in its condition, and Project needs variablesSet for correlated scalar 
sub-query in one of the project expressions.

This stuff isn't very well tested, but it is needed because sub-query can occur 
anywhere in a query.
{quote}

If the variablesSet is defined as the relations outside the current sub-query 
context that it references, then it makes sense. Almost every operator in the 
sub-query may reference a column outside the sub-query, and thus needs the 
variablesSet.

But I checked the code: only Filter actually creates a variablesSet and uses it. 
Project's constructor doesn't even accept a variablesSet. Although Join has a 
variablesSet, it is never used, which can cause a wrong plan, as with the 
following query:

{code:sql}
select empno from sales.emp as r left join sales.dept as s 
on 1= (select s.deptno from sales.emp e1 left outer join sales.dept d1 
 on e1.empno+d1.deptno = s.deptno+20)
{code}

Initial plan is:
{code:java}
LogicalProject(EMPNO=[$0])
  LogicalJoin(condition=[=(1, $SCALAR_QUERY({
LogicalProject(DEPTNO=[$cor1.DEPTNO0])
  LogicalJoin(condition=[=(+($0, $9), +($cor0.DEPTNO0, 20))], joinType=[left])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
}))], joinType=[left])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}

The Project and Join inside the sub-query should have a non-empty variablesSet, 
but neither Project nor Join ever sets its variablesSet.

Final plan is obviously wrong:
{code:java}
LogicalProject(EMPNO=[$0])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10])
    LogicalJoin(condition=[=(1, $11)], joinType=[left])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalJoin(condition=[true], joinType=[left])
        LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
        LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
          LogicalProject(DEPTNO=[$cor1.DEPTNO0])
            LogicalJoin(condition=[=(+($0, $9), +($cor0.DEPTNO0, 20))], joinType=[left])
              LogicalTableScan(table=[[CATALOG, SALES, EMP]])
              LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}

But anyway, this is off topic for the current issue. I agree we should keep the 
variablesSet. Thanks for the clarification.
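
As a rough way to see which correlation variables the plans above still reference, the following self-contained Java sketch scans plan text for {{$corN}} tokens. This is only an illustration under stated assumptions: {{CorrelVarSketch}} and {{correlationVariables}} are hypothetical names, and scanning printed plan text is not how Calcite itself tracks a variablesSet.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Calcite.
class CorrelVarSketch {
  private static final Pattern COR = Pattern.compile("\\$cor\\d+");

  /** Collects the distinct "$corN" names that appear in a printed plan, sorted. */
  static Set<String> correlationVariables(String planText) {
    Set<String> ids = new TreeSet<>();
    Matcher m = COR.matcher(planText);
    while (m.find()) {
      ids.add(m.group());
    }
    return ids;
  }

  public static void main(String[] args) {
    // Two lines copied from the sub-query part of the plan in this comment.
    String plan =
        "LogicalProject(DEPTNO=[$cor1.DEPTNO0])\n"
            + "  LogicalJoin(condition=[=(+($0, $9), +($cor0.DEPTNO0, 20))], joinType=[left])\n";
    System.out.println(correlationVariables(plan)); // prints [$cor0, $cor1]
  }
}
```

Applied to the final plan, the same tokens would still be found, with no Correlate above them to supply values.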




[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-04-27 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827538#comment-16827538
 ] 

Danny Chan commented on CALCITE-2969:
-

[~zabetak] Thanks so much for your suggestions.
{quote}To sum up my suggestions are the following:
 * keep Join and Correlate separate and do not allow joins to have correlated 
variables;
 * retain EnumerableCorrelate without changes;
 * rename EnumerableThetaJoin to EnumerableNestedLoopJoin;{quote}
In the current patch I have kept EnumerableCorrelate as it was. I am already 
planning to rename EnumerableThetaJoin to EnumerableNestedLoopJoin in another 
patch, so I think we only need to confirm the first point.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-04-27 Thread Stamatis Zampetakis (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827499#comment-16827499
 ] 

Stamatis Zampetakis commented on CALCITE-2969:
--

Thanks again for pushing this forward [~danny0405]!

At the moment, I have some rather high level comments.

As far as I can see, the PR assumes that Join and Correlate are not the same, 
and I think it should remain like that for the moment. Given that they are 
different, a Join should never have correlated variables. This precondition 
should allow us to simplify those parts of the code where we have joins and 
perform various checks for correlated variables.

At first I thought that EnumerableCorrelate should be renamed 
EnumerableNestedLoopJoin, but now I have second thoughts. Undeniably the two 
are very close, but the fact that we set variables on one side of the join 
could help us classify it purely as a Correlate. We could have an 
EnumerableNestedLoopJoin which, rather than setting variables on one side of 
the join, performs the classic double for-loop, with the operator extracting 
the necessary join attributes from both sides to apply the join condition. I 
have the impression that EnumerableThetaJoin is very close to the 
EnumerableNestedLoopJoin that I describe above.
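
To make the distinction concrete, here is a minimal sketch of the two evaluation styles. It is not Calcite code: the class {{JoinStrategies}} and both method names are hypothetical, and rows are simplified to plain Java objects. The first method is the classic double for-loop, where the operator itself evaluates the condition on a pair of rows. The second binds each left row as a variable and re-evaluates the right side under that binding, which is the behaviour that would classify an operator as a Correlate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Illustrative only: hypothetical helper, not a Calcite class. */
final class JoinStrategies {
  private JoinStrategies() {}

  /** Nested loop join: double for-loop, the operator applies the condition to both rows. */
  static <L, R> List<String> nestedLoopJoin(
      List<L> left, List<R> right, BiPredicate<L, R> condition) {
    List<String> out = new ArrayList<>();
    for (L l : left) {
      for (R r : right) {
        if (condition.test(l, r)) {
          out.add(l + "," + r);
        }
      }
    }
    return out;
  }

  /** Correlate style: each left row is bound as a variable and the right side is re-evaluated. */
  static <L, R> List<String> correlate(
      List<L> left, Function<L, List<R>> rightForBinding) {
    List<String> out = new ArrayList<>();
    for (L l : left) {
      // The right input sees the current left row through the binding.
      for (R r : rightForBinding.apply(l)) {
        out.add(l + "," + r);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> emps = Arrays.asList(10, 20, 30);
    List<Integer> depts = Arrays.asList(20, 30, 40);
    System.out.println(nestedLoopJoin(emps, depts, (e, d) -> e.equals(d)));
    System.out.println(correlate(emps,
        e -> depts.stream().filter(d -> d.equals(e)).collect(Collectors.toList())));
  }
}
```

Both calls produce the same rows for an inner equi-join; the difference is only in where the join condition is evaluated.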

To sum up my suggestions are the following:
 * keep Join and Correlate separate and do not allow joins to have correlated 
variables;
 * retain EnumerableCorrelate without changes;
 * rename EnumerableThetaJoin to EnumerableNestedLoopJoin;

but let's see what the others have to say about this. I had a look at the PR 
and it seems that [~hyuan] and [~rubenql] also agree on this.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-04-22 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823651#comment-16823651
 ] 

Danny Chan commented on CALCITE-2969:
-

[~julianhyde] Thanks for your review. I have made the following changes:
 * Added the breaking changes to history.md
 * Renamed returnsJustFirstInput to projectsRight, which returns false for semi 
and anti joins
 * Replaced the check 
{code:java}
if (rel.getJoinType() != JoinRelType.INNER
&& !rel.getJoinType().returnsJustFirstInput()) 
{code}
with 
{code:java}
rel.getJoinType().generatesNullsOnLeft()
|| rel.getJoinType().generatesNullsOnRight(){code}

 * Added method isCorrelated to Join, but kept method isNonCorrelatedSemiJoin, 
because I don't want to repeat the compound condition
{code:java}
isCorrelated() && isSemiJoin()
{code}
everywhere
 * Removed the null check on {{variablesSet}}, since it is never null
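
As a quick sanity check on the replaced condition, the sketch below compares the old and new predicates for every join type. The enum {{SketchJoinType}} is a hypothetical stand-in, not the Calcite class. The definition of {{projectsRight()}} follows the description above (false for semi and anti joins); the definitions of {{generatesNullsOnLeft()}} and {{generatesNullsOnRight()}} are assumptions based on the usual outer join semantics.

```java
/** Hypothetical stand-in for a join-type enum; not the Calcite class itself. */
enum SketchJoinType {
  INNER, LEFT, RIGHT, FULL, SEMI, ANTI;

  /** Assumed: the left side is null-padded for RIGHT and FULL joins. */
  boolean generatesNullsOnLeft() {
    return this == RIGHT || this == FULL;
  }

  /** Assumed: the right side is null-padded for LEFT and FULL joins. */
  boolean generatesNullsOnRight() {
    return this == LEFT || this == FULL;
  }

  /** False for semi and anti joins, which return only columns of the left input. */
  boolean projectsRight() {
    return this != SEMI && this != ANTI;
  }

  /** Old check: type != INNER && !returnsJustFirstInput(). */
  boolean oldCheck() {
    return this != INNER && projectsRight();
  }

  /** New check: generatesNullsOnLeft() || generatesNullsOnRight(). */
  boolean newCheck() {
    return generatesNullsOnLeft() || generatesNullsOnRight();
  }

  public static void main(String[] args) {
    for (SketchJoinType t : values()) {
      System.out.println(t + ": old=" + t.oldCheck() + ", new=" + t.newCheck());
    }
  }
}
```

Under these assumptions the two predicates agree for all six join types: both hold exactly for the outer joins (LEFT, RIGHT, FULL).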



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-04-22 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823296#comment-16823296
 ] 

Julian Hyde commented on CALCITE-2969:
--

I have reviewed. This looks very promising, and I think we will be able to 
merge soon.
* [~danny0405], while we're reviewing and evolving, can you add commits rather 
than force-push? Of course we can squash and rebase before the final commit.
* Can you add a section to HISTORY.md (which will become the release notes for 
1.20) with the list of breaking changes? Breaking changes are things that would 
cause user code to not compile (say, renaming a class), or to not link if it 
has been compiled already. By that definition, deprecations are not breaking 
changes. We will need to review to see whether the list of breaking changes is 
acceptable from a cost:benefit perspective.
* In deprecated APIs, can you indicate what the API has been replaced with? 
Usually a '@deprecated' javadoc comment works well.
* The name {{returnsJustFirstInput}} isn't working for me. How about 
{{projectsRight()}}, which returns false for semi-join? Analogous to 
{{generatesNullsOnRight()}}.
* The check {code}if (rel.getJoinType() != JoinRelType.INNER
&& !rel.getJoinType().returnsJustFirstInput()) {code} seems redundant.
* I agree with [~hyuan] that {{isNonCorrelateSemiJoin}} is confusing. Maybe 
have a method {{boolean isCorrelated()}}.
* {{variablesSet}} is not-null; so {{variablesSet == null}} is over-defensive 
and should be removed.
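
For the deprecation point, a minimal sketch of the javadoc convention might look as follows. The class {{DeprecationSketch}} and the boolean parameter are hypothetical and only illustrate how a {{@deprecated}} tag can name the replacement; they are not the signatures used in the patch.

```java
/** Hypothetical API used only to illustrate the javadoc convention. */
final class DeprecationSketch {
  private DeprecationSketch() {}

  /** Replacement method: true unless the join is a semi or anti join. */
  static boolean projectsRight(boolean semiOrAnti) {
    return !semiOrAnti;
  }

  /**
   * Old method, kept so that existing callers still compile.
   *
   * @deprecated Use {@link #projectsRight(boolean)} and negate the result.
   */
  @Deprecated
  static boolean returnsJustFirstInput(boolean semiOrAnti) {
    return !projectsRight(semiOrAnti);
  }

  public static void main(String[] args) {
    System.out.println(projectsRight(true));
    System.out.println(returnsJustFirstInput(true));
  }
}
```

Keeping the old method as a thin wrapper means callers get a compiler warning with a pointer to the replacement, rather than a compile error.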



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-04-12 Thread Danny Chan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816820#comment-16816820
 ] 

Danny Chan commented on CALCITE-2969:
-

[~hyuan] I think so too. The first PR mostly modifies SemiJoin and Correlate; 
theta join and anti join should be listed here as well, and I plan to handle 
them in later patches.



[jira] [Commented] (CALCITE-2969) Improve design of join-like relational expressions

2019-04-12 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816681#comment-16816681
 ] 

Haisheng Yuan commented on CALCITE-2969:


I was assuming this is just an umbrella issue and there might be some 
sub-tasks. 
