[jira] [Commented] (CALCITE-3972) Allow RelBuilder to create RelNode with convention and use it for trait convert

2020-05-09 Thread Xiening Dai (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103644#comment-17103644
 ] 

Xiening Dai commented on CALCITE-3972:
--

The thing is, you don't want people to have to update RelFactories.java whenever 
a new convention becomes available. Convention#getRelFactories separates the 
logic and can live entirely in the user's code base instead of in Calcite core.
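
To make the argument concrete, here is a minimal sketch of the idea, assuming simplified stand-in types. `RelFactoriesSketch`, `ConventionSketch`, `MyEngineConvention`, and `ConventionFactoriesDemo` are hypothetical names invented for illustration and are not Calcite's actual interfaces or signatures. The point is only that a convention defined in user code can supply its own factories, so core code never needs to enumerate conventions.

```java
// Hypothetical stand-in for a bundle of rel factories (not the Calcite type).
interface RelFactoriesSketch {
  String createFilter(String condition);
}

// Hypothetical stand-in for a convention that exposes its own factories.
interface ConventionSketch {
  String getName();

  // The hook under discussion: each convention supplies its factories,
  // so adding a convention does not require editing a central class.
  RelFactoriesSketch getRelFactories();
}

// A convention that lives entirely in "user code".
final class MyEngineConvention implements ConventionSketch {
  @Override public String getName() {
    return "MY_ENGINE";
  }

  @Override public RelFactoriesSketch getRelFactories() {
    return condition -> "MyEngineFilter(" + condition + ")";
  }
}

final class ConventionFactoriesDemo {
  // "Core" code depends only on the interface and never names MyEngineConvention.
  static String buildFilter(ConventionSketch convention, String condition) {
    return convention.getRelFactories().createFilter(condition);
  }
}
```

With the alternative, a static lookup keyed by convention, the lookup itself would have to know about every convention, which is the coupling the comment above argues against.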

> Allow RelBuilder to create RelNode with convention and use it for trait 
> convert
> ---
>
> Key: CALCITE-3972
> URL: https://issues.apache.org/jira/browse/CALCITE-3972
> Project: Calcite
>  Issue Type: Bug
>Reporter: Xiening Dai
>Assignee: Xiening Dai
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> 1. Provide Convention.transformRelBuilder() to transform an existing 
> RelBuilder into one with specific convention.
> 2. RelBuilder provides a withRelFactories() method that lets the caller swap the 
> underlying RelFactories and create a new builder. 
> 3. Use the new interface in RelCollationTraitDef for converting into 
> RelCollation traits.
> We can avoid ~1/3 of total rule firings in an N-way join case with this change.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CALCITE-3972) Allow RelBuilder to create RelNode with convention and use it for trait convert

2020-05-09 Thread Danny Chen (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103635#comment-17103635
 ] 

Danny Chen commented on CALCITE-3972:
-

Instead of adding a `getRelFactories` method to Convention, how about adding a 
utility method `RelFactories#getRelFactories(Convention)`? It seems more 
straightforward, because `RelFactories` is the factory that creates all kinds of 
factory STRUCT.



[jira] [Commented] (CALCITE-3961) VolcanoPlanner.prunedNodes information is lost when duplicate relNode is discarded

2020-05-09 Thread Botong Huang (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103628#comment-17103628
 ] 

Botong Huang commented on CALCITE-3961:
---

Thanks [~hyuan] for the review!

> VolcanoPlanner.prunedNodes information is lost when duplicate relNode is 
> discarded
> --
>
> Key: CALCITE-3961
> URL: https://issues.apache.org/jira/browse/CALCITE-3961
> Project: Calcite
>  Issue Type: Bug
>Reporter: Botong Huang
>Priority: Major
> Fix For: 1.23.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> VolcanoPlanner.prunedNodes stores the list of relNodes that are marked 
> useless. Whenever the planner sees two identical relNodes (e.g. when RelSets 
> are merged), one of them is discarded. However, when the preserved node is 
> not in the pruned list while the discarded one is, this pruned information is 
> lost. In general, we should preserve this info whenever duplicate relNodes 
> are discarded. 
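
The behaviour asked for in the description can be sketched with a toy model. This is not VolcanoPlanner code: nodes are plain strings, and `PrunedNodesSketch` and its methods are hypothetical names used only to show the pruned mark being carried over to the surviving node.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model: nodes are identified by strings such as "rel#1".
final class PrunedNodesSketch {
  private final Set<String> prunedNodes = new HashSet<>();

  void prune(String node) {
    prunedNodes.add(node);
  }

  boolean isPruned(String node) {
    return prunedNodes.contains(node);
  }

  // Called when 'discarded' turns out to be a duplicate of 'preserved'.
  void discardDuplicate(String preserved, String discarded) {
    // Set.remove returns true only if the discarded node was marked pruned;
    // in that case the mark is transferred instead of being lost.
    if (prunedNodes.remove(discarded)) {
      prunedNodes.add(preserved);
    }
  }
}
```

If the discarded node was never pruned, the preserved node's state is left unchanged.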





[jira] [Commented] (CALCITE-3982) FilterMergeRule can lead to AssertionError

2020-05-09 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103609#comment-17103609
 ] 

Julian Hyde commented on CALCITE-3982:
--

That sounds right. There are a few places where we use RexProgram as an 
intermediate utility and now that RelBuilder exists there are probably better 
(and more efficient) ways.

> FilterMergeRule can lead to AssertionError
> --
>
> Key: CALCITE-3982
> URL: https://issues.apache.org/jira/browse/CALCITE-3982
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This could potentially happen since Filter creation has a check on whether 
> the expression is flat 
> ([here|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/core/Filter.java#L74])
>  and Filter merge does not flatten an expression when it is created.
> {noformat}
> java.lang.AssertionError: AND(=($3, 100), OR(OR(null, IS NOT NULL(CAST(100):INTEGER)), =(CAST(100):INTEGER, CAST(200):INTEGER)))
>   at org.apache.calcite.rel.core.Filter.<init>(Filter.java:74)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveFilter.<init>(HiveFilter.java:39)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveFilterFactoryImpl.createFilter(HiveRelFactories.java:126)
>   at org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelBuilder.filter(HiveRelBuilder.java:99)
>   at org.apache.calcite.tools.RelBuilder.filter(RelBuilder.java:1055)
>   at org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:81)
> {noformat}
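
The failing expression in the trace has an OR nested directly inside an OR, which is what "not flat" means here. As an illustration only, the following sketch uses a toy expression tree (`ExprSketch` is a hypothetical class, not Calcite's RexNode) to show a flatness check and a flattening pass for AND/OR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy expression: either a leaf holding text, or an AND/OR call with operands.
final class ExprSketch {
  final String op;
  final List<ExprSketch> operands;

  private ExprSketch(String op, List<ExprSketch> operands) {
    this.op = op;
    this.operands = operands;
  }

  static ExprSketch leaf(String text) {
    return new ExprSketch(text, Collections.<ExprSketch>emptyList());
  }

  static ExprSketch call(String op, ExprSketch... operands) {
    return new ExprSketch(op, Arrays.asList(operands));
  }

  boolean isLeaf() {
    return operands.isEmpty();
  }

  // Flat means: no AND directly under an AND, no OR directly under an OR.
  boolean isFlat() {
    for (ExprSketch e : operands) {
      if (!e.isLeaf() && e.op.equals(op)) {
        return false;
      }
      if (!e.isFlat()) {
        return false;
      }
    }
    return true;
  }

  // Pulls operands of a same-operator child up into the parent, recursively.
  ExprSketch flatten() {
    if (isLeaf()) {
      return this;
    }
    List<ExprSketch> out = new ArrayList<>();
    for (ExprSketch e : operands) {
      ExprSketch f = e.flatten();
      if (!f.isLeaf() && f.op.equals(op)) {
        out.addAll(f.operands);
      } else {
        out.add(f);
      }
    }
    return new ExprSketch(op, out);
  }

  @Override public String toString() {
    if (isLeaf()) {
      return op;
    }
    List<String> parts = new ArrayList<>();
    for (ExprSketch e : operands) {
      parts.add(e.toString());
    }
    return op + "(" + String.join(", ", parts) + ")";
  }
}
```

An expression shaped like the one in the trace, `AND(a, OR(OR(b, c), d))`, fails `isFlat()` and becomes `AND(a, OR(b, c, d))` after `flatten()`.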





[jira] [Commented] (CALCITE-3976) Generify the DefaultEdge class

2020-05-09 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103608#comment-17103608
 ] 

Julian Hyde commented on CALCITE-3976:
--

Can you attach the PR?

> Generify the DefaultEdge class
> --
>
> Key: CALCITE-3976
> URL: https://issues.apache.org/jira/browse/CALCITE-3976
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, the {{source}} and {{target}} fields of class {{DefaultEdge}} are of 
> type {{Object}}. This makes some casts necessary in the code base. In 
> addition, it does not enforce the assertion that in a graph, the vertices 
> have the same type as the sources and targets of the edges. 
> To solve the problem, we generify the DefaultEdge class with the type of the 
> source/target vertices.
> The benefits of generification include type safety: the above assertion can 
> be enforced by the generified class. It also gives the compiler an 
> opportunity to detect type-related problems at compilation time. Without 
> generification, some problems can only be detected at runtime, when a 
> ClassCastException is thrown. 
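
A minimal sketch of what the generified edge could look like follows. `EdgeSketch` and `EdgeSketchDemo` are hypothetical names; the real DefaultEdge may have a different constructor and additional members.

```java
// Simplified generic edge: V is the vertex type shared by source and target.
final class EdgeSketch<V> {
  final V source;
  final V target;

  EdgeSketch(V source, V target) {
    this.source = source;
    this.target = target;
  }
}

final class EdgeSketchDemo {
  static String describe(EdgeSketch<String> edge) {
    // No (String) casts are needed: the compiler already knows the vertex type,
    // and an EdgeSketch<Integer> passed here would be rejected at compile time.
    return edge.source.toUpperCase() + "->" + edge.target.toUpperCase();
  }
}
```

With `Object` fields, the same method would need two casts, and a wrongly typed vertex would only surface as a ClassCastException at runtime.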





[jira] [Resolved] (CALCITE-3961) VolcanoPlanner.prunedNodes information is lost when duplicate relNode is discarded

2020-05-09 Thread Haisheng Yuan (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haisheng Yuan resolved CALCITE-3961.

Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/2a4779f478fea75c1a7b075b8da50b20b6fda9bb],
 thanks for the PR, [~botong]!



[jira] [Updated] (CALCITE-3788) SqlValidatorImpl.registerOperandSubQueries should skip creating SCALAR_QUERY call when operand is a SqlSelect cause the SqlSelect does not return a scalar value

2020-05-09 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated CALCITE-3788:
--
Summary: SqlValidatorImpl.registerOperandSubQueries should skip creating 
SCALAR_QUERY call when operand is a SqlSelect cause the SqlSelect does not 
return a scalar value  (was: SqlValidatorImpl.registerOperandSubQueries should 
skip creating SCALAR_QUERY call when operand is a SqlSelect and the SqlSelect 
does not return a scalar value)

> SqlValidatorImpl.registerOperandSubQueries should skip creating SCALAR_QUERY 
> call when operand is a SqlSelect cause the SqlSelect does not return a scalar 
> value
> 
>
> Key: CALCITE-3788
> URL: https://issues.apache.org/jira/browse/CALCITE-3788
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Priority: Major
>
> For a table function which uses named argument for a TABLE parameter:
> {code:sql}
> Select * From
> TABLE(TUMBLE(
>data =>  TABLE orders
>...
> )
> {code}
> The TABLE parameter will be wrapped by a SCALAR_QUERY call at this line: 
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L3067
> However, this is wrong because the TABLE parameter is not a query that returns a 
> scalar value.
> It cannot be solved by overriding SqlOperator.argumentMustBeScalar because 
> the named argument is a special operator that isn't tied to other operators.
> One possible resolution is to also check whether the operand is a SqlSelect at 
> SqlValidatorImpl.java#L3067.





[jira] [Commented] (CALCITE-3982) FilterMergeRule can lead to AssertionError

2020-05-09 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103589#comment-17103589
 ] 

Jesus Camacho Rodriguez commented on CALCITE-3982:
--

It seems this issue appears in Hive because we override some of the methods in 
the builder to avoid calling expression simplification when we create a filter.

I have not been able to reproduce the issue in Calcite. However, it seems to me 
that using the RexProgram in FilterMergeRule is unnecessary in any case 
([~zabetak] pointed this out in the Hive PR)? I have created a patch that 
removes the usage of the program and relies solely on the builder, and I could 
not find any test regression: https://github.com/apache/calcite/pull/1968 .



[jira] [Assigned] (CALCITE-3982) FilterMergeRule can lead to AssertionError

2020-05-09 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned CALCITE-3982:


Assignee: Jesus Camacho Rodriguez



[jira] [Assigned] (CALCITE-3985) Simplify grouped window function in parser

2020-05-09 Thread Rui Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang reassigned CALCITE-3985:
-

Assignee: Rui Wang



[jira] [Created] (CALCITE-3985) Simplify grouped window function in parser

2020-05-09 Thread Rui Wang (Jira)
Rui Wang created CALCITE-3985:
-

 Summary: Simplify grouped window function in parser
 Key: CALCITE-3985
 URL: https://issues.apache.org/jira/browse/CALCITE-3985
 Project: Calcite
  Issue Type: Sub-task
Reporter: Rui Wang


Currently in parser, there is [1]:


{code:java}
SqlCall GroupByWindowingCall():
{
    final Span s;
    final List<SqlNode> args;
    final SqlOperator op;
}
{
    (
        <TUMBLE>
        {
            s = span();
            op = SqlStdOperatorTable.TUMBLE_OLD;
        }
    |
        <HOP>
        {
            s = span();
            op = SqlStdOperatorTable.HOP_OLD;
        }
    |
        <SESSION>
        {
            s = span();
            op = SqlStdOperatorTable.SESSION_OLD;
        }
    )
    args = UnquantifiedFunctionParameterList(ExprContext.ACCEPT_SUB_QUERY) {
        return op.createCall(s.end(this), args);
    }
}
{code}

The s=span() are duplicates and there could be a way to keep only one s=span().

[1]: 
https://github.com/apache/calcite/blob/master/core/src/main/codegen/templates/Parser.jj#L6049






[jira] [Resolved] (CALCITE-3780) SESSION Table-valued Function

2020-05-09 Thread Feng Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Zhu resolved CALCITE-3780.
---
Fix Version/s: 1.23.0
   Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/890eb61ef486e2192110cefe4cac5aa6f150],
 thanks for your work [~amaliujia]!

> SESSION Table-valued Function
> -
>
> Key: CALCITE-3780
> URL: https://issues.apache.org/jira/browse/CALCITE-3780
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: 1.23.0
>
>
> We can create SESSION table-valued function to replace GROUP BY SESSION for 
> inactive gap session functionality:
> {code:sql}
> SELECT *
> FROM TABLE SESSION (
>   data => TABLE Bid ,
>   timecol => DESCRIPTOR ( bidtime ) ,
>   keycol => DESCRIPTOR(key),
>   inactive_gap => INTERVAL '10' MINUTES )
> {code}
>  
>  





[jira] [Comment Edited] (CALCITE-3737) HOP Table-valued Function

2020-05-09 Thread Feng Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103283#comment-17103283
 ] 

Feng Zhu edited comment on CALCITE-3737 at 5/9/20, 12:26 PM:
-

Fixed via 
[https://github.com/apache/calcite/commit/40e588de5f999034e5030b12cdbc90f4073808fe],
 thanks for your PR [~amaliujia]!


was (Author: donnyzone):
Fixed via 
[https://github.com/apache/calcite/commit/890eb61ef486e2192110cefe4cac5aa6f150],
 thanks for your PR [~amaliujia]!

> HOP Table-valued Function
> -
>
> Key: CALCITE-3737
> URL: https://issues.apache.org/jira/browse/CALCITE-3737
> Project: Calcite
>  Issue Type: Sub-task
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.23.0
>
>  Time Spent: 20h
>  Remaining Estimate: 0h
>
> Hopping windows place intervals of a fixed size evenly spaced across event 
> time. Most importantly, in the most common use a given event time timestamp 
> will generally fall into more than one window.
> The table-valued function Hop may produce zero, one, or multiple rows 
> corresponding to each row of input.  Hop takes four required parameters and 
> one optional parameter. All parameters are analogous to those for Tumble 
> except for hopsize, which specifies the duration between the starting points 
> (and endpoints) of the hopping windows, allowing for overlapping windows 
> (hopsize < dur, common) or gaps in the data (hopsize > dur, rarely useful).
> {code:java}
> Hop (data , timecol , dur, hopsize)
> {code}
> The return value of Hop is a relation that includes all columns of data as 
> well as additional event time columns wstart and wend. Here is an example 
> (from https://s.apache.org/streaming-beam-sql ):
> {code:sql}
> SELECT *
>   FROM Hop (
> data=> TABLE Bids ,
> timecol => DESCRIPTOR ( bidtime ) ,
> dur => INTERVAL '10' MINUTES ,
> hopsize => INTERVAL '5' MINUTES );
> ------------------------------------------
> | wstart | wend | bidtime | price | item |
> ------------------------------------------
> | 8:00   | 8:10 | 8:07    | $2    | A    |
> | 8:05   | 8:15 | 8:07    | $2    | A    |
> | 8:05   | 8:15 | 8:11    | $3    | B    |
> | 8:10   | 8:20 | 8:11    | $3    | B    |
> | 8:00   | 8:10 | 8:05    | $4    | C    |
> | 8:05   | 8:15 | 8:05    | $4    | C    |
> | 8:00   | 8:10 | 8:09    | $5    | D    |
> | 8:05   | 8:15 | 8:09    | $5    | D    |
> | 8:05   | 8:15 | 8:13    | $1    | E    |
> | 8:10   | 8:20 | 8:13    | $1    | E    |
> | 8:10   | 8:20 | 8:17    | $6    | F    |
> | 8:15   | 8:25 | 8:17    | $6    | F    |
> ------------------------------------------
> {code}
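
The window assignment shown in the table can be reproduced with a short sketch. It assumes that times are minutes since midnight, that the timestamp is non-negative, and that windows start at multiples of hopsize (which is consistent with the table but not stated in the description). `HopWindowSketch` and `windowsFor` are hypothetical names, not part of Calcite.

```java
import java.util.ArrayList;
import java.util.List;

final class HopWindowSketch {
  // Returns {wstart, wend} pairs for every window [wstart, wend) containing ts,
  // ordered by wstart. All values are minutes since midnight.
  static List<int[]> windowsFor(int ts, int dur, int hopsize) {
    List<int[]> windows = new ArrayList<>();
    // Latest aligned window start that is not after ts.
    int lastStart = (ts / hopsize) * hopsize;
    // Walk backwards while the window still covers ts, i.e. start + dur > ts.
    for (int start = lastStart; start > ts - dur; start -= hopsize) {
      windows.add(0, new int[] {start, start + dur});
    }
    return windows;
  }
}
```

For bidtime 8:07 with dur of 10 minutes and hopsize of 5 minutes, this yields the windows 8:00 to 8:10 and 8:05 to 8:15, matching the first two rows of the table. When hopsize exceeds dur, a timestamp that falls in a gap yields no windows, which corresponds to the "zero rows" case mentioned in the description.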





[jira] [Resolved] (CALCITE-3737) HOP Table-valued Function

2020-05-09 Thread Feng Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Zhu resolved CALCITE-3737.
---
Resolution: Fixed

Fixed via 
[https://github.com/apache/calcite/commit/890eb61ef486e2192110cefe4cac5aa6f150],
 thanks for your PR [~amaliujia]!



[jira] [Resolved] (CALCITE-3866) "numeric field overflow" when running the generated SQL in PostgreSQL

2020-05-09 Thread Feng Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Zhu resolved CALCITE-3866.
---
Resolution: Fixed

Fixed via 
[https://github.com/apache/calcite/commit/e081c5b4227a57defe47246d8ff3e6f7cce838e4],
 thanks for your PR [~winipanda]!

> "numeric field overflow" when running the generated SQL in PostgreSQL
> -
>
> Key: CALCITE-3866
> URL: https://issues.apache.org/jira/browse/CALCITE-3866
> Project: Calcite
>  Issue Type: Bug
>Reporter: TANG Wen-hui
>Assignee: TANG Wen-hui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.23.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When we try to generate a sql after applying 
> AggregateJoinTransposeRule.EXTENDED, the result sql can not run on 
> PostgreSQL, and throws the following exception:
> {code:java}
> PSQLException: ERROR: numeric field overflow Detail: A field with precision 
> 7, scale 2 must round to an absolute value less than 10^5.
> {code}
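
The bound in the error message follows from the declared type: a DECIMAL with precision 7 and scale 2 has 7 - 2 = 5 integer digits, so values must stay below 10^5 in absolute value after rounding. A small sketch of that check follows; `DecimalFitSketch` is a hypothetical helper, it assumes precision >= scale, and its use of half-up rounding is an assumption about how the database rounds.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class DecimalFitSketch {
  // True if value, rounded to 'scale' fractional digits, has an absolute
  // value below 10^(precision - scale).
  static boolean fits(BigDecimal value, int precision, int scale) {
    BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
    BigDecimal limit = BigDecimal.TEN.pow(precision - scale);
    return rounded.abs().compareTo(limit) < 0;
  }
}
```

For example, 99999.99 fits DECIMAL(7, 2), while 100000.00 does not, and neither does 99999.999 because it rounds up to 100000.00.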
> I found that the main reason is that the return type of sum may have the 
> wrong precision when its operand is a decimal with an explicit precision, 
> for example:
> {code:java}
> @Test public void testSum() {
>   String query =
>       "select sum(e1.\"store_sales\"), sum(e2.\"store_sales\") from \"sales_fact_dec_1998\" as "
>       + "e1 , \"sales_fact_dec_1998\" as e2 where e1.\"product_id\" = e2.\"product_id\"";
>   String expect = "";
>   HepProgramBuilder builder = new HepProgramBuilder();
>   builder.addRuleClass(FilterJoinRule.class);
>   builder.addRuleClass(AggregateProjectMergeRule.class);
>   builder.addRuleClass(AggregateJoinTransposeRule.class);
>   HepPlanner hepPlanner = new HepPlanner(builder.build());
>   RuleSet rules = RuleSets.ofList(FilterJoinRule.FILTER_ON_JOIN, FilterJoinRule.JOIN,
>       AggregateProjectMergeRule.INSTANCE,
>       AggregateJoinTransposeRule.EXTENDED);
>   sql(query).withPostgresql().optimize(rules, hepPlanner).ok(expect);
> }
> {code}
> The SQL generated for the query is:
> {code:java}
> SELECT SUM(CAST(\"t\".\"EXPR$0\" * \"t0\".\"$f1\" AS DECIMAL(10, 4))), 
> SUM(CAST(\"t\".\"$f2\" * \"t0\".\"EXPR$1\" AS DECIMAL(10, 4)))
> FROM (SELECT \"product_id\", SUM(\"store_sales\") AS \"EXPR$0\", COUNT(*) AS 
> \"$f2\"
> FROM \"foodmart\".\"sales_fact_dec_1998\"
> GROUP BY \"product_id\") AS \"t\"
> INNER JOIN (SELECT \"product_id\", COUNT(*) AS \"$f1\", SUM(\"store_sales\") 
> AS \"EXPR$1\"
> FROM \"foodmart\".\"sales_fact_dec_1998\"
> GROUP BY \"product_id\") AS \"t0\" ON \"t\".\"product_id\" = 
> \"t0\".\"product_id\"
> {code}
> AggregateJoinTransposeRule.EXTENDED generates an Aggregate to sum up the 
> sub-totals:
> {code:java}
> // Aggregate above to sum up the sub-totals
> final List<AggregateCall> newAggCalls = new ArrayList<>();
> final int groupCount = aggregate.getGroupCount();
> final int newLeftWidth = sides.get(0).newInput.getRowType().getFieldCount();
> final List<RexNode> projects =
>     new ArrayList<>(
>         rexBuilder.identityProjects(relBuilder.peek().getRowType()));
> for (Ord<AggregateCall> aggCall : Ord.zip(aggregate.getAggCallList())) {
>   final SqlAggFunction aggregation = aggCall.e.getAggregation();
>   final SqlSplittableAggFunction splitter =
>   Objects.requireNonNull(
>   aggregation.unwrap(SqlSplittableAggFunction.class));
>   final Integer leftSubTotal = sides.get(0).split.get(aggCall.i);
>   final Integer rightSubTotal = sides.get(1).split.get(aggCall.i);
>   newAggCalls.add(
>   splitter.topSplit(rexBuilder, registry(projects),
>   groupCount, relBuilder.peek().getRowType(), aggCall.e,
>   leftSubTotal == null ? -1 : leftSubTotal,
>   rightSubTotal == null ? -1 : rightSubTotal + newLeftWidth));
> }
> public AggregateCall topSplit(RexBuilder rexBuilder,
>     Registry<RexNode> extra, int offset, RelDataType inputRowType,
>     AggregateCall aggregateCall, int leftSubTotal, int rightSubTotal) {
>   final List<RexNode> merges = new ArrayList<>();
>   final List<RelDataTypeField> fieldList = inputRowType.getFieldList();
>   if (leftSubTotal >= 0) {
> final RelDataType type = fieldList.get(leftSubTotal).getType();
> merges.add(rexBuilder.makeInputRef(type, leftSubTotal));
>   }
>   if (rightSubTotal >= 0) {
> final RelDataType type = fieldList.get(rightSubTotal).getType();
> merges.add(rexBuilder.makeInputRef(type, rightSubTotal));
>   }
>   RexNode node;
>   switch (merges.size()) {
>   case 1:
> node = merges.get(0);
> break;
>   case 2:
> node = rexBuilder.makeCall(SqlStdOperatorTable.MULTIPLY, merges);
> node = rexBuilder.makeAbstractCast(aggregateCall.type, node);
> break;
>   default:
> throw new AssertionError("unexpected count " + merges);
>   }
>   int ordinal = extra.register(node);
>   return 

[jira] [Updated] (CALCITE-3866) "numeric field overflow" when running the generated SQL in PostgreSQL

2020-05-09 Thread Feng Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Zhu updated CALCITE-3866:
--
Summary: "numeric field overflow" when running the generated SQL in 
PostgreSQL  (was:  ReturnTypes.AGG_SUM may cause "numeric field overflow" on 
PostgreSQL when generate the sql after using the rule 
AggregateJoinTransposeRule.EXTENDED.)


[jira] [Created] (CALCITE-3984) Support exchange operator in RelFieldTrimmer

2020-05-09 Thread xzh_dz (Jira)
xzh_dz created CALCITE-3984:
---

 Summary: Support exchange operator in RelFieldTrimmer
 Key: CALCITE-3984
 URL: https://issues.apache.org/jira/browse/CALCITE-3984
 Project: Calcite
  Issue Type: Wish
Reporter: xzh_dz


RelFieldTrimmer currently does not trim unused fields through an exchange 
operator. For example:
{code:java}
final RelBuilder builder = RelBuilder.create(config().build());
final RelNode root =
    builder.scan("EMP")
        .project(builder.field("EMPNO"), builder.field("ENAME"),
            builder.field("DEPTNO"))
        .exchange(RelDistributions.hash(Lists.newArrayList(1)))
        .project(builder.field("EMPNO"), builder.field("ENAME"))
        .build();
{code}
The current plan for root:
{code:java}
LogicalProject(EMPNO=[$0], ENAME=[$1])
  LogicalExchange(distribution=[hash[1]])
    LogicalProject(EMPNO=[$0], ENAME=[$1], DEPTNO=[$7])
      LogicalTableScan(table=[[scott, EMP]])
{code}
The expected result is:
{code:java}
LogicalExchange(distribution=[hash[1]])
  LogicalProject(EMPNO=[$0], ENAME=[$1])
    LogicalTableScan(table=[[scott, EMP]])
{code}
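
The index bookkeeping behind the expected plan can be sketched in plain Java. This is only an illustration under assumptions: ExchangeTrimSketch, fieldsToKeep and remapKeys are hypothetical names, not part of Calcite or RelFieldTrimmer. The idea is that the input of the exchange must keep every field used above it plus every distribution key, and the keys are then renumbered to their positions in the trimmed input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper, not part of Calcite: it models only the field-index
// arithmetic, not the RelNode rewriting itself.
final class ExchangeTrimSketch {
  private ExchangeTrimSketch() {}

  /** Fields the exchange input must keep: those used above plus the distribution keys. */
  static List<Integer> fieldsToKeep(List<Integer> usedAbove, List<Integer> distKeys) {
    TreeSet<Integer> keep = new TreeSet<>(usedAbove);
    keep.addAll(distKeys);
    return new ArrayList<>(keep); // ascending order of original field index
  }

  /** Rewrites each distribution key to its position in the trimmed input. */
  static List<Integer> remapKeys(List<Integer> kept, List<Integer> distKeys) {
    List<Integer> remapped = new ArrayList<>();
    for (Integer key : distKeys) {
      remapped.add(kept.indexOf(key));
    }
    return remapped;
  }

  public static void main(String[] args) {
    // Example from the issue: the top project uses fields 0 and 1, the hash key is field 1.
    List<Integer> kept = fieldsToKeep(Arrays.asList(0, 1), Arrays.asList(1));
    System.out.println("kept=" + kept + " keys=" + remapKeys(kept, Arrays.asList(1)));
  }
}
```

In the example above, DEPTNO (field 2 of the inner project) is neither used above the exchange nor a distribution key, so it can be dropped and the key stays at index 1. If the distribution were on DEPTNO instead, that field would have to be kept.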




[jira] [Commented] (CALCITE-3956) Unify comparison logic for RelOptCost

2020-05-09 Thread Liya Fan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103185#comment-17103185
 ] 

Liya Fan commented on CALCITE-3956:
---

Thanks a lot for your feedback.

> I'm dubious about this change: Like Julian, I'm not sure the actual code use 
> RelOptCost would change much BUT at the same time, adding a compareCost 
> method might let people think that there's a total ordering where there is 
> none.

We have explicitly stated in the JavaDoc that {{compareCost}} is a partial 
order. In addition, the return type is not int, so client code cannot 
mistakenly use it in scenarios where a total order is expected. 

> (The change also breaks API for those projects which implements their own 
> RelOptCost)
This is a problem, so we need to consider 1) whether there is a better way, and 
2) whether the benefits justify the effort. 
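
To make the partial-order point concrete, here is a minimal sketch of a single comparison routine with a non-int result. All names (CostOrder, CostSketch, the rows and cpu fields) are hypothetical and chosen for illustration; this is not the RelOptCost API or the code in the PR.

```java
// Hypothetical names throughout; this is not Calcite's RelOptCost API.
enum CostOrder { LESS, EQUAL, GREATER, INCOMPARABLE }

final class CostSketch {
  final double rows;
  final double cpu;

  CostSketch(double rows, double cpu) {
    this.rows = rows;
    this.cpu = cpu;
  }

  /** The only place where comparison logic lives: a componentwise partial order. */
  CostOrder compareCost(CostSketch other) {
    boolean le = rows <= other.rows && cpu <= other.cpu;
    boolean ge = rows >= other.rows && cpu >= other.cpu;
    if (le && ge) {
      return CostOrder.EQUAL;
    }
    if (le) {
      return CostOrder.LESS;
    }
    if (ge) {
      return CostOrder.GREATER;
    }
    // One component is smaller and the other larger: no order between the two costs.
    return CostOrder.INCOMPARABLE;
  }

  /** Derived from compareCost, so it cannot contradict isLt or equality. */
  boolean isLe(CostSketch other) {
    CostOrder order = compareCost(other);
    return order == CostOrder.LESS || order == CostOrder.EQUAL;
  }

  boolean isLt(CostSketch other) {
    return compareCost(other) == CostOrder.LESS;
  }
}
```

Because the result is an enum and not an int, it cannot be passed where a Comparator result is expected, and the INCOMPARABLE case has to be handled explicitly by callers.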

> Unify comparison logic for RelOptCost
> -
>
> Key: CALCITE-3956
> URL: https://issues.apache.org/jira/browse/CALCITE-3956
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, comparisons between RelOptCost objects are based on 3 methods:
> 1. {{boolean isLe(RelOptCost cost)}}
> 2. {{boolean isLt(RelOptCost cost)}}
> 3. {{boolean equals(RelOptCost cost)}}
> The 3 methods used in combination determine the relation between RelOptCost 
> objects. 
> There are some problems with this implementation:
> 1. Some logic is duplicated across the above methods, making it difficult to 
> maintain. 
> 2. To determine the relation between RelOptCost objects, we often need to 
> call more than one comparison method, leading to performance overhead.
> 3. Since the logic is spread across multiple methods, it is easy to end up 
> with contradictory comparison logic, which will surprise users. For example, 
> the following assertion should hold according to common sense:
> {{if a >= b, then a > b or a == b}}
> However, with the current implementation of {{VolcanoCost}}, we can easily 
> create instances that violate the above assertion. 
> To solve these problems, we want to make {{RelOptCost}} extend 
> {{Comparable}}, so that the comparison logic is unified in the 
> {{compareTo}} method. 





[jira] [Commented] (CALCITE-3976) Generify the DefaultEdge class

2020-05-09 Thread Liya Fan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17103166#comment-17103166
 ] 

Liya Fan commented on CALCITE-3976:
---

[~julianhyde] Sounds good. Thanks for your feedback. I have revised the PR 
accordingly. 

> Generify the DefaultEdge class
> --
>
> Key: CALCITE-3976
> URL: https://issues.apache.org/jira/browse/CALCITE-3976
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Liya Fan
>Assignee: Liya Fan
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, the {{source}} and {{target}} fields of class {{DefaultEdge}} are 
> of type {{Object}}. This makes some casts necessary in the code base. In 
> addition, it does not enforce the assertion that, in a graph, the vertices 
> have the same type as the sources and targets of the edges. 
> To solve the problem, we generify the DefaultEdge class with the type of the 
> source/target vertices.
> The benefits of generification include type safety: the above assertion can 
> be enforced by the generified class. It also gives the compiler an 
> opportunity to detect type-related problems at compilation time. Without 
> generification, some problems can only be detected at runtime, when a 
> ClassCastException is thrown. 
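
As an illustration of the idea, a generified edge could look roughly like the sketch below. EdgeSketch and GraphSketch are hypothetical classes written for this example; they are not Calcite's DefaultEdge or its graph classes, and the actual PR may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration; not Calcite's DefaultEdge or graph API.
final class EdgeSketch<V> {
  final V source;
  final V target;

  EdgeSketch(V source, V target) {
    this.source = source;
    this.target = target;
  }
}

final class GraphSketch<V> {
  private final List<EdgeSketch<V>> edges = new ArrayList<>();

  /** The compiler rejects vertices whose type does not match the graph's vertex type. */
  void addEdge(V source, V target) {
    edges.add(new EdgeSketch<>(source, target));
  }

  /** No cast is needed here because the edge already carries the vertex type. */
  List<V> targetsOf(V source) {
    List<V> targets = new ArrayList<>();
    for (EdgeSketch<V> edge : edges) {
      if (edge.source.equals(source)) {
        targets.add(edge.target);
      }
    }
    return targets;
  }
}
```

With a GraphSketch<String>, a call such as addEdge("a", 1) fails to compile, whereas with Object-typed fields the mismatch would only surface later as a ClassCastException at a cast site.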


