[jira] [Commented] (CALCITE-3197) Convert data of Timestamp/Time/Date as original form when enumerating from ArrayTable

2019-07-15 Thread jin xing (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885816#comment-16885816
 ] 

jin xing commented on CALCITE-3197:
---

[~julianhyde]
Thanks a lot for the comment. 
“The idea that a representation knows what it really should be seems very 
wrong” -- this really helps me understand the design.

I will refine my change according to your comment.

> Convert data of Timestamp/Time/Date as original form when enumerating from 
> ArrayTable
> -
>
> Key: CALCITE-3197
> URL: https://issues.apache.org/jira/browse/CALCITE-3197
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: jin xing
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the current implementation of ColumnLoader, data of 
> {{Rep.JAVA_SQL_TIMESTAMP/Rep.JAVA_SQL_TIME/Rep.JAVA_SQL_DATE}} is converted 
> to numeric form during loading. 
> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/adapter/clone/ColumnLoader.java#L234
> But the current code seems to forget to convert the data back to its original 
> form when enumerating.
> As a result, the test below currently fails:
> {code:java}
> // MaterializationTest.java
> @Test public void testTimestampType() {
>   String sql = "select \"eventid\", \"ts\"\n"
>       + "from \"events\"\n"
>       + "where \"eventid\" > 5";
>   checkMaterialize(sql, sql);
> }{code}
> For the types {{Rep.JAVA_SQL_TIMESTAMP/Rep.JAVA_SQL_TIME/Rep.JAVA_SQL_DATE}}, 
> the cursor accesses values via {{TimestampAccessor/TimeAccessor/DateAccessor}}, 
> which expect the column value to be a {{Timestamp/Time/Date}}.
> It makes sense to 'unwrap' the data to its original form when enumerating from 
> {{ArrayTable}}.
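
The round trip described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class TemporalRoundTrip and its methods are hypothetical and are not part of Calcite, and the real ColumnLoader may use a different numeric encoding (for example one that adjusts for time zones). The sketch just shows the two halves that have to agree: a numeric form on load, and the original java.sql type on enumeration.

```java
import java.sql.Time;
import java.sql.Timestamp;

/** Hypothetical helper (not Calcite code): stores temporal values as longs and restores them. */
final class TemporalRoundTrip {
  private TemporalRoundTrip() {}

  /** Loading side: keep only the epoch-millisecond value (sub-millisecond nanos are dropped). */
  static long toNumeric(java.util.Date value) {
    return value.getTime();
  }

  /** Enumerating side: rebuild the Timestamp that a timestamp accessor would expect. */
  static Timestamp toTimestamp(long millis) {
    return new Timestamp(millis);
  }

  /** Enumerating side: rebuild the Time that a time accessor would expect. */
  static Time toTime(long millis) {
    return new Time(millis);
  }
}
```

If the enumerating side returned the raw long instead, an accessor that casts the column value to Timestamp would fail with a ClassCastException, which matches the kind of failure the issue describes.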



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (CALCITE-3186) IN expressions in UPDATE statements throws Exceptions

2019-07-15 Thread Danny Chan (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Danny Chan updated CALCITE-3186:

Summary: IN expressions in UPDATE statements throws Exceptions  (was: IN 
expressions in UPDATE statements throwing Exceptions)

> IN expressions in UPDATE statements throws Exceptions
> -
>
> Key: CALCITE-3186
> URL: https://issues.apache.org/jira/browse/CALCITE-3186
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.20.0
>Reporter: Pressenna
>Priority: Major
>
> The patch to get correlated sub-queries working in UPDATE statements had this 
> side-effect.
>  
> {code:java}
> CREATE TABLE t1 (id1 INT, val1 TEXT);
> CREATE TABLE t2 (id2 INT, val2 TEXT);
> UPDATE t1 SET val1 = 't2' WHERE id1 IN (1, 2, 3);
> -- or
> UPDATE t1 SET val1 = 't2' WHERE id1 IN (SELECT id2 FROM t2);{code}
>  
> These seem to raise exceptions now.





[jira] [Commented] (CALCITE-3122) Convert Pig Latin scripts into Calcite logical plan

2019-07-15 Thread Khai Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885562#comment-16885562
 ] 

Khai Tran commented on CALCITE-3122:


I don't need to. But the planner will complain that the current RelNode belongs to 
a cluster with a different planner, and that throws an exception.

> Convert Pig Latin scripts into Calcite logical plan 
> 
>
> Key: CALCITE-3122
> URL: https://issues.apache.org/jira/browse/CALCITE-3122
> Project: Calcite
>  Issue Type: New Feature
>  Components: core, piglet
>Reporter: Khai Tran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.21.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> We created an internal Calcite repo at LinkedIn and developed APIs to parse any 
> Pig Latin script into a Calcite logical plan. The code was tested on nearly 
> 1000 Pig scripts written at LinkedIn.
> Changes:
> 1. piglet: the main conversion code lives there, including:
>  * APIs to convert any Pig script into RelNode plans or SQL statements
>  * Use the Pig Grunt parser to parse Pig Latin scripts into Pig logical plans 
> (DAGs)
>  * Convert Pig schemas into RelDataType
>  * Traverse the Pig expression plan and convert Pig expressions into 
> RexNodes
>  * Map some basic Pig UDFs to Calcite SQL operators
>  * Build Calcite UDFs for any other Pig UDFs, including UDFs written in both 
> Java and Python
>  * Traverse (DFS) the Pig logical plan to convert each Pig logical node 
> to a RelNode
>  * Have an optimizer rule to optimize Pig group/cogroup into Aggregate 
> operators
> 2. core:
>  * Implement other RelNodes in Rel2Sql so that Pig can be translated into SQL
>  * Other minor changes in a few other classes to make Pig-to-Calcite conversion work





[jira] [Commented] (CALCITE-3178) RexSimplify.simplifyOrTerms slow with large OR filters

2019-07-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885556#comment-16885556
 ] 

Julian Hyde commented on CALCITE-3178:
--

The ideal fix, in my view, would be to make the simplify algorithm O(n log n). 
But short of that, a threshold, above which we do not attempt to simplify OR.

The test would be to create a large IN clause programmatically (say 20,000 
elements), set the threshold to say 100, and assume that if the threshold stops 
working we'd notice the test running slowly. Because the test exists someone 
would be able to poke at it and try to do the O(n log n) fix in future.
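
A minimal sketch of the threshold idea, assuming nothing about the real RexSimplify internals: the class OrSimplifySketch, the constant DEFAULT_THRESHOLD and the use of strings as stand-ins for RexNode terms are all hypothetical. De-duplication stands in for whatever (possibly quadratic) simplification the real code performs; the point is only the guard in front of it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical sketch (not Calcite code) of a size threshold in front of OR simplification. */
final class OrSimplifySketch {
  /** Illustrative default; the comment above suggests a value such as 100 for the test. */
  static final int DEFAULT_THRESHOLD = 100;

  private OrSimplifySketch() {}

  /**
   * Returns the OR terms unchanged when there are more than {@code threshold} of them;
   * otherwise applies a stand-in simplification (removing duplicate terms, keeping order).
   */
  static List<String> simplifyOrTerms(List<String> terms, int threshold) {
    if (terms.size() > threshold) {
      return terms; // too many terms: skip simplification rather than pay a super-linear cost
    }
    return new ArrayList<>(new LinkedHashSet<>(terms));
  }
}
```

A test along the lines suggested above would build a list of roughly 20,000 terms, call the method with a threshold of 100, and rely on the guard to keep the run time small.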

> RexSimplify.simplifyOrTerms slow with large OR filters
> --
>
> Key: CALCITE-3178
> URL: https://issues.apache.org/jira/browse/CALCITE-3178
> Project: Calcite
>  Issue Type: Improvement
>Affects Versions: 1.19.0
>Reporter: Gian Merlino
>Priority: Major
>
> In particular, once for each subpredicate within the OR, 
> RexSimplify.simplifyOrTerms calls {{simplify.predicates.union}} and adds the 
> freshly-unioned result to {{simplify.predicates}}. The most time-consuming 
> part of this seems to be {{RexUtil.predicateConstants}}, which re-examines 
> each previously-added entry. This is O(N^2) in the number of subpredicates 
> within the OR.
> I discovered this when someone tried to run a query with a 14,000-element IN 
> filter, and planning took about 45 seconds. In Druid, we always convert INs 
> to ORs, never allowing Calcite's subquery conversion to happen. This is 
> because as far as native Druid queries are concerned, a huge OR is going to 
> be more efficient than a join against a constant subquery.
> I'm not sure what the best way is to fix this. The only thing that comes to 
> mind immediately is the "quick fix" of limiting how many OR elements 
> RexSimplify might attempt to simplify at once (and potentially AND as well? I 
> haven't looked into that one.)





[jira] [Commented] (CALCITE-3187) Derive all decimal return type through type factory

2019-07-15 Thread Laurent Goujon (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885548#comment-16885548
 ] 

Laurent Goujon commented on CALCITE-3187:
-

Will do.

> Derive all decimal return type through type factory
> ---
>
> Key: CALCITE-3187
> URL: https://issues.apache.org/jira/browse/CALCITE-3187
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Praveen Kumar Desabandu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Currently, decimal product and quotient return types are derived through the type 
> factory, which allows clients to override the return type if they so desire.
> But decimal sum is embedded in return types, and decimal mod does not have a 
> return type inference defined.
> This task is to derive all of the return types through the type factory, so that 
> clients can override them if they wish.





[jira] [Commented] (CALCITE-3113) Equivalent MutableAggregates with different row types fail with AssertionError

2019-07-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885546#comment-16885546
 ] 

Julian Hyde commented on CALCITE-3113:
--

It seems necessary that the before and after row types have the same number and 
type of fields, not necessarily the same names. There's probably a check for 
that elsewhere in the code, since it's a common requirement.

I see that the PR has assert statements inside for loops. The goal is to avoid 
the effort of checking if asserts are disabled. Therefore the pattern {{assert 
xxxIsValid(x, Litmus.THROW)}}, where the method {{xxxIsValid()}} does the for 
loops, is a good and efficient one.
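
The pattern can be sketched without depending on Calcite. In the sketch below, MiniLitmus is a hypothetical, simplified stand-in for Calcite's Litmus, and RowTypeCheck with its string-typed fields is invented purely for illustration. The loop lives inside the validity method, so when assertions are disabled the whole call, loop included, is skipped.

```java
import java.util.List;

/** Hypothetical, simplified stand-in for a Litmus-style callback (not the Calcite interface). */
interface MiniLitmus {
  boolean fail(String message);
  boolean succeed();

  /** Throws on failure; suitable inside an assert. */
  MiniLitmus THROW = new MiniLitmus() {
    public boolean fail(String message) { throw new AssertionError(message); }
    public boolean succeed() { return true; }
  };

  /** Returns false on failure; suitable when the caller just wants a boolean. */
  MiniLitmus IGNORE = new MiniLitmus() {
    public boolean fail(String message) { return false; }
    public boolean succeed() { return true; }
  };
}

/** Hypothetical check: same number and type of fields, names not compared. */
final class RowTypeCheck {
  private RowTypeCheck() {}

  static boolean fieldTypesMatch(List<String> before, List<String> after, MiniLitmus litmus) {
    if (before.size() != after.size()) {
      return litmus.fail("field count mismatch: " + before.size() + " vs " + after.size());
    }
    for (int i = 0; i < before.size(); i++) {
      if (!before.get(i).equals(after.get(i))) {
        return litmus.fail("type mismatch at field " + i);
      }
    }
    return litmus.succeed();
  }

  /** Call site: the loops above run only when assertions are enabled. */
  static void replace(List<String> before, List<String> after) {
    assert fieldTypesMatch(before, after, MiniLitmus.THROW);
    // ... perform the replacement ...
  }
}
```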

> Equivalent MutableAggregates with different row types fail with AssertionError
> --
>
> Key: CALCITE-3113
> URL: https://issues.apache.org/jira/browse/CALCITE-3113
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.19.0
>Reporter: Feng Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Add test case in MaterializationTest: 
> {code:java}
> @Test public void testAggregateAlias() {
>   checkMaterialize(
>   "select count(*) as c from \"emps\" group by \"empid\"",
>   "select count(*) + 1 as c from \"emps\" group by \"empid\"");
> }
> {code}
>  It fails due to different row types.
> {code:java}
> java.lang.AssertionError
>     at 
> org.apache.calcite.plan.SubstitutionVisitor.go(SubstitutionVisitor.java:504)
>     at 
> org.apache.calcite.plan.SubstitutionVisitor.go(SubstitutionVisitor.java:465)
>     at 
> org.apache.calcite.plan.MaterializedViewSubstitutionVisitor.go(MaterializedViewSubstitutionVisitor.java:56)
>     at 
> org.apache.calcite.plan.RelOptMaterializations.substitute(RelOptMaterializations.java:200)
>     at 
> org.apache.calcite.plan.RelOptMaterializations.useMaterializedViews(RelOptMaterializations.java:72)
>     at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerMaterializations(VolcanoPlanner.java:347)
> {code}
> However, according to MutableAggregate's hashCode implementation, this 
> materialization can be reused, i.e., queryDescendant=targetDescendant.
> {code:java}
> queryDescendant: RecordType(JavaType(int) empid, BIGINT $f1)
> =
> Aggregate(groupSet: {0}, groupSets: [{0}], calls: [COUNT()])
>   Project(projects: [$0])
>     Scan(table: [hr, emps])
> targetDescendant: RecordType(JavaType(int) empid, BIGINT C)
> =
> Aggregate(groupSet: {0}, groupSets: [{0}], calls: [COUNT()])
>   Project(projects: [$0])
>     Scan(table: [hr, emps])
> {code}
> So, how can we align them?
>  





[jira] [Commented] (CALCITE-3181) Support limit per group in Window

2019-07-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885545#comment-16885545
 ] 

Julian Hyde commented on CALCITE-3181:
--

The {{enum RelFieldCollation}} has values ASCENDING, STRICTLY_ASCENDING, 
DESCENDING, STRICTLY_DESCENDING and CLUSTERED. CLUSTERED exists for precisely 
this purpose.

We have no way of saying in a physical property (trait) "for a given deptno key 
there are no more than 10 rows". My instinct says we probably shouldn't add a 
way; it would make the trait more complicated for everyone else. So we'd do 
this by transformation (forward chaining) not by traits (backward chaining).

> Support limit per group in Window
> -
>
> Key: CALCITE-3181
> URL: https://issues.apache.org/jira/browse/CALCITE-3181
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Haisheng Yuan
>Priority: Major
>
> We have a lot of queries like the following to retrieve top N tuples per 
> group:
> {code:java}
> SELECT x, y FROM
>  (SELECT x, y, ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) 
>  AS rn FROM t1) t2 WHERE rn <= 3;
> {code}
> The performance is not good if each group has a lot more tuples than wanted, 
> because we will retrieve and sort all the tuples, instead of just doing a 
> top-N heap sort.
> In order to do optimization for this kind of query, we need to extend window 
> to support limit, if and only if there is only 1 window function, and it is 
> {{row_number()}}. We also need a substitute rule to push the limit into 
> window. Of course, we also need to modify executor to support this 
> optimization (can be later).
> {code:java}
> Filter (rn <= 3)
>   +- Window (window#0={Partition by x order by y ROW_NUMBER()})
> {code}
> to
> {code:java}
> Filter (rn <= 3)
>   +- Window (window#0={Partition by x order by y limit 3 ROW_NUMBER()})
> {code}
> Thoughts? Objections?
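
To make the "top-N heap sort" idea concrete, here is a minimal sketch in plain Java. It is not Calcite's executor: the class TopNPerGroup and the int[]{x, y} row layout are assumptions made for illustration. Each group keeps a max-heap of at most n values, so memory per group is bounded by n instead of by the group size.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

/** Hypothetical sketch (not Calcite code): top-N rows per group using a bounded heap. */
final class TopNPerGroup {
  private TopNPerGroup() {}

  /**
   * Rows are {@code int[]{x, y}}. For each x, returns the n smallest y values in ascending
   * order, i.e. the rows that {@code ROW_NUMBER() OVER (PARTITION BY x ORDER BY y) <= n} keeps.
   */
  static Map<Integer, List<Integer>> topN(List<int[]> rows, int n) {
    Map<Integer, PriorityQueue<Integer>> heaps = new TreeMap<>();
    for (int[] row : rows) {
      // Max-heap: the largest retained y is at the head, ready to be evicted.
      PriorityQueue<Integer> heap = heaps.computeIfAbsent(row[0],
          k -> new PriorityQueue<Integer>(Collections.<Integer>reverseOrder()));
      heap.add(row[1]);
      if (heap.size() > n) {
        heap.poll(); // drop the largest value so the heap never exceeds n entries
      }
    }
    Map<Integer, List<Integer>> result = new TreeMap<>();
    for (Map.Entry<Integer, PriorityQueue<Integer>> e : heaps.entrySet()) {
      List<Integer> ys = new ArrayList<>(e.getValue());
      Collections.sort(ys); // at most n elements per group
      result.put(e.getKey(), ys);
    }
    return result;
  }
}
```

With m rows in a group, this does O(m log n) work for that group, compared with O(m log m) for sorting the whole group first.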





[jira] [Commented] (CALCITE-3197) Convert data of Timestamp/Time/Date as original form when enumerating from ArrayTable

2019-07-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885520#comment-16885520
 ] 

Julian Hyde commented on CALCITE-3197:
--

This PR makes the code a lot less elegant. The idea that a representation knows 
what it "really should be" seems very wrong.

My gut tells me that there is a one or two line fix for this problem. This 
isn't it.



[jira] [Commented] (CALCITE-3187) Derive all decimal return type through type factory

2019-07-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885486#comment-16885486
 ] 

Julian Hyde commented on CALCITE-3187:
--

[~laurent] There are still copy editing issues. E.g. how javadoc is written, 
how deprecation annotations are written. Can you drive these to completion?

Also, when you commit, please manually squash and remove the change-id messages 
(in the past you have used GitHub's auto-squash, which leaves a messy commit log).



[jira] [Commented] (CALCITE-3198) ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'

2019-07-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885475#comment-16885475
 ] 

Julian Hyde commented on CALCITE-3198:
--

{{ReduceExpressionsRule}} is about constant reduction (aided by predicates). Is 
constant reduction the best framing for this optimization? (It might be, I 
haven't thought it through.) Simplification (as in {{RexSimplify}}, and 
automatically applied by {{RelBuilder.filter}} and other methods) is an 
alternative framing to consider.

> ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'
> 
>
> Key: CALCITE-3198
> URL: https://issues.apache.org/jira/browse/CALCITE-3198
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.20.0
>Reporter: Ruben Quesada Lopez
>Priority: Minor
>
> Currently, ReduceExpressionsRule.FILTER_INSTANCE can successfully reduce a 
> query like this one (see RelOptRulesTest#testReduceConstantsDup):
> {code}
> // query:
> select d.deptno from dept d where d.deptno=7 and d.deptno=8
> // plan before:
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[AND(=($0, 7), =($0, 8))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after:
> LogicalProject(DEPTNO=[$0])
>   LogicalValues(tuples=[[]])
> {code}
> As we can see, since the filter is 'always false', the 
> LogicalTableScan+LogicalFilter are correctly replaced by an empty 
> LogicalValues.
> However, the same filter wrapped in a NOT expression is not correctly simplified:
> {code}
> // query:
> select d.deptno from dept d where not(d.deptno=7 and d.deptno=8)
> // plan before:
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[NOT(AND(=($0, 7), =($0, 8)))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after (actual, NOT distributivity for AND):
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after (expected, filter removed):
> LogicalProject(DEPTNO=[$0])
>   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> Since the filter is the negation of an 'always false' filter (the one used in 
> the previous query), it is an 'always true' filter, so the expected 
> behavior is that the LogicalFilter is removed, and it is not.
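
The reasoning above can be checked with a small model of SQL's three-valued logic. The class NotAndSketch below is hypothetical and only evaluates the predicate; it says nothing about how ReduceExpressionsRule or RexSimplify work internally. A Java null models UNKNOWN. One caveat that the sketch makes visible: the predicate is TRUE for every non-null value, but UNKNOWN for NULL, so dropping the filter is only safe if the column is NOT NULL (assumed here for DEPTNO).

```java
/** Hypothetical sketch (not Calcite code): evaluates NOT(x = 7 AND x = 8) in three-valued logic. */
final class NotAndSketch {
  private NotAndSketch() {}

  /** x = c; UNKNOWN (null) when x is NULL. */
  static Boolean eq(Integer x, int c) {
    return x == null ? null : x == c;
  }

  /** SQL AND: FALSE dominates, otherwise UNKNOWN if either side is UNKNOWN. */
  static Boolean and(Boolean a, Boolean b) {
    if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) {
      return false;
    }
    if (a == null || b == null) {
      return null;
    }
    return true;
  }

  /** SQL NOT: UNKNOWN stays UNKNOWN. */
  static Boolean not(Boolean a) {
    return a == null ? null : !a;
  }

  /** The filter condition from the second query above. */
  static Boolean predicate(Integer x) {
    return not(and(eq(x, 7), eq(x, 8)));
  }
}
```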





[jira] [Commented] (CALCITE-3122) Convert Pig Latin scripts into Calcite logical plan

2019-07-15 Thread Khai Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885472#comment-16885472
 ] 

Khai Tran commented on CALCITE-3122:


CALCITE-1681 is really what I need if we cannot reset the planner for a cluster.



[jira] [Commented] (CALCITE-3122) Convert Pig Latin scripts into Calcite logical plan

2019-07-15 Thread Khai Tran (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885464#comment-16885464
 ] 

Khai Tran commented on CALCITE-3122:


Hi [~zabetak], thanks a lot for your suggestions.

In this feature, I get the Pig logical plan from the Pig parser, then traverse 
this plan and use RelBuilder to construct a Calcite logical plan; let's call 
it plan #1. After this step, I need to write an additional rule to optimize 
plan #1 into plan #2. As an example, a Pig group-by + aggregate is literally 
translated as a Calcite aggregate with the COLLECT() aggregate function, followed 
by a Project that uses Pig aggregate UDFs (which work on a Pig DataBag or SQL 
multiset, the result of COLLECT()). So the rule will convert COLLECT() + Pig 
aggregate UDFs into a Calcite built-in aggregate operator.

And we have other use cases that need to optimize a RelNode plan using a given 
set of rules, probably using a custom planner to set costs in a way that 
can enforce certain rules. Coupling RelNode with planners makes that hard to 
do.

I will try to find out more about HepPlanner, but do you have any example of 
setting HepPlanner for RelBuilder?



[jira] [Resolved] (CALCITE-3145) RelBuilder.aggregate throws IndexOutOfBoundsException if groupKey is non-empty and there are duplicate aggregate functions

2019-07-15 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-3145.
--
   Resolution: Fixed
Fix Version/s: 1.21.0

Fixed in [0cce229|https://github.com/julianhyde/calcite/commit/0cce229].

> RelBuilder.aggregate throws IndexOutOfBoundsException if groupKey is 
> non-empty and there are duplicate aggregate functions
> --
>
> Key: CALCITE-3145
> URL: https://issues.apache.org/jira/browse/CALCITE-3145
> Project: Calcite
>  Issue Type: Bug
>Reporter: Steven Talbot
>Priority: Major
> Fix For: 1.21.0
>
>
> There is a bug with aggregate duplicate with group fields. Can repro with a 
> simple modification (adding more group fields than there are aggregate 
> fields) of the test added in 
> [https://github.com/apache/calcite/commit/e01ba5ab6e7c57348f9f7be2babf00ae007204b5]
> {noformat}
> /** Tests that {@link RelBuilder#aggregate} eliminates duplicate aggregate
>  * calls and creates a {@code Project} to compensate. */
> @Test public void testAggregateEliminatesDuplicateCalls2() {
>   final RelBuilder builder = RelBuilder.create(config().build());
>   RelNode root =
>   builder.scan("EMP")
>   .aggregate(builder.groupKey(builder.field(0), 
> builder.field(1), builder.field(2), builder.field(3), builder.field(4)),
>   builder.sum(builder.field(1)).as("S1"),
>   builder.count().as("C"),
>   builder.sum(builder.field(2)).as("S2"),
>   builder.sum(builder.field(1)).as("S1b"))
>   .build();
>   final String expected = ""
>   + "LogicalProject(S1=[$0], C=[$1], S2=[$2], S1b=[$0])\n"
>   + "  LogicalAggregate(group=[{}], S1=[SUM($1)], C=[COUNT()], 
> S2=[SUM($2)])\n"
>   + "LogicalTableScan(table=[[scott, EMP]])\n";
>   assertThat(root, hasTree(expected));
> }{noformat}
> Note that the test isn't quite right, as the final expectation would need to 
> be modified, but it reproduces the exception, which in this case is 
> `java.lang.IndexOutOfBoundsException: Index: 4, Size: 4`
>  





[jira] [Resolved] (CALCITE-3166) Make RelBuilder configurable

2019-07-15 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-3166.
--
Resolution: Fixed

Fixed in [15e6378|https://github.com/julianhyde/calcite/commit/15e6378]. 
Thanks for the review, [~danny0405]!

> Make RelBuilder configurable
> 
>
> Key: CALCITE-3166
> URL: https://issues.apache.org/jira/browse/CALCITE-3166
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.21.0
>
>
> Make {{RelBuilder}} configurable, so that particular optimizations can easily 
> be turned off.
> I propose to add a class {{RelBuilder.Config}}, which is immutable and has a 
> public final field for each configuration property; also a class 
> {{RelBuilder.ConfigBuilder}} to create a config.
> {{RelBuilder.create(FrameworkConfig frameworkConfig)}} will get a config by 
> calling {{frameworkConfig.getContext().unwrap(RelBuilder.Config.class)}}.
> Going forward, any new features that add "optimizations" to {{RelBuilder}} 
> would need to have a corresponding flag in {{Config}} to switch them off. A 
> feature would not be considered "complete" if it did not have tests and a 
> switch.
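
The shape proposed above (an immutable config with a public final field per property, plus a builder) can be sketched as follows. All names here are hypothetical: BuilderConfig, its nested Builder, and the two flags dedupAggregateCalls and simplify are chosen for illustration and should not be read as the actual RelBuilder.Config API.

```java
/** Hypothetical sketch of an immutable configuration object with one flag per optimization. */
final class BuilderConfig {
  /** Default configuration: every optimization switched on. */
  static final BuilderConfig DEFAULT = new BuilderConfig(true, true);

  /** Illustrative flag: whether duplicate aggregate calls are eliminated. */
  public final boolean dedupAggregateCalls;
  /** Illustrative flag: whether expressions are simplified. */
  public final boolean simplify;

  private BuilderConfig(boolean dedupAggregateCalls, boolean simplify) {
    this.dedupAggregateCalls = dedupAggregateCalls;
    this.simplify = simplify;
  }

  /** Starts a builder pre-populated with the default values. */
  static Builder builder() {
    return new Builder(DEFAULT);
  }

  /** Mutable builder; the config it produces is immutable. */
  static final class Builder {
    private boolean dedupAggregateCalls;
    private boolean simplify;

    private Builder(BuilderConfig base) {
      this.dedupAggregateCalls = base.dedupAggregateCalls;
      this.simplify = base.simplify;
    }

    Builder withDedupAggregateCalls(boolean value) {
      this.dedupAggregateCalls = value;
      return this;
    }

    Builder withSimplify(boolean value) {
      this.simplify = value;
      return this;
    }

    BuilderConfig build() {
      return new BuilderConfig(dedupAggregateCalls, simplify);
    }
  }
}
```

A caller that wants to switch one optimization off would write something like BuilderConfig.builder().withSimplify(false).build() and leave every other flag at its default.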





[jira] [Commented] (CALCITE-3144) Add rule, AggregateCaseToFilterRule, that converts "SUM(CASE WHEN b THEN x END)" to "SUM(x) FILTER (WHERE b)"

2019-07-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885460#comment-16885460
 ] 

Julian Hyde commented on CALCITE-3144:
--

For a follow-up task, it would be good if {{RelBuilder.aggregate}} was able to 
do this (if enabled by a flag in {{RelBuilder.Config}}).

One of the deficiencies in the current rule is that it does not remove the CASE 
expressions from the underlying Project. An approach based on {{RelBuilder}} 
would not have this limitation. (And {{AggregateExtractProjectRule}} ought to 
deal with this, but it cannot because it does not fire on {{Project}}, to avoid 
cycles.)

> Add rule, AggregateCaseToFilterRule, that converts "SUM(CASE WHEN b THEN x 
> END)" to "SUM(x) FILTER (WHERE b)"
> -
>
> Key: CALCITE-3144
> URL: https://issues.apache.org/jira/browse/CALCITE-3144
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.21.0
>
>
> Add a rule that converts "SUM(CASE WHEN b THEN x END)" to "SUM(x) FILTER 
> (WHERE b)".
> Druid added {{CaseFilteredAggregatorRule}} in 
> https://github.com/apache/incubator-druid/pull/4360.
> Maybe {{AggregateCaseToFilterRule}} is a slightly better name. Or maybe this 
> transform could be done in {{RelBuilder.aggregate}}, and we wouldn't need a 
> rule.
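
Why the two forms agree can be seen on plain data: a CASE without ELSE yields NULL for rows where b does not hold, and SUM ignores NULL inputs, which is the same as not feeding those rows to SUM at all. The sketch below is hypothetical (CaseToFilterSketch and Row are invented names) and models both aggregates directly rather than using any Calcite API; a Java null models SQL NULL.

```java
import java.util.List;

/** Hypothetical sketch (not Calcite code) comparing the two aggregate forms on in-memory rows. */
final class CaseToFilterSketch {
  private CaseToFilterSketch() {}

  /** One input row with a boolean condition b and a value x. */
  static final class Row {
    final boolean b;
    final int x;

    Row(boolean b, int x) {
      this.b = b;
      this.x = x;
    }
  }

  /** SUM(CASE WHEN b THEN x END): non-matching rows contribute NULL, which SUM skips. */
  static Integer sumCase(List<Row> rows) {
    Integer sum = null;
    for (Row r : rows) {
      Integer v = r.b ? Integer.valueOf(r.x) : null; // CASE without ELSE gives NULL
      if (v != null) {
        sum = sum == null ? v : sum + v;
      }
    }
    return sum; // null if no row matched, as in SQL
  }

  /** SUM(x) FILTER (WHERE b): non-matching rows never reach the aggregate. */
  static Integer sumFilter(List<Row> rows) {
    Integer sum = null;
    for (Row r : rows) {
      if (!r.b) {
        continue;
      }
      sum = sum == null ? r.x : sum + r.x;
    }
    return sum; // null if no row matched, as in SQL
  }
}
```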





[jira] [Resolved] (CALCITE-3144) Add rule, CaseFilteredAggregatorRule, that converts "SUM(CASE WHEN b THEN x END)" to "SUM(x) FILTER (WHERE b)"

2019-07-15 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-3144.
--
Resolution: Fixed

Fixed in [687b7d8|https://github.com/julianhyde/calcite/commit/687b7d8].

The rule is called {{AggregateCaseToFilterRule}}.






[jira] [Updated] (CALCITE-3144) Add rule, AggregateCaseToFilterRule, that converts "SUM(CASE WHEN b THEN x END)" to "SUM(x) FILTER (WHERE b)"

2019-07-15 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-3144:
-
Summary: Add rule, AggregateCaseToFilterRule, that converts "SUM(CASE WHEN 
b THEN x END)" to "SUM(x) FILTER (WHERE b)"  (was: Add rule, 
CaseFilteredAggregatorRule, that converts "SUM(CASE WHEN b THEN x END)" to 
"SUM(x) FILTER (WHERE b)")



[jira] [Resolved] (CALCITE-3196) In Frameworks, add BasePrepareAction (a functional interface) and deprecate PrepareAction

2019-07-15 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-3196.
--
   Resolution: Fixed
Fix Version/s: 1.21.0

Fixed in [1c5de1c|https://github.com/julianhyde/calcite/commit/1c5de1c].

[~danny0405] I saw you logged a PR. I should have said that I was already 
working on this - in fact had already fixed it, just needed to test. Sorry we 
duplicated work.

> In Frameworks, add BasePrepareAction (a functional interface) and deprecate 
> PrepareAction
> -
>
> Key: CALCITE-3196
> URL: https://issues.apache.org/jira/browse/CALCITE-3196
> Project: Calcite
>  Issue Type: Bug
>  Reporter: Julian Hyde
>  Assignee: Danny Chan
>  Priority: Major
>  Labels: pull-request-available
> Fix For: 1.21.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In {{Frameworks}}, add {{interface BasePrepareAction}} (a functional 
> interface) and deprecate {{abstract class PrepareAction}}. Because 
> {{PrepareAction}} has a field ({{FrameworkConfig config}}), it cannot be 
> implemented using a lambda. It is simpler and clearer to pass {{config}} as 
> an argument to all methods where it is needed.
> {{PrepareAction}} was introduced in CALCITE-247.
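
The design point can be sketched in a few lines. A lambda can only target an interface with a single abstract method, so an action that keeps its config in a field has to be written as a subclass. The names below (PrepareActionSketch, Config, FieldAction, ArgAction) are hypothetical stand-ins, not the real Frameworks signatures, and the String "statement" parameter is an assumption made to keep the example self-contained.

```java
/** Hypothetical illustration; these are not the real Frameworks types. */
final class PrepareActionSketch {
  /** Stand-in for FrameworkConfig; it only carries a name. */
  static final class Config {
    final String name;
    Config(String name) { this.name = name; }
  }

  /** Old style: the config lives in a field, so callers must subclass. */
  abstract static class FieldAction<R> {
    final Config config;
    FieldAction(Config config) { this.config = config; }
    abstract R apply(String statement);
  }

  /** New style: one abstract method, with the config passed as an argument. */
  @FunctionalInterface
  interface ArgAction<R> {
    R apply(Config config, String statement);
  }

  /** Runs an action, handing it the config explicitly. */
  static <R> R run(Config config, ArgAction<R> action) {
    return action.apply(config, "stmt");
  }

  /** The field-based action needs an anonymous subclass. */
  static String viaSubclass(Config config) {
    FieldAction<String> action = new FieldAction<String>(config) {
      @Override String apply(String statement) {
        return this.config.name + ":" + statement;
      }
    };
    return action.apply("stmt");
  }

  /** The argument-based action fits in a lambda. */
  static String viaLambda(Config config) {
    return run(config, (c, s) -> c.name + ":" + s);
  }

  public static void main(String[] args) {
    Config config = new Config("demo");
    System.out.println(viaSubclass(config));
    System.out.println(viaLambda(config));
  }
}
```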





[jira] [Resolved] (CALCITE-3183) Trimming method for Filter rel uses wrong traitSet

2019-07-15 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde resolved CALCITE-3183.
--
   Resolution: Fixed
Fix Version/s: 1.21.0

Fixed in [bb0bae9|https://github.com/julianhyde/calcite/commit/bb0bae9]. Thanks 
for the PR, [~Juhwan]!

> Trimming method for Filter rel uses wrong traitSet 
> ---
>
> Key: CALCITE-3183
> URL: https://issues.apache.org/jira/browse/CALCITE-3183
> Project: Calcite
>  Issue Type: Bug
>  Reporter: Juhwan Kim
>  Assignee: Juhwan Kim
>  Priority: Major
>  Labels: pull-request-available
> Fix For: 1.21.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> It seems like there is a bug here: 
> https://github.com/apache/calcite/blob/e8d598a434e8dbadaf756f8c57c748f4d7e16fdf/core/src/main/java/org/apache/calcite/sql2rel/RelFieldTrimmer.java#L487.
> Unlike the other trimming methods, the filter trim function copies the current 
> filter rel and pushes it directly to the builder instead of calling the factory 
> method for the filter rel. The problem with the current code is that it reuses 
> the same traitSet even though it may no longer be valid after the input has 
> been trimmed. For example, field indexes in the collation might have changed 
> after trimming. We should reflect this change when creating the new rel.
>  
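
To illustrate why a collation can become stale after trimming, here is a minimal sketch in plain Java. It is not the RelFieldTrimmer code; the class TrimCollationSketch and its methods are hypothetical. It assumes a collation is a list of field indexes and that, if a collation key is trimmed away, only the keys before it remain valid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; this is not the RelFieldTrimmer implementation. */
final class TrimCollationSketch {
  /** Maps each surviving old field index to its position after trimming. */
  static Map<Integer, Integer> mapping(List<Integer> keptOldIndexes) {
    Map<Integer, Integer> m = new HashMap<>();
    for (int i = 0; i < keptOldIndexes.size(); i++) {
      m.put(keptOldIndexes.get(i), i);
    }
    return m;
  }

  /**
   * Rewrites collation keys to the new field indexes. Stops at the first key
   * that was trimmed away, because later keys only order rows within ties of
   * the earlier ones (an assumption of this sketch).
   */
  static List<Integer> remapCollation(List<Integer> collation,
      Map<Integer, Integer> mapping) {
    List<Integer> result = new ArrayList<>();
    for (Integer oldIndex : collation) {
      Integer newIndex = mapping.get(oldIndex);
      if (newIndex == null) {
        break;
      }
      result.add(newIndex);
    }
    return result;
  }

  public static void main(String[] args) {
    // Input has fields 0..3 and is sorted on fields [2, 0]; trimming keeps 0 and 2.
    Map<Integer, Integer> m = mapping(Arrays.asList(0, 2));
    System.out.println(remapCollation(Arrays.asList(2, 0), m));
  }
}
```

Reusing the old collation unchanged would refer to field 2, which no longer exists once only two fields remain.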





[jira] [Comment Edited] (CALCITE-3198) ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'

2019-07-15 Thread Ruben Quesada Lopez (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885045#comment-16885045
 ] 

Ruben Quesada Lopez edited comment on CALCITE-3198 at 7/15/19 10:24 AM:


The same result is obtained with the following test, which is perhaps more 
straightforward (solving this one might also solve the one in the description 
as a side effect):
{code}
// query:
select d.deptno from dept d where d.deptno<>7 or d.deptno<>8

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (actual, unchanged):
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (expected, filter removed):
LogicalProject(DEPTNO=[$0])
  LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
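
The reduction asked for here rests on a simple fact: a disjunction of "x <> c" terms over at least two distinct constants is true for every non-null x. The sketch below shows that check in plain Java. It is not ReduceExpressionsRule code; the class NotEqualsDisjunction and its methods are hypothetical, and x is assumed to be NOT NULL.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

/** Hypothetical illustration; this is not ReduceExpressionsRule code. */
final class NotEqualsDisjunction {
  /**
   * For "x <> c1 OR x <> c2 OR ...", returns whether the condition holds for
   * every non-null x. That is the case exactly when at least two of the
   * constants differ, since x can equal at most one of them.
   */
  static boolean isAlwaysTrueForNonNull(List<Integer> constants) {
    return new HashSet<>(constants).size() >= 2;
  }

  /** Evaluates the disjunction directly for one non-null value of x. */
  static boolean evaluate(int x, List<Integer> constants) {
    for (int c : constants) {
      if (x != c) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<Integer> constants = Arrays.asList(7, 8);
    System.out.println(isAlwaysTrueForNonNull(constants));
    System.out.println(evaluate(7, constants) && evaluate(8, constants));
  }
}
```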


was (Author: rubenql):
Same result is obtained with the following test:
{code}
// query:
select d.deptno from dept d where d.deptno<>7 or d.deptno<>8

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (actual, unchanged):
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (expected, filter removed):
LogicalProject(DEPTNO=[$0])
  LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}

> ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'
> 
>
> Key: CALCITE-3198
> URL: https://issues.apache.org/jira/browse/CALCITE-3198
> Project: Calcite
>  Issue Type: Bug
>  Affects Versions: 1.20.0
>  Reporter: Ruben Quesada Lopez
>  Priority: Minor
>
> Currently, ReduceExpressionsRule.FILTER_INSTANCE can successfully reduce a 
> query like this one (see RelOptRulesTest#testReduceConstantsDup):
> {code}
> // query:
> select d.deptno from dept d where d.deptno=7 and d.deptno=8
> // plan before:
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[AND(=($0, 7), =($0, 8))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after:
> LogicalProject(DEPTNO=[$0])
>   LogicalValues(tuples=[[]])
> {code}
> As we can see, since the filter is 'always false', the 
> LogicalTableScan+LogicalFilter are correctly replaced by an empty 
> LogicalValues.
> However, the same filter wrapped in a NOT expression is not correctly simplified:
> {code}
> // query:
> select d.deptno from dept d where not(d.deptno=7 and d.deptno=8)
> // plan before:
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[NOT(AND(=($0, 7), =($0, 8)))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after (actual, NOT distributed over AND):
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after (expected, filter removed):
> LogicalProject(DEPTNO=[$0])
>   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> Since the filter is the negation of an 'always false' filter (the one used in 
> the previous query), it is an 'always true' filter, so the LogicalFilter should 
> be removed, but it is not.





[jira] [Updated] (CALCITE-3198) ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'

2019-07-15 Thread Ruben Quesada Lopez (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Quesada Lopez updated CALCITE-3198:
-
Description: 
Currently, ReduceExpressionsRule.FILTER_INSTANCE can successfully reduce a 
query like this one (see RelOptRulesTest#testReduceConstantsDup):
{code}
// query:
select d.deptno from dept d where d.deptno=7 and d.deptno=8

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[AND(=($0, 7), =($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after:
LogicalProject(DEPTNO=[$0])
  LogicalValues(tuples=[[]])
{code}
As we can see, since the filter is 'always false', the 
LogicalTableScan+LogicalFilter are correctly replaced by an empty LogicalValues.

However, the same filter wrapped in a NOT expression is not correctly simplified:
{code}
// query:
select d.deptno from dept d where not(d.deptno=7 and d.deptno=8)

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[NOT(AND(=($0, 7), =($0, 8)))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (actual, NOT distributed over AND):
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (expected, filter removed):
LogicalProject(DEPTNO=[$0])
  LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
Since the filter is the negation of an 'always false' filter (the one used in 
the previous query), it is an 'always true' filter, so the LogicalFilter should 
be removed, but it is not.

  was:
Currently, ReduceExpressionsRule.FILTER_INSTANCE can successfully reduce a 
query like this one (see RelOptRulesTest#testReduceConstantsDup):
{code}
// query:
select d.deptno from dept d where d.deptno=7 and d.deptno=8

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[AND(=($0, 7), =($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after:
LogicalProject(DEPTNO=[$0])
  LogicalValues(tuples=[[]])
{code}
As we can see, since the filter is 'always false', the 
LogicalTableScan+LogicalFilter are correctly replaced by an empty LogicalValues.

However, the same filter wrapped in a NOT expression is not correctly simplified:
{code}
// query:
select d.deptno from dept d where not(d.deptno=7 and d.deptno=8)

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[NOT(AND(=($0, 7), =($0, 8)))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (actual):
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (expected):
LogicalProject(DEPTNO=[$0])
  LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
Since the filter is the negation of an 'always false' filter (the one used in 
the previous query), it is an 'always true' filter, so the LogicalFilter should 
be removed, but it is not.


> ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'
> 
>
> Key: CALCITE-3198
> URL: https://issues.apache.org/jira/browse/CALCITE-3198
> Project: Calcite
>  Issue Type: Bug
>  Affects Versions: 1.20.0
>  Reporter: Ruben Quesada Lopez
>  Priority: Minor
>
> Currently, ReduceExpressionsRule.FILTER_INSTANCE can successfully reduce a 
> query like this one (see RelOptRulesTest#testReduceConstantsDup):
> {code}
> // query:
> select d.deptno from dept d where d.deptno=7 and d.deptno=8
> // plan before:
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[AND(=($0, 7), =($0, 8))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after:
> LogicalProject(DEPTNO=[$0])
>   LogicalValues(tuples=[[]])
> {code}
> As we can see, since the filter is 'always false', the 
> LogicalTableScan+LogicalFilter are correctly replaced by an empty 
> LogicalValues.
> However, the same filter wrapped in a NOT expression is not correctly simplified:
> {code}
> // query:
> select d.deptno from dept d where not(d.deptno=7 and d.deptno=8)
> // plan before:
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[NOT(AND(=($0, 7), =($0, 8)))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after (actual, NOT distributed over AND):
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after (expected, filter removed):
> LogicalProject(DEPTNO=[$0])
>   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> Since the filter is the negation of an 'always false' filter (the one used in 
> the previous query), it is an 'always true' filter, so the LogicalFilter should 
> be removed, but it is not.

[jira] [Commented] (CALCITE-3198) ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'

2019-07-15 Thread Ruben Quesada Lopez (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885045#comment-16885045
 ] 

Ruben Quesada Lopez commented on CALCITE-3198:
--

Same result is obtained with the following test:
{code}
// query:
select d.deptno from dept d where d.deptno<>7 or d.deptno<>8

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (actual, unchanged):
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (expected, filter removed):
LogicalProject(DEPTNO=[$0])
  LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}

> ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'
> 
>
> Key: CALCITE-3198
> URL: https://issues.apache.org/jira/browse/CALCITE-3198
> Project: Calcite
>  Issue Type: Bug
>  Affects Versions: 1.20.0
>  Reporter: Ruben Quesada Lopez
>  Priority: Minor
>
> Currently, ReduceExpressionsRule.FILTER_INSTANCE can successfully reduce a 
> query like this one (see RelOptRulesTest#testReduceConstantsDup):
> {code}
> // query:
> select d.deptno from dept d where d.deptno=7 and d.deptno=8
> // plan before:
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[AND(=($0, 7), =($0, 8))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after:
> LogicalProject(DEPTNO=[$0])
>   LogicalValues(tuples=[[]])
> {code}
> As we can see, since the filter is 'always false', the 
> LogicalTableScan+LogicalFilter are correctly replaced by an empty 
> LogicalValues.
> However, the same filter wrapped in a NOT expression is not correctly simplified:
> {code}
> // query:
> select d.deptno from dept d where not(d.deptno=7 and d.deptno=8)
> // plan before:
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[NOT(AND(=($0, 7), =($0, 8)))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after (actual):
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> // plan after (expected):
> LogicalProject(DEPTNO=[$0])
>   LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> Since the filter is the negation of an 'always false' filter (the one used in 
> the previous query), it is an 'always true' filter, so the LogicalFilter should 
> be removed, but it is not.





[jira] [Commented] (CALCITE-2978) sorting not applied in subqueries

2019-07-15 Thread Pressenna (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884999#comment-16884999
 ] 

Pressenna commented on CALCITE-2978:


[~hyuan], yes, you are right, sorry for reopening.

> sorting not applied in subqueries
> -
>
> Key: CALCITE-2978
> URL: https://issues.apache.org/jira/browse/CALCITE-2978
> Project: Calcite
>  Issue Type: Bug
>  Affects Versions: 1.19.0
>  Reporter: Pressenna
>  Assignee: Danny Chan
>  Priority: Major
>
> {code:sql}
> CREATE TABLE test (id INT, val INT);
> INSERT INTO test VALUES (1,1);
> INSERT INTO test VALUES (2,2);
> INSERT INTO test VALUES (3,3);
> INSERT INTO test VALUES (4,4);
> SELECT id FROM (SELECT id, val FROM test ORDER BY val DESC);
>  {code}
> Looks like CALCITE-2798 removes the sorting in sub-queries too aggressively.
> Update:
> I might be wrong here and may have jumped the gun.
> Looks like SQL does not dictate that the outer query has to retain any order 
> of the inner query.
> The sort is applied if a {{LIMIT}} is specified in the inner query, to reduce 
> the inner result correctly.
> Happy to close as invalid.





[jira] [Created] (CALCITE-3198) ReduceExpressionsRule.FILTER_INSTANCE does not reduce 'NOT(x=a AND x=b)'

2019-07-15 Thread Ruben Quesada Lopez (JIRA)
Ruben Quesada Lopez created CALCITE-3198:


 Summary: ReduceExpressionsRule.FILTER_INSTANCE does not reduce 
'NOT(x=a AND x=b)'
 Key: CALCITE-3198
 URL: https://issues.apache.org/jira/browse/CALCITE-3198
 Project: Calcite
  Issue Type: Bug
Affects Versions: 1.20.0
Reporter: Ruben Quesada Lopez


Currently, ReduceExpressionsRule.FILTER_INSTANCE can successfully reduce a 
query like this one (see RelOptRulesTest#testReduceConstantsDup):
{code}
// query:
select d.deptno from dept d where d.deptno=7 and d.deptno=8

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[AND(=($0, 7), =($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after:
LogicalProject(DEPTNO=[$0])
  LogicalValues(tuples=[[]])
{code}
As we can see, since the filter is 'always false', the 
LogicalTableScan+LogicalFilter are correctly replaced by an empty LogicalValues.

However, the same filter wrapped in a NOT expression is not correctly simplified:
{code}
// query:
select d.deptno from dept d where not(d.deptno=7 and d.deptno=8)

// plan before:
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[NOT(AND(=($0, 7), =($0, 8)))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (actual):
LogicalProject(DEPTNO=[$0])
  LogicalFilter(condition=[OR(<>($0, 7), <>($0, 8))])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

// plan after (expected):
LogicalProject(DEPTNO=[$0])
  LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
Since the filter is the negation of an 'always false' filter (the one used in 
the previous query), it is an 'always true' filter, so the LogicalFilter should 
be removed, but it is not.
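
One caveat is worth spelling out with a small sketch. Under SQL three-valued logic, the negated filter evaluates to TRUE for every non-null value but to UNKNOWN for NULL, so removing the filter is only safe if DEPTNO cannot be NULL (assumed here to be the case for the test table). The code below is plain Java, not Calcite code; the class ThreeValuedNotAnd is hypothetical, and a null Boolean stands for UNKNOWN.

```java
/** Hypothetical illustration of SQL three-valued logic; null means UNKNOWN. */
final class ThreeValuedNotAnd {
  /** x = c, which is UNKNOWN when x is NULL. */
  static Boolean eq(Integer x, int c) {
    return x == null ? null : Boolean.valueOf(x == c);
  }

  static Boolean and(Boolean a, Boolean b) {
    if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) {
      return Boolean.FALSE;
    }
    return (a == null || b == null) ? null : Boolean.TRUE;
  }

  static Boolean or(Boolean a, Boolean b) {
    if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) {
      return Boolean.TRUE;
    }
    return (a == null || b == null) ? null : Boolean.FALSE;
  }

  static Boolean not(Boolean a) {
    return a == null ? null : Boolean.valueOf(!a);
  }

  /** NOT(x = 7 AND x = 8), the filter from the description. */
  static Boolean notAnd(Integer x) {
    return not(and(eq(x, 7), eq(x, 8)));
  }

  /** x <> 7 OR x <> 8, the form the planner currently produces. */
  static Boolean orNotEquals(Integer x) {
    return or(not(eq(x, 7)), not(eq(x, 8)));
  }

  public static void main(String[] args) {
    System.out.println(notAnd(7) + " " + notAnd(5) + " " + notAnd(null));
  }
}
```

The two forms agree for every input, so the De Morgan rewrite itself is sound; what is missing is the further step that recognizes the result as constant for a NOT NULL column.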





[jira] [Resolved] (CALCITE-2978) sorting not applied in subqueries

2019-07-15 Thread Pressenna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pressenna resolved CALCITE-2978.

Resolution: Invalid

It works correctly when a LIMIT is applied (the sort is then performed).
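
The distinction can be illustrated with a small sketch. Sorting alone does not change which rows a sub-query returns, so the outer query is free to ignore it. Sorting followed by a limit does change the set of rows, so the sort has to be kept. This is plain Java with hypothetical names (SubquerySortSketch and its methods), not Calcite code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration; not Calcite code. */
final class SubquerySortSketch {
  /** ORDER BY val DESC. */
  static List<Integer> orderByDesc(List<Integer> rows) {
    List<Integer> sorted = new ArrayList<>(rows);
    sorted.sort(Collections.reverseOrder());
    return sorted;
  }

  /** LIMIT n. */
  static List<Integer> limit(List<Integer> rows, int n) {
    return new ArrayList<>(rows.subList(0, Math.min(n, rows.size())));
  }

  /** Compares two results as unordered collections of rows. */
  static boolean sameRows(List<Integer> a, List<Integer> b) {
    List<Integer> x = new ArrayList<>(a);
    List<Integer> y = new ArrayList<>(b);
    Collections.sort(x);
    Collections.sort(y);
    return x.equals(y);
  }

  public static void main(String[] args) {
    List<Integer> rows = Arrays.asList(1, 2, 3, 4);
    // Without LIMIT the sort does not change the rows returned.
    System.out.println(sameRows(rows, orderByDesc(rows)));
    // With LIMIT 2 the sort decides which rows survive.
    System.out.println(sameRows(limit(rows, 2), limit(orderByDesc(rows), 2)));
  }
}
```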

> sorting not applied in subqueries
> -
>
> Key: CALCITE-2978
> URL: https://issues.apache.org/jira/browse/CALCITE-2978
> Project: Calcite
>  Issue Type: Bug
>  Affects Versions: 1.19.0
>  Reporter: Pressenna
>  Assignee: Danny Chan
>  Priority: Major
>
> {code:sql}
> CREATE TABLE test (id INT, val INT);
> INSERT INTO test VALUES (1,1);
> INSERT INTO test VALUES (2,2);
> INSERT INTO test VALUES (3,3);
> INSERT INTO test VALUES (4,4);
> SELECT id FROM (SELECT id, val FROM test ORDER BY val DESC);
>  {code}
> Looks like CALCITE-2798 removes the sorting in sub-queries too aggressively.
> Update:
> I might be wrong here and may have jumped the gun.
> Looks like SQL does not dictate that the outer query has to retain any order 
> of the inner query.
> The sort is applied if a {{LIMIT}} is specified in the inner query, to reduce 
> the inner result correctly.
> Happy to close as invalid.





[jira] [Commented] (CALCITE-2978) sorting not applied in subqueries

2019-07-15 Thread Haisheng Yuan (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884987#comment-16884987
 ] 

Haisheng Yuan commented on CALCITE-2978:


This is still not a valid case, unless you add {{limit 1}}.

> sorting not applied in subqueries
> -
>
> Key: CALCITE-2978
> URL: https://issues.apache.org/jira/browse/CALCITE-2978
> Project: Calcite
>  Issue Type: Bug
>  Affects Versions: 1.19.0
>  Reporter: Pressenna
>  Assignee: Danny Chan
>  Priority: Major
>
> {code:sql}
> CREATE TABLE test (id INT, val INT);
> INSERT INTO test VALUES (1,1);
> INSERT INTO test VALUES (2,2);
> INSERT INTO test VALUES (3,3);
> INSERT INTO test VALUES (4,4);
> SELECT id FROM (SELECT id, val FROM test ORDER BY val DESC);
>  {code}
> Looks like CALCITE-2798 removes the sorting in sub-queries too aggressively.
> Update:
> I might be wrong here and may have jumped the gun.
> Looks like SQL does not dictate that the outer query has to retain any order 
> of the inner query.
> The sort is applied if a {{LIMIT}} is specified in the inner query, to reduce 
> the inner result correctly.
> Happy to close as invalid.





[jira] [Commented] (CALCITE-2978) sorting not applied in subqueries

2019-07-15 Thread Pressenna (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884979#comment-16884979
 ] 

Pressenna commented on CALCITE-2978:


So I think I finally found a valid use case for this:

 
{code:sql}
CREATE TABLE lookup (id INT, name TEXT, rank INT);
INSERT INTO lookup VALUES (1, 'GOOD', 1);
INSERT INTO lookup VALUES (2, 'GOOD', 3);
INSERT INTO lookup VALUES (2, 'GOOD', 2);

CREATE TABLE facts (id INT, name TEXT);
INSERT INTO facts VALUES (1, 'GOOD');
INSERT INTO facts VALUES (2, 'BAD');

SELECT id, (SELECT lookup.id FROM lookup WHERE lookup.name=facts.name) FROM 
facts;

-- In this case, the order must be preserved.
SELECT id, (SELECT lookup.id FROM lookup WHERE lookup.name=facts.name ORDER BY 
lookup.rank ASC) FROM facts;
-- In this case, the order must be preserved.
SELECT id, (SELECT lookup.id FROM lookup WHERE lookup.name=facts.name ORDER BY 
lookup.rank DESC) FROM facts;
{code}
 

> sorting not applied in subqueries
> -
>
> Key: CALCITE-2978
> URL: https://issues.apache.org/jira/browse/CALCITE-2978
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.19.0
>Reporter: Pressenna
>Assignee: Danny Chan
>Priority: Major
>
> {code:sql}
> CREATE TABLE test (id INT, val INT);
> INSERT INTO test VALUES (1,1);
> INSERT INTO test VALUES (2,2);
> INSERT INTO test VALUES (3,3);
> INSERT INTO test VALUES (4,4);
> SELECT id FROM (SELECT id, val FROM test ORDER BY val DESC);
>  {code}
> Looks like CALCITE-2798 removes the sorting in sub-queries too aggressively.
> Update:
> I might be wrong here and may have jumped the gun.
> Looks like SQL does not dictate that the outer query has to retain any order 
> of the inner query.
> The sort is applied if a {{LIMIT}} is specified in the inner query, to reduce 
> the inner result correctly.
> Happy to close as invalid.





[jira] [Reopened] (CALCITE-2978) sorting not applied in subqueries

2019-07-15 Thread Pressenna (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pressenna reopened CALCITE-2978:


> sorting not applied in subqueries
> -
>
> Key: CALCITE-2978
> URL: https://issues.apache.org/jira/browse/CALCITE-2978
> Project: Calcite
>  Issue Type: Bug
>  Affects Versions: 1.19.0
>  Reporter: Pressenna
>  Assignee: Danny Chan
>  Priority: Major
>
> {code:sql}
> CREATE TABLE test (id INT, val INT);
> INSERT INTO test VALUES (1,1);
> INSERT INTO test VALUES (2,2);
> INSERT INTO test VALUES (3,3);
> INSERT INTO test VALUES (4,4);
> SELECT id FROM (SELECT id, val FROM test ORDER BY val DESC);
>  {code}
> Looks like CALCITE-2798 removes the sorting in sub-queries too aggressively.
> Update:
> I might be wrong here and may have jumped the gun.
> Looks like SQL does not dictate that the outer query has to retain any order 
> of the inner query.
> The sort is applied if a {{LIMIT}} is specified in the inner query, to reduce 
> the inner result correctly.
> Happy to close as invalid.





[jira] [Updated] (CALCITE-3196) In Frameworks, add BasePrepareAction (a functional interface) and deprecate PrepareAction

2019-07-15 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/CALCITE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-3196:

Labels: pull-request-available  (was: )

> In Frameworks, add BasePrepareAction (a functional interface) and deprecate 
> PrepareAction
> -
>
> Key: CALCITE-3196
> URL: https://issues.apache.org/jira/browse/CALCITE-3196
> Project: Calcite
>  Issue Type: Bug
>  Reporter: Julian Hyde
>  Assignee: Danny Chan
>  Priority: Major
>  Labels: pull-request-available
>
> In {{Frameworks}}, add {{interface BasePrepareAction}} (a functional 
> interface) and deprecate {{abstract class PrepareAction}}. Because 
> {{PrepareAction}} has a field ({{FrameworkConfig config}}), it cannot be 
> implemented using a lambda. It is simpler and clearer to pass {{config}} as 
> an argument to all methods where it is needed.
> {{PrepareAction}} was introduced in CALCITE-247.


