[jira] [Commented] (CALCITE-3390) ITEM expression does not get pushed to the right input of left-outer-join

2019-10-08 Thread Aman Sinha (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946953#comment-16946953
 ] 

Aman Sinha commented on CALCITE-3390:
-

Thanks [~julianhyde] and [~jinxing6...@126.com] for your suggestions.  That's 
pretty much along the lines of what I discussed with [~volodymyr], and I have a 
WIP branch here [1].  I am checking whether it breaks any of our tests in Drill 
before creating a PR. 

[1] 
https://github.com/amansinha100/incubator-calcite/commit/08e74a48932a8458d72cd5550c7c0ea9e677f7f5


> ITEM expression does not get pushed to the right input of left-outer-join
> -
>
> Key: CALCITE-3390
> URL: https://issues.apache.org/jira/browse/CALCITE-3390
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.21.0
> Reporter: Aman Sinha
> Assignee: Aman Sinha
> Priority: Major
>
> In the following query, the ITEM expression above the Left Outer Join does 
> not get pushed to the right input (the null-generating input) of the join, 
> although it should be, since ITEM does not change the nullability.  
> {noformat}
> explain plan without implementation for select tt7.columns[0], tt8.columns[0] 
> as x from tt7 left outer join tt8  on tt7.columns[0] = tt8.columns[0];
> DrillScreenRel
>   DrillProjectRel(EXPR$0=[$1], x=[ITEM($2, 0)])
>     DrillJoinRel(condition=[=($0, $3)], joinType=[left])
>       DrillProjectRel($f2=[ITEM($0, 0)], ITEM=[ITEM($0, 0)])
>         DrillScanRel(table=[[dfs, tmp, tt7]], groupscan=[EasyGroupScan [selectionRoot=file:/tmp/tt7, numFiles=1, columns=[`columns`[0]], files=[file:/tmp/tt7/0_0_0.csv]]])
>       DrillProjectRel(columns=[$0], $f2=[ITEM($0, 0)])
>         DrillScanRel(table=[[dfs, tmp, tt8]], groupscan=[EasyGroupScan [selectionRoot=file:/tmp/tt8, numFiles=1, columns=[`columns`, `columns`[0]], files=[file:/tmp/tt8/0_0_0.csv]]])
> {noformat}
> From what I can tell, the change in behavior occurred with CALCITE-1753; 
> before that the ITEM was pushed on both sides of the Left Outer Join. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CALCITE-3390) ITEM expression does not get pushed to the right input of left-outer-join

2019-10-07 Thread Aman Sinha (Jira)
Aman Sinha created CALCITE-3390:
---

 Summary: ITEM expression does not get pushed to the right input of left-outer-join
 Key: CALCITE-3390
 URL: https://issues.apache.org/jira/browse/CALCITE-3390
 Project: Calcite
 Issue Type: Bug
 Components: core
 Affects Versions: 1.21.0
 Reporter: Aman Sinha
 Assignee: Aman Sinha


In the following query, the ITEM expression above the Left Outer Join does not 
get pushed to the right input (the null-generating input) of the join, although 
it should be, since ITEM does not change the nullability.  

{noformat}
explain plan without implementation for select tt7.columns[0], tt8.columns[0] 
as x from tt7 left outer join tt8  on tt7.columns[0] = tt8.columns[0];

DrillScreenRel
  DrillProjectRel(EXPR$0=[$1], x=[ITEM($2, 0)])
    DrillJoinRel(condition=[=($0, $3)], joinType=[left])
      DrillProjectRel($f2=[ITEM($0, 0)], ITEM=[ITEM($0, 0)])
        DrillScanRel(table=[[dfs, tmp, tt7]], groupscan=[EasyGroupScan [selectionRoot=file:/tmp/tt7, numFiles=1, columns=[`columns`[0]], files=[file:/tmp/tt7/0_0_0.csv]]])
      DrillProjectRel(columns=[$0], $f2=[ITEM($0, 0)])
        DrillScanRel(table=[[dfs, tmp, tt8]], groupscan=[EasyGroupScan [selectionRoot=file:/tmp/tt8, numFiles=1, columns=[`columns`, `columns`[0]], files=[file:/tmp/tt8/0_0_0.csv]]])
{noformat}

From what I can tell, the change in behavior occurred with 
https://issues.apache.org/jira/browse/CALCITE-1753 ; before that the ITEM was 
pushed on both sides of the Left Outer Join. 
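
The safety argument can be illustrated with a small sketch in plain Java. It is not Calcite or Drill code: the class and method names are hypothetical, and it assumes that ITEM returns NULL when its input is NULL. Under that assumption, evaluating ITEM below the join on the right input gives the same rows as evaluating it above the join, including for unmatched left rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch in plain Java; none of these names are Calcite or Drill APIs. */
final class ItemPushdownSketch {

  /** ITEM(columns, i), assumed to yield null for a null (or too short) array. */
  static String item(String[] columns, int i) {
    return (columns == null || i >= columns.length) ? null : columns[i];
  }

  /** SQL-style equality for the join condition: null never matches. */
  static boolean sqlEquals(String a, String b) {
    return a != null && a.equals(b);
  }

  /** Left outer join on ITEM(l, 0) = ITEM(r, 0); ITEM on the right row runs above the join. */
  static List<String> itemAboveJoin(List<String[]> left, List<String[]> right) {
    List<String> out = new ArrayList<>();
    for (String[] l : left) {
      boolean matched = false;
      for (String[] r : right) {
        if (sqlEquals(item(l, 0), item(r, 0))) {
          out.add(item(l, 0) + "|" + item(r, 0));
          matched = true;
        }
      }
      if (!matched) {
        String[] paddedRight = null; // the join pads the right side with NULL
        out.add(item(l, 0) + "|" + item(paddedRight, 0));
      }
    }
    return out;
  }

  /** Same join, but ITEM is evaluated on the right input before the join. */
  static List<String> itemBelowJoin(List<String[]> left, List<String[]> right) {
    List<String> rightItems = new ArrayList<>();
    for (String[] r : right) {
      rightItems.add(item(r, 0));
    }
    List<String> out = new ArrayList<>();
    for (String[] l : left) {
      boolean matched = false;
      for (String ri : rightItems) {
        if (sqlEquals(item(l, 0), ri)) {
          out.add(item(l, 0) + "|" + ri);
          matched = true;
        }
      }
      if (!matched) {
        String paddedItem = null; // the projected column is padded with NULL
        out.add(item(l, 0) + "|" + paddedItem);
      }
    }
    return out;
  }

  /** Convenience helper for building sample inputs. */
  static List<String[]> rows(String[]... r) {
    return new ArrayList<>(Arrays.asList(r));
  }

  public static void main(String[] args) {
    List<String[]> tt7 = rows(new String[] {"a", "1"}, new String[] {"b", "2"});
    List<String[]> tt8 = rows(new String[] {"a", "9"});
    System.out.println(itemAboveJoin(tt7, tt8));
    System.out.println(itemBelowJoin(tt7, tt8));
  }
}
```

Both methods produce the same list for the sample data, which is the property the pushdown relies on. An expression that maps NULL to a non-NULL value would break this equivalence for unmatched rows.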






[jira] [Commented] (CALCITE-2617) FilterProjectTransposeRule should allow filter conditions with correlated variables to be pushed down

2018-10-13 Thread Aman Sinha (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16649073#comment-16649073
 ] 

Aman Sinha commented on CALCITE-2617:
-

[~julianhyde], [~zabetak] if the tests pass with these changes (a separate 
constructor that takes a predicate for checking correlation), I don't have an 
issue with it.  There are two somewhat competing requirements here.  Ideally, 
I believe the decorrelator needs to be run again after the Filter pushdown has 
occurred (via FilterProjectTransposeRule).   

> FilterProjectTransposeRule should allow filter conditions with correlated 
> variables to be pushed down
> -
>
> Key: CALCITE-2617
> URL: https://issues.apache.org/jira/browse/CALCITE-2617
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.17.0
> Reporter: Stamatis Zampetakis
> Assignee: Julian Hyde
> Priority: Major
> Fix For: 1.18.0
>
>
> The rule always forbids pushing down conditions that contain correlated 
> variables (as of [CALCITE-769|https://issues.apache.org/jira/browse/CALCITE-769], 
> to avoid certain problems in the decorrelation of the query). However, in the 
> general context of query optimization it is beneficial to push down filters, 
> and the presence of a correlated variable is not a reason to skip this 
> optimization. 
> To avoid regressions while still enabling correlated conditions to be pushed 
> down, we should make the pushing of correlated variables configurable. 





[jira] [Commented] (CALCITE-2191) Drop support for Guava versions earlier than 19

2018-02-24 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16375951#comment-16375951
 ] 

Aman Sinha commented on CALCITE-2191:
-

[~julianhyde] currently Drill uses Guava 18.  I think the change in Calcite's 
guava version should not directly impact Drill.  I will check and get back if 
that's not the case. 

> Drop support for Guava versions earlier than 19
> ---
>
> Key: CALCITE-2191
> URL: https://issues.apache.org/jira/browse/CALCITE-2191
> Project: Calcite
> Issue Type: Task
> Reporter: slim bouguerra
> Assignee: Julian Hyde
> Priority: Major
> Fix For: 1.16.0
>
>
> Currently, Calcite 1.15.0 supports Guava versions from 14 to 23.
> Calcite-1.16.0-Snapshot is building against version 19.0.1. 
> As far as I know, the only reason we support versions earlier than 19 is that 
> the Hive project depended on Guava 14.0.1. This is no longer true after 
> https://issues.apache.org/jira/browse/HIVE-15393.
> The Druid project is still using Guava 16.0.1, but [some 
> work|https://groups.google.com/forum/#!topic/druid-development/Dw2Qu1CWbuQ] 
> is under review to make sure it is not using deprecated APIs.   
> Thus I think it is time to drop support for versions earlier than 19.





[jira] [Commented] (CALCITE-2069) RexSimplify.removeNullabilityCast() always removes cast for operand with ANY type

2017-11-29 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271764#comment-16271764
 ] 

Aman Sinha commented on CALCITE-2069:
-

[~vvysotskyi] ok, makes sense.  It's good that you are adding a unit test for 
this.   Assuming you have run Drill regression tests with your change,  I am 
good with this.  +1.  

> RexSimplify.removeNullabilityCast() always removes cast for operand with ANY 
> type
> -
>
> Key: CALCITE-2069
> URL: https://issues.apache.org/jira/browse/CALCITE-2069
> Project: Calcite
> Issue Type: Bug
> Reporter: Volodymyr Vysotskyi
> Assignee: Julian Hyde
>
> When a field comes from a dynamic table, its type is left as {{ANY}}. If it is 
> then used in a filter condition with a cast that should produce a physical 
> cast (for example, casting varchar to boolean), 
> {{RexSimplify.removeNullabilityCast()}} removes the cast and leaves only the 
> field in the condition.
> This test helps to observe the issue:
> {code:java}
>   @Test public void testFilterCastAny() {
>     final RelBuilder builder = RelBuilder.create(config().build());
>     final RelDataType anyType =
>         builder.getTypeFactory().createSqlType(SqlTypeName.ANY);
>     RelNode root =
>         builder.scan("EMP")
>             .filter(
>                 builder.cast(
>                     builder.patternField("varchar_field", anyType, 0),
>                     SqlTypeName.BOOLEAN))
>             .build();
>     assertThat(str(root),
>         is("LogicalFilter(condition=[CAST(varchar_field.$0):BOOLEAN NOT NULL])\n"
>             + "  LogicalTableScan(table=[[scott, EMP]])\n"));
>   }
> {code}
> It happens because {{SqlTypeUtil.equalSansNullability()}} returns true if any 
> of its arguments has {{ANY}} type.
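
The mechanism described above can be modeled with a minimal, self-contained sketch. It does not use Calcite classes: `SimpleType`, `lenientEqualSansNullability` and `strictEqualSansNullability` are hypothetical names, the lenient check only mirrors the behavior stated in the description (ANY compares equal to everything), and the strict variant is one possible guard shown for contrast, not necessarily the fix that was applied.

```java
/** Hypothetical model of the type comparison; not Calcite's SqlTypeUtil. */
final class AnyCastSketch {

  enum TypeName { ANY, VARCHAR, BOOLEAN }

  /** A type reduced to its name and nullability. */
  static final class SimpleType {
    final TypeName name;
    final boolean nullable;

    SimpleType(TypeName name, boolean nullable) {
      this.name = name;
      this.nullable = nullable;
    }
  }

  /** Models the described behavior: ANY on either side counts as equal. */
  static boolean lenientEqualSansNullability(SimpleType a, SimpleType b) {
    return a.name == TypeName.ANY || b.name == TypeName.ANY || a.name == b.name;
  }

  /** Assumed alternative: only identical type names are equal, ignoring nullability. */
  static boolean strictEqualSansNullability(SimpleType a, SimpleType b) {
    return a.name == b.name;
  }

  /** True when CAST(operand AS target) would be treated as a nullability-only cast and dropped. */
  static boolean castDropped(SimpleType operand, SimpleType target, boolean strict) {
    return strict
        ? strictEqualSansNullability(operand, target)
        : lenientEqualSansNullability(operand, target);
  }

  public static void main(String[] args) {
    SimpleType anyField = new SimpleType(TypeName.ANY, true);
    SimpleType booleanNotNull = new SimpleType(TypeName.BOOLEAN, false);
    // Lenient check: the physical cast from ANY to BOOLEAN is lost.
    System.out.println(castDropped(anyField, booleanNotNull, false));
    // Strict check: the cast is kept.
    System.out.println(castDropped(anyField, booleanNotNull, true));
  }
}
```

With the lenient comparison the ANY-to-BOOLEAN cast is classified as a pure nullability cast, which is how the filter ends up with only the bare field.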





[jira] [Commented] (CALCITE-2069) RexSimplify.removeNullabilityCast() always removes cast for operand with ANY type

2017-11-29 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271723#comment-16271723
 ] 

Aman Sinha commented on CALCITE-2069:
-

[~vvysotskyi] I didn't fully understand the motivation in the JIRA description. 
 Suppose I have a table with 2 columns containing the strings 'true' and 
'false'.  These columns will show as ANY type in Drill.  If I run the following 
query, I still see the CAST function; it is not dropped.  
{noformat}
explain plan for select b from dfs.tmp.test2 where cast(b as boolean) is false;
...
 Filter(condition=[IS FALSE(CAST($0):BOOLEAN)]) : rowType = RecordType(ANY b):
...
{noformat}
(Note: I am working with an older Calcite version, so it is possible this 
behavior has changed.) 







[jira] [Commented] (CALCITE-1048) Make metadata more robust

2017-11-09 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16246869#comment-16246869
 ] 

Aman Sinha commented on CALCITE-1048:
-

The comments in RelMdMaxRowCount point to this JIRA: 
{noformat}
  public Double getMaxRowCount(RelSubset rel, RelMetadataQuery mq) {
    // FIXME This is a short-term fix for [CALCITE-1018]. A complete
    // solution will come with [CALCITE-1048].
    Util.discard(Bug.CALCITE_1048_FIXED);
    ...
  }
{noformat}

It seems the goal of this JIRA is much broader, and I am not sure of its status. 
For the RelSubset's max row count, should we consider adding specific 
implementations similar to what was done for CALCITE-1018 (for sort with limit)? 
For example, if the RelSubset contains an Aggregate with no group-by, it will 
have a max row count of 1.  
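
A rough sketch of that idea follows, using a hypothetical mini plan model rather than Calcite's RelNode or RelMetadataQuery classes. It assumes that every member of a subset is logically equivalent, so the smallest bound found among the members is a valid bound for the whole subset.

```java
/** Hypothetical mini plan model; these are not Calcite's RelNode classes. */
final class MaxRowCountSketch {

  interface Node {
    double maxRowCount();
  }

  /** A table scan with no known bound. */
  static final class Scan implements Node {
    public double maxRowCount() {
      return Double.POSITIVE_INFINITY;
    }
  }

  /** Sort with a limit: bounded by the fetch count (the CALCITE-1018 style case). */
  static final class SortLimit implements Node {
    final Node input;
    final long fetch;

    SortLimit(Node input, long fetch) {
      this.input = input;
      this.fetch = fetch;
    }

    public double maxRowCount() {
      return Math.min(input.maxRowCount(), fetch);
    }
  }

  /** Aggregate: with no group-by keys it returns exactly one row. */
  static final class Aggregate implements Node {
    final Node input;
    final int groupKeyCount;

    Aggregate(Node input, int groupKeyCount) {
      this.input = input;
      this.groupKeyCount = groupKeyCount;
    }

    public double maxRowCount() {
      return groupKeyCount == 0 ? 1d : input.maxRowCount();
    }
  }

  /** A set of equivalent alternatives: take the tightest bound among members. */
  static final class Subset implements Node {
    final Node[] members;

    Subset(Node... members) {
      this.members = members;
    }

    public double maxRowCount() {
      double best = Double.POSITIVE_INFINITY;
      for (Node member : members) {
        best = Math.min(best, member.maxRowCount());
      }
      return best;
    }
  }

  public static void main(String[] args) {
    Node subset = new Subset(new Aggregate(new Scan(), 0));
    System.out.println(subset.maxRowCount());
  }
}
```

Taking the minimum over members is a design assumption of this sketch; a real implementation would also have to decide how to treat members whose metadata is still being computed.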

> Make metadata more robust
> -
>
> Key: CALCITE-1048
> URL: https://issues.apache.org/jira/browse/CALCITE-1048
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
>
> Following CALCITE-794, make metadata more robust and performant, so we can 
> safely derive metadata from a large RelNode graph.





[jira] [Commented] (CALCITE-1503) Infinite loop occurs during query planning

2016-11-22 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15687620#comment-15687620
 ] 

Aman Sinha commented on CALCITE-1503:
-

[~migueltaoliveira] can you attach the jstack output ? It will help narrow down 
the issue. 

> Infinite loop occurs during query planning
> --
>
> Key: CALCITE-1503
> URL: https://issues.apache.org/jira/browse/CALCITE-1503
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.10.0
> Reporter: Miguel Oliveira
> Assignee: Julian Hyde
>
> The following query:
> {code}
> SELECT count(*) FROM (
>     SELECT count(v1.`region_id`) `Count Region`, v6.`fullname` `Customer (Name)`
>     FROM `foodmart`.`region` v1
>     JOIN `foodmart`.`store` v3 ON v1.`region_id` = v3.`region_id`
>     JOIN `foodmart`.`customer` v6 ON v1.`region_id` = v6.`customer_region_id`
>     JOIN `foodmart`.`sales_fact_1998` v15 ON v3.`store_id` = v15.`store_id`
>         AND v6.`customer_id` = v15.`customer_id`
>     WHERE v3.`store_name` LIKE '%Grocery%'
>     GROUP BY v6.`customer_region_id`, v6.`fullname`) a
> {code}
> causes an infinite loop during query planning.





[jira] [Commented] (CALCITE-872) Add support for aborting the query optimization process

2016-07-01 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359486#comment-15359486
 ] 

Aman Sinha commented on CALCITE-872:


[~julianhyde] the proposal for the cancelFlag sounds reasonable to me.   It 
wasn't obvious to me why CALCITE-1227 (streaming CSV reader) would be related,  
but I see that it adds functionality for cancel there.   I would have thought 
some additional change would be needed in the VolcanoPlanner and HepPlanner to 
check for the cancel flag to interrupt planning for the 3 things mentioned in 
this JIRA description.  Any thoughts ?
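
To make the suggestion concrete, here is a minimal sketch of a planner loop that polls a cancel flag between rule firings. It is plain Java with hypothetical names (`CancellablePlannerSketch`, `runRules`, `PlanningCancelledException`) and uses `java.util.concurrent.atomic.AtomicBoolean` as a stand-in for the proposed cancelFlag; it does not reflect how VolcanoPlanner or HepPlanner are actually structured.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical planner loop; not the VolcanoPlanner or HepPlanner implementation. */
final class CancellablePlannerSketch {

  /** Thrown when planning is aborted; records how many rules had fired. */
  static final class PlanningCancelledException extends RuntimeException {
    final int rulesFired;

    PlanningCancelledException(int rulesFired) {
      super("planning cancelled after " + rulesFired + " rule firings");
      this.rulesFired = rulesFired;
    }
  }

  /** Fires queued rules one by one, checking the flag before each firing. */
  static int runRules(Iterator<Runnable> ruleQueue, AtomicBoolean cancelFlag) {
    int fired = 0;
    while (ruleQueue.hasNext()) {
      if (cancelFlag.get()) {
        throw new PlanningCancelledException(fired);
      }
      ruleQueue.next().run();
      fired++;
    }
    return fired;
  }

  /** Simulates a rule cycle that is stopped once the flag is raised. */
  static int firingsBeforeCancel(final int cancelAfter) {
    final AtomicBoolean cancelFlag = new AtomicBoolean(false);
    final int[] count = {0};
    Iterator<Runnable> endless = new Iterator<Runnable>() {
      public boolean hasNext() {
        return true; // models a cycle of rule applications that never drains
      }

      public Runnable next() {
        return () -> {
          // Stand-in for a watchdog or user request raising the flag.
          if (++count[0] == cancelAfter) {
            cancelFlag.set(true);
          }
        };
      }
    };
    try {
      return runRules(endless, cancelFlag);
    } catch (PlanningCancelledException e) {
      return e.rulesFired;
    }
  }

  public static void main(String[] args) {
    System.out.println(firingsBeforeCancel(3));
  }
}
```

In this sketch the flag is only checked between rule firings, so a single long-running rule or metadata call would still need its own check to be interruptible.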

> Add support for aborting the query optimization process
> ---
>
> Key: CALCITE-872
> URL: https://issues.apache.org/jira/browse/CALCITE-872
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.4.0-incubating
> Reporter: Aman Sinha
> Assignee: Julian Hyde
>
> We should have the facility to abort the query optimization process.  There 
> are several motivations for having this: 
> 1. The optimizer's join planning may take too long (on the order of minutes) 
> when working with a larger number of tables. 
> 2. Certain sequences of rule applications may cause a cycle. 
> 3. Operations related to metadata could potentially introduce a dependency 
> cycle. 





[jira] [Commented] (CALCITE-1288) Avoid doing the same join twice if count(distinct) exists

2016-06-10 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15325730#comment-15325730
 ] 

Aman Sinha commented on CALCITE-1288:
-

[~julianhyde],  for systems that don't support Grouping Sets (which is the 
enhancement you implemented in CALCITE-732), it would be useful to have this 
rewrite.  Drill currently does not have GS but I would imagine some other 
systems may also benefit, even though this rewrite is specific to a single 
agg(distinct) combined with other non-distinct aggregates.  What do you think ? 

> Avoid doing the same join twice if count(distinct) exists
> -
>
> Key: CALCITE-1288
> URL: https://issues.apache.org/jira/browse/CALCITE-1288
> Project: Calcite
> Issue Type: Improvement
> Reporter: Gautam Kumar Parai
> Assignee: Gautam Kumar Parai
>
> When the query has one distinct aggregate and one or more non-distinct 
> aggregates, we need not produce the join-based plan. We can generate 
> multi-phase aggregates instead.
> {code}
> select emp.empno, count(*), avg(distinct dept.deptno) 
> from sales.emp emp inner join sales.dept dept 
> on emp.deptno = dept.deptno 
> group by emp.empno
> LogicalProject(EMPNO=[$0], EXPR$1=[$1], EXPR$2=[$3])
>   LogicalJoin(condition=[IS NOT DISTINCT FROM($0, $2)], joinType=[inner])
>     LogicalAggregate(group=[{0}], EXPR$1=[COUNT()])
>       LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
>         LogicalJoin(condition=[=($7, $9)], joinType=[inner])
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
>     LogicalAggregate(group=[{0}], EXPR$2=[AVG($1)])
>       LogicalAggregate(group=[{0, 1}])
>         LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
>           LogicalJoin(condition=[=($7, $9)], joinType=[inner])
>             LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>             LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> The more efficient form should look like 
> {code}
> select emp.empno, count(*), avg(distinct dept.deptno) 
> from sales.emp emp inner join sales.dept dept 
> on emp.deptno = dept.deptno 
> group by emp.empno
> LogicalAggregate(group=[{0}], EXPR$1=[SUM($2)], EXPR$2=[AVG($1)])
>   LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT()])
>     LogicalProject(EMPNO=[$0], DEPTNO0=[$9])
>       LogicalJoin(condition=[=($7, $9)], joinType=[inner])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>         LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
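
The equivalence behind the multi-phase form can be checked on a toy data set. The sketch below is plain Java over in-memory rows of (empno, deptno); the class and method names are hypothetical and nothing here is Calcite API. The first method computes COUNT(*) and AVG(DISTINCT deptno) per empno directly; the second first groups by (empno, deptno) with a count, then sums the counts and averages the deptno values of the inner groups.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical in-memory model of the two plans; not Calcite code. */
final class MultiPhaseAggSketch {

  /** Direct evaluation: per empno, [COUNT(*), AVG(DISTINCT deptno)]. */
  static Map<Integer, List<Double>> direct(int[][] rows) {
    Map<Integer, Integer> counts = new TreeMap<>();
    Map<Integer, Set<Integer>> distinct = new TreeMap<>();
    for (int[] r : rows) {
      counts.merge(r[0], 1, Integer::sum);
      distinct.computeIfAbsent(r[0], k -> new TreeSet<>()).add(r[1]);
    }
    Map<Integer, List<Double>> out = new TreeMap<>();
    for (Integer empno : counts.keySet()) {
      double sum = 0;
      for (int deptno : distinct.get(empno)) {
        sum += deptno;
      }
      out.put(empno, Arrays.asList(
          counts.get(empno).doubleValue(), sum / distinct.get(empno).size()));
    }
    return out;
  }

  /** Two-phase evaluation mirroring the nested aggregates in the second plan. */
  static Map<Integer, List<Double>> twoPhase(int[][] rows) {
    // Phase 1: GROUP BY (empno, deptno) with COUNT().
    Map<List<Integer>, Integer> phase1 = new LinkedHashMap<>();
    for (int[] r : rows) {
      phase1.merge(Arrays.asList(r[0], r[1]), 1, Integer::sum);
    }
    // Phase 2: GROUP BY empno with SUM(count) and AVG(deptno).
    Map<Integer, double[]> acc = new TreeMap<>(); // {sum of counts, sum of deptno, groups}
    for (Map.Entry<List<Integer>, Integer> e : phase1.entrySet()) {
      double[] a = acc.computeIfAbsent(e.getKey().get(0), k -> new double[3]);
      a[0] += e.getValue();
      a[1] += e.getKey().get(1);
      a[2] += 1;
    }
    Map<Integer, List<Double>> out = new TreeMap<>();
    for (Map.Entry<Integer, double[]> e : acc.entrySet()) {
      double[] a = e.getValue();
      out.put(e.getKey(), Arrays.asList(a[0], a[1] / a[2]));
    }
    return out;
  }

  public static void main(String[] args) {
    int[][] rows = {{1, 10}, {1, 10}, {1, 20}, {2, 30}};
    System.out.println(direct(rows));
    System.out.println(twoPhase(rows));
  }
}
```

The inner grouping removes duplicates of the distinct column, so the outer AVG sees each value once, while the outer SUM of the inner counts recovers COUNT(*). This only works as written for a single distinct aggregate, which matches the restriction stated in the description.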





[jira] [Commented] (CALCITE-777) IS NOT NULL filter is incorrectly dropped for aggregates and window functions

2016-01-24 Thread Aman Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114823#comment-15114823
 ] 

Aman Sinha commented on CALCITE-777:


I'll assign to myself for further investigation. 

> IS NOT NULL filter is incorrectly dropped for aggregates and window functions
> -
>
> Key: CALCITE-777
> URL: https://issues.apache.org/jira/browse/CALCITE-777
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.3.0-incubating
> Reporter: Aman Sinha
> Assignee: Julian Hyde
>
> The plans below show that the IS NOT NULL filter is incorrectly dropped.  
> {code}
> select wsum from (select sum(sal) over (partition by deptno) as wsum from 
> emp) where wsum is not null;
> LogicalProject(WSUM=[$0])
>   LogicalProject(WSUM=[SUM($5) OVER (PARTITION BY $7 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)])
>     LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
> {code}
> select wsum from (select sum(sal) as wsum from emp group by deptno) where 
> wsum is not null;
> LogicalProject(WSUM=[$0])
>   LogicalProject(WSUM=[$1])
>     LogicalAggregate(group=[{0}], WSUM=[SUM($1)])
>       LogicalProject(DEPTNO=[$7], SAL=[$5])
>         LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> {code}
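
One way to see why the filter cannot be dropped in general: under standard SQL semantics, SUM ignores NULL inputs and returns NULL when a group has no non-NULL values. The sketch below models only that rule in plain Java; `sqlSum` is a hypothetical helper, and whether `sal` can actually be NULL in a given schema is an assumption that the plans above do not settle.

```java
/** Hypothetical model of SQL SUM semantics; not Calcite code. */
final class SumNullabilitySketch {

  /** SUM skips nulls and yields null when no non-null value is present. */
  static Integer sqlSum(Integer... values) {
    Integer total = null;
    for (Integer v : values) {
      if (v != null) {
        total = (total == null) ? v : total + v;
      }
    }
    return total;
  }

  public static void main(String[] args) {
    // A department whose salaries are all NULL produces a NULL sum,
    // which "wsum is not null" is expected to filter out.
    System.out.println(sqlSum(null, null));
    System.out.println(sqlSum(100, null, 50));
  }
}
```

If the summed column is nullable, the result of SUM is nullable too, so a planner may only remove the IS NOT NULL predicate when it can prove the input is NOT NULL.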





[jira] [Assigned] (CALCITE-777) IS NOT NULL filter is incorrectly dropped for aggregates and window functions

2016-01-24 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/CALCITE-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha reassigned CALCITE-777:
--

Assignee: Aman Sinha  (was: Julian Hyde)



