[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Status: Patch Available  (was: Open)

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 
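
The gap described in this issue can be illustrated with a small sketch. It is not Hive's code: the `ColStats` type, the function names, and the fallback rule (assume every row is distinct) are hypothetical, chosen only to contrast all-or-nothing estimation with per-column estimation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ColStats:
    ndv: int          # number of distinct values
    estimated: bool   # True when the value is a guess, not read from the metastore


def estimate_missing(num_rows: int) -> ColStats:
    # Hypothetical fallback: with no information, assume every row is distinct.
    return ColStats(ndv=max(1, num_rows), estimated=True)


def all_or_nothing(stats: Dict[str, Optional[ColStats]], num_rows: int):
    # Behavior described in the issue: estimate only when every column lacks stats.
    if all(s is None for s in stats.values()):
        return {col: estimate_missing(num_rows) for col in stats}
    return dict(stats)  # partially missing stats stay missing


def per_column(stats: Dict[str, Optional[ColStats]], num_rows: int):
    # Direction the issue asks for: fill in each missing column independently.
    return {col: s if s is not None else estimate_missing(num_rows)
            for col, s in stats.items()}
```

With stats present for column `a` but missing for `b`, `all_or_nothing` leaves `b` without statistics, while `per_column` fills it in and keeps the real statistics for `a`.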



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17493:
---
Attachment: HIVE-17493.4.patch

Uploading rebased patch to trigger qtests.

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, 
> HIVE-17493.3.patch, HIVE-17493.4.patch
>
>
> Cardinality estimation of a join, after a PK-FK relation has been ascertained, 
> could be improved if the parent of the join operator is a LEFT outer or RIGHT 
> outer join.
> Currently the estimate is computed by estimating the reduction in rows on the PK 
> side, then multiplying the FK-side row count by that reduction. This estimation 
> of the reduction currently doesn't distinguish between INNER and OUTER joins. 
> This could be improved to handle outer joins better.
> TPC-DS query45 is impacted by this.
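
The estimate described in this issue can be sketched as follows. This is an illustration, not Hive's implementation: the function name and parameters are hypothetical, and the outer-join branch shows only one way the distinction could matter, assuming the outer join preserves every FK-side row.

```python
def pkfk_join_rows(fk_rows: int, pk_rows_before: int, pk_rows_after: int,
                   fk_side_preserved: bool = False) -> int:
    """Estimate the output rows of a PK-FK join.

    pk_rows_before / pk_rows_after: PK-side row counts before and after its filters.
    fk_side_preserved: True for an outer join that keeps every FK-side row
    (an assumption about which side the outer join preserves).
    """
    if pk_rows_before <= 0:
        return 0
    # Fraction of PK rows that survive the PK-side filters.
    reduction = min(1.0, pk_rows_after / pk_rows_before)
    inner_estimate = round(fk_rows * reduction)
    if fk_side_preserved:
        # Unmatched FK rows are still emitted (padded with NULLs), so the
        # PK-side reduction cannot shrink the output below fk_rows.
        return max(inner_estimate, fk_rows)
    return inner_estimate
```

For example, with 1000 FK rows and a PK side filtered from 100 rows down to 10, the inner-join estimate is 100 rows, while the sketch keeps all 1000 rows when the FK side is preserved by an outer join.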





[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17493:
---
Status: Patch Available  (was: Open)

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, 
> HIVE-17493.3.patch, HIVE-17493.4.patch
>
>
> Cardinality estimation of a join, after a PK-FK relation has been ascertained, 
> could be improved if the parent of the join operator is a LEFT outer or RIGHT 
> outer join.
> Currently the estimate is computed by estimating the reduction in rows on the PK 
> side, then multiplying the FK-side row count by that reduction. This estimation 
> of the reduction currently doesn't distinguish between INNER and OUTER joins. 
> This could be improved to handle outer joins better.
> TPC-DS query45 is impacted by this.





[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17493:
---
Status: Open  (was: Patch Available)

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, 
> HIVE-17493.3.patch
>
>
> Cardinality estimation of a join, after a PK-FK relation has been ascertained, 
> could be improved if the parent of the join operator is a LEFT outer or RIGHT 
> outer join.
> Currently the estimate is computed by estimating the reduction in rows on the PK 
> side, then multiplying the FK-side row count by that reduction. This estimation 
> of the reduction currently doesn't distinguish between INNER and OUTER joins. 
> This could be improved to handle outer joins better.
> TPC-DS query45 is impacted by this.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Open  (was: Patch Available)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be good for this method to return e.g. -1 in the absence of 
> statistics instead of assuming zero.
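
The proposal in this issue can be sketched in a few lines. This is a standalone illustration, not the body of `StatsUtil::getBasicStatForTable`: the function, the `STATS_MISSING` constant, and the use of a plain string-to-string parameter map are assumptions made for the example.

```python
from typing import Mapping

STATS_MISSING = -1  # sentinel suggested in the issue for "no statistics recorded"


def basic_stat(table_params: Mapping[str, str], key: str) -> int:
    """Return the stat stored under `key`, or STATS_MISSING if absent or unusable."""
    raw = table_params.get(key)
    if raw is None:
        return STATS_MISSING          # statistics were never recorded
    try:
        value = int(raw)
    except ValueError:
        return STATS_MISSING          # unparsable value is treated as missing
    return value if value >= 0 else STATS_MISSING
```

A caller can then tell an empty table (`basic_stat({"numRows": "0"}, "numRows")` returns 0) from a table without statistics (`basic_stat({}, "numRows")` returns -1), and only fall back to estimation in the second case.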





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Attachment: HIVE-17465.6.patch

For some reason ptests are not being scheduled for this issue, so rebasing and 
re-uploading new patch. Hopefully ptests will be triggered now.

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, HIVE-17465.6.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: (d_year = 2001) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: (d_year = 2001) (type: boolean)
>         Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
> {code}
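
The expectation behind this report can be worked through numerically. The sketch below is not Hive's estimator; it applies the textbook rule that an equality predicate keeps 1/NDV of the rows and that ANDed predicates on independent columns multiply. The row count comes from the plans in this issue; the NDVs are assumptions (201 for `d_year`, inferred from 73049 / 363; 12 for `d_moy`; 31 for `d_dom`).

```python
from typing import Sequence


def filtered_rows(num_rows: int, ndvs: Sequence[int]) -> int:
    """Row estimate after ANDed equality predicates on independent, uniform columns."""
    selectivity = 1.0
    for ndv in ndvs:
        selectivity *= 1.0 / ndv      # each equality predicate keeps 1/NDV of the rows
    return max(1, int(num_rows * selectivity))


NUM_ROWS = 73049                            # TableScan row count from the plans
NDV_YEAR, NDV_MOY, NDV_DOM = 201, 12, 31    # assumed NDVs, not taken from the metastore

for ndvs in ([NDV_YEAR], [NDV_YEAR, NDV_MOY], [NDV_YEAR, NDV_MOY, NDV_DOM]):
    print(filtered_rows(NUM_ROWS, ndvs))
```

Under these assumptions the three estimates shrink from 363 to 30 to 1, whereas the plans in this issue report 363 rows in all three cases.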





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Patch Available  (was: Open)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, HIVE-17465.6.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: (d_year = 2001) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: (d_year = 2001) (type: boolean)
>         Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Patch Available  (was: Open)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Attachment: HIVE-17535.3.patch

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Open  (was: Patch Available)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Fix Version/s: 3.0.0

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: (d_year = 2001) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: (d_year = 2001) (type: boolean)
>         Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
> {code}





[jira] [Commented] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170749#comment-16170749
 ] 

Vineet Garg commented on HIVE-17535:


The latest patch (3) has a known failure, {{min_structvalue}}, which is a bug 
exposed by the patch. Queries such as {code:sql} select max(a), min(a) FROM (select 
named_struct("field",1) as a union all select named_struct("field",2) as a 
union all select named_struct("field",cast(null as int)) as a) tmp{code} fail 
with CBO because CBO ends up losing the {{CAST}} operation, turning 
{{named_struct("field",cast(null as int))}} into just 
{{named_struct("field",null)}}. This results in different schema structures 
between the union branches, which is semantically incorrect. 
This can be reproduced with a simple {code:sql}select 
named_struct("field",cast(null as int)) as a{code}. If we dump the new AST after 
CBO, we will notice the missing CAST operation.
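
The schema mismatch described in this comment can be mimicked with a toy type check. This is only a sketch: the helper names are hypothetical, an untyped NULL is assumed to be typed as `void`, and branch types are compared by strict string equality, which is simpler than Hive's real type resolution.

```python
from typing import List, Sequence, Tuple


def struct_type(fields: Sequence[Tuple[str, str]]) -> str:
    """Render a struct type string such as struct<field:int>."""
    return "struct<" + ",".join(f"{name}:{typ}" for name, typ in fields) + ">"


def union_schema(branch_types: List[str]) -> str:
    """Return the common type of all UNION ALL branches, or raise if they differ."""
    if len(set(branch_types)) != 1:
        raise ValueError(f"union branches disagree: {sorted(set(branch_types))}")
    return branch_types[0]


# With the CAST intact, every branch has the same struct type.
typed = [struct_type([("field", "int")])] * 3
print(union_schema(typed))

# If the CAST is dropped, the NULL branch is typed as void (assumed here),
# so the branches no longer agree and the check fails.
untyped = typed[:2] + [struct_type([("field", "void")])]
try:
    union_schema(untyped)
except ValueError as err:
    print(err)
```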

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Commented] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170764#comment-16170764
 ] 

Vineet Garg commented on HIVE-17535:


Good to know. I'll disable the test for now and will update HIVE-16511.

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Attachment: HIVE-17535.3.patch

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Attachment: (was: HIVE-17535.3.patch)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Commented] (HIVE-16511) CBO looses inner casts on constants of complex type

2017-09-18 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170770#comment-16170770
 ] 

Vineet Garg commented on HIVE-16511:


Test {{min_structvalue.q}} needs to be enabled once this issue is fixed.

> CBO looses inner casts on constants of complex type
> ---
>
> Key: HIVE-16511
> URL: https://issues.apache.org/jira/browse/HIVE-16511
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Reporter: Ashutosh Chauhan
>
> type for map <10, cast(null as int)> becomes map 





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Open  (was: Patch Available)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Patch Available  (was: Open)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-19 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Open  (was: Patch Available)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-19 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Attachment: HIVE-17535.5.patch

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Commented] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-19 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172535#comment-16172535
 ] 

Vineet Garg commented on HIVE-17538:


Unfortunately, the test report is no longer available, so I will have to re-run 
the tests to see whether the failures are related to the patch.

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-19 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Patch Available  (was: Open)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be good for this method to return e.g. -1 in the absence of 
> statistics instead of assuming zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-19 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Attachment: HIVE-17536.4.patch

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be good for this method to return e.g. -1 in the absence of 
> statistics instead of assuming zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-19 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Open  (was: Patch Available)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be good for this method to return e.g. -1 in the absence of 
> statistics instead of assuming zero.





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-19 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Patch Available  (was: Open)

Latest patch(5) should fix test failures

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Attachment: HIVE-17536.6.patch

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be good for this method to return e.g. -1 in the absence of 
> statistics instead of assuming zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Open  (was: Patch Available)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be good for this method to return e.g. -1 in the absence of 
> statistics instead of assuming zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Patch Available  (was: Open)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be good for this method to return e.g. -1 in the absence of 
> statistics instead of assuming zero.





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-23 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Status: Open  (was: Patch Available)

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, 
> HIVE-17538.3.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-23 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Attachment: HIVE-17538.3.patch

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, 
> HIVE-17538.3.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Commented] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-23 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177976#comment-16177976
 ] 

Vineet Garg commented on HIVE-17538:


Review board link attached

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, 
> HIVE-17538.3.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-23 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Status: Patch Available  (was: Open)

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, 
> HIVE-17538.3.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-20 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Attachment: HIVE-17536.5.patch

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats, e.g. the number of rows, are zero
> It would be better for this method to return e.g. -1 in the absence of statistics
> instead of assuming the value to be zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-20 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Open  (was: Patch Available)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats, e.g. the number of rows, are zero
> It would be better for this method to return e.g. -1 in the absence of statistics
> instead of assuming the value to be zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-20 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Patch Available  (was: Open)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats, e.g. the number of rows, are zero
> It would be better for this method to return e.g. -1 in the absence of statistics
> instead of assuming the value to be zero.





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-20 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks for reviewing [~ashutoshc]

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}},
> such queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17308) Improvement in join cardinality estimation

2017-09-20 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17308:
---
Fix Version/s: 3.0.0

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch, 
> HIVE-17308.6.patch, HIVE-17308.7.patch, HIVE-17308.8.patch
>
>
> Currently, during logical planning, join cardinality is estimated assuming no
> correlation among join keys (this estimation is done using exponential
> backoff). Physical planning, on the other hand, considers correlation for
> multiple keys and uses a different estimation. We should consider correlation
> during logical planning as well.
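
For readers unfamiliar with the term, here is a minimal sketch of exponential backoff as it is commonly described for combining per-key selectivities. It is not taken from the Hive source; the class and method names are hypothetical. The most selective key is applied in full and each further key is damped by a square root, fourth root, and so on. With selectivities 0.01 and 0.25, a plain product gives 0.0025, while backoff gives 0.01 * sqrt(0.25) = 0.005.

```java
import java.util.Arrays;

// Hypothetical illustration; not Hive's implementation.
final class ExponentialBackoffSketch {
    // Combines per-key selectivities as s0 * s1^(1/2) * s2^(1/4) * ...
    // after sorting so that the most selective key (smallest value) comes first.
    static double combine(double... selectivities) {
        double[] sorted = selectivities.clone();
        Arrays.sort(sorted); // ascending: most selective first
        double combined = 1.0;
        double exponent = 1.0;
        for (double s : sorted) {
            combined *= Math.pow(s, exponent);
            exponent /= 2.0; // each further key contributes less
        }
        return combined;
    }
}
```

The argument order does not matter because the input is sorted before the exponents are applied.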





[jira] [Commented] (HIVE-17308) Improvement in join cardinality estimation

2017-09-20 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174206#comment-16174206
 ] 

Vineet Garg commented on HIVE-17308:


[~leftylev] Done.

> Improvement in join cardinality estimation
> --
>
> Key: HIVE-17308
> URL: https://issues.apache.org/jira/browse/HIVE-17308
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, 
> HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch, 
> HIVE-17308.6.patch, HIVE-17308.7.patch, HIVE-17308.8.patch
>
>
> Currently, during logical planning, join cardinality is estimated assuming no
> correlation among join keys (this estimation is done using exponential
> backoff). Physical planning, on the other hand, considers correlation for
> multiple keys and uses a different estimation. We should consider correlation
> during logical planning as well.





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-20 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Fix Version/s: 3.0.0

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}},
> such queries with SET operators fail (intersect requires CBO).





[jira] [Commented] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-13 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165300#comment-16165300
 ] 

Vineet Garg commented on HIVE-17465:


[~ashutoshc] I am investigating a few suspicious test failures. I'll create an RB
as soon as I am done with the investigation.

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}
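
To illustrate what a progressive reduction could look like, here is a minimal sketch that treats the ANDed equality predicates as independent and divides the row count by each column's number of distinct values (NDV). The NDVs used in the usage note (201 for d_year, 12 for d_moy, 31 for d_dom) are assumptions, with 201 chosen only because it reproduces the 363-row estimate shown in the plans. The class and method are hypothetical and are not the fix in the attached patches.

```java
// Hypothetical illustration; not Hive's implementation.
final class ConjunctionEstimateSketch {
    // Estimated rows after ANDed equality predicates, assuming the columns
    // are independent and values are uniformly distributed.
    static long estimateRows(long tableRows, long... ndvs) {
        double rows = tableRows;
        for (long ndv : ndvs) {
            rows /= Math.max(1L, ndv); // guard against a zero or negative NDV
        }
        return Math.max(1L, Math.round(rows)); // never estimate below one row
    }
}
```

Under these assumptions, estimateRows(73049, 201) gives 363, estimateRows(73049, 201, 12) gives 30, and estimateRows(73049, 201, 12, 31) gives 1, so each added filter lowers the estimate.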





[jira] [Commented] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-13 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165298#comment-16165298
 ] 

Vineet Garg commented on HIVE-17493:


Yeah looks like it. I'll rebase and re-upload the patch.

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch
>
>
> Cardinality estimation of a join, after the PK-FK relation has been ascertained,
> could be improved if the parent of the join operator is a LEFT outer or RIGHT
> outer join.
> Currently the estimate is computed by estimating the reduction of rows on the PK
> side, then multiplying the FK-side row count by that reduction. This estimation
> of the reduction currently doesn't distinguish between INNER and OUTER joins.
> This could be improved to handle outer joins better.
> TPC-DS query45 is impacted by this.
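
The estimation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the logic of the attached patches: the class, method, and parameter names are hypothetical, and the outer-join branch only shows why the distinction matters. When the FK side is the preserved side of an outer join and each FK row matches at most one PK row, filtering the PK side cannot remove FK rows from the output.

```java
// Hypothetical illustration; not Hive's implementation.
final class PkFkJoinSketch {
    // fkRows:            row count on the FK side
    // pkRowsTotal:       PK-side row count before filtering
    // pkRowsAfterFilter: PK-side row count after filtering
    // fkSidePreserved:   true when the FK side is preserved by an outer join
    static long estimateJoinRows(long fkRows, long pkRowsTotal,
                                 long pkRowsAfterFilter, boolean fkSidePreserved) {
        if (fkSidePreserved || pkRowsTotal <= 0) {
            // Every FK row survives; unmatched rows are padded with NULLs.
            return fkRows;
        }
        // Inner-join style estimate: scale the FK side by the PK-side reduction.
        double reduction = (double) pkRowsAfterFilter / pkRowsTotal;
        return Math.round(fkRows * reduction);
    }
}
```

For example, with 1000 FK rows and a PK side filtered from 100 rows down to 25, the inner-join estimate is 250, while the FK-preserving outer join keeps 1000.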



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Attachment: HIVE-17536.2.patch

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats, e.g. the number of rows, are zero
> It would be better for this method to return e.g. -1 in the absence of statistics
> instead of assuming the value to be zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Open  (was: Patch Available)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats, e.g. the number of rows, are zero
> It would be better for this method to return e.g. -1 in the absence of statistics
> instead of assuming the value to be zero.





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Patch Available  (was: Open)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168219#comment-16168219
 ] 

Vineet Garg commented on HIVE-17536:


[~ashutoshc] If the stat key is missing, the following code throws
NumberFormatException: {code}Long.parseLong(params.get(statType)){code} because
{code}params.get(statType){code} returns null for a missing stat key. So this
case is already accounted for.
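
A minimal, self-contained sketch of the behavior described above. It relies only on the documented fact that Long.parseLong(null) throws NumberFormatException. The class name, method name, and the -1 sentinel handling are hypothetical and are not copied from the patch; the key names in the usage note are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the code from the attached patches.
final class BasicStatSketch {
    // Sentinel meaning "statistic not recorded", as suggested in the issue.
    static final long MISSING = -1L;

    // Returns the stored value, or MISSING when the key is absent or unparsable.
    static long getBasicStat(Map<String, String> params, String statType) {
        try {
            // A missing key makes params.get(statType) return null, and
            // Long.parseLong(null) throws NumberFormatException.
            return Long.parseLong(params.get(statType));
        } catch (NumberFormatException e) {
            return MISSING;
        }
    }
}
```

With a map containing only "numRows" -> "0", getBasicStat(params, "numRows") returns 0 and getBasicStat(params, "rawDataSize") returns -1, so a recorded zero and a missing statistic are distinguishable.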

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats, e.g. the number of rows, are zero
> It would be better for this method to return e.g. -1 in the absence of statistics
> instead of assuming the value to be zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Patch Available  (was: Open)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats, e.g. the number of rows, are zero
> It would be better for this method to return e.g. -1 in the absence of statistics
> instead of assuming the value to be zero.





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Open  (was: Patch Available)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}},
> such queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Patch Available  (was: Open)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}},
> such queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-16 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Attachment: HIVE-17535.2.patch

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}},
> such queries with SET operators fail (intersect requires CBO).





[jira] [Commented] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169469#comment-16169469
 ] 

Vineet Garg commented on HIVE-17465:


[~ashutoshc] Can you take a look at the updated review?

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Patch Available  (was: Open)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Open  (was: Patch Available)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Attachment: HIVE-17465.7.patch

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}
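
For illustration, the behaviour the report expects (each extra equality predicate shrinking the estimate) can be sketched as below. This is a minimal sketch assuming independent predicates and a 1/NDV selectivity per `col = const` filter; it is not the code in the attached patches. The function name and the NDV values are hypothetical; the `d_year` NDV of 201 is made up so that the first estimate lands on the 363 shown in the plans.

```python
def estimate_filter_rows(table_rows, ndv_by_column, equality_columns):
    """Estimate rows surviving a conjunction of `col = const` predicates.

    Assumes the predicates are independent, so each one keeps
    roughly 1/NDV(col) of the rows that reach it.
    """
    if table_rows <= 0:
        return 0
    rows = float(table_rows)
    for col in equality_columns:
        rows /= max(1, ndv_by_column[col])
    # Never estimate fewer than one row for a non-empty input.
    return max(1, round(rows))


# Hypothetical NDVs (number of distinct values); not taken from the issue.
NDV = {"d_year": 201, "d_moy": 12, "d_dom": 31}

year_only = estimate_filter_rows(73049, NDV, ["d_year"])
year_month = estimate_filter_rows(73049, NDV, ["d_year", "d_moy"])
year_month_day = estimate_filter_rows(73049, NDV, ["d_year", "d_moy", "d_dom"])
```

Under these assumptions the three estimates decrease with each added predicate instead of staying at the same value.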





[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-17 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17493:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, 
> HIVE-17493.3.patch, HIVE-17493.4.patch
>
>
> Cardinality estimation of a join, once a PK-FK relationship has been 
> ascertained, could be improved when the parent of the join operator is a LEFT 
> OUTER or RIGHT OUTER join.
> Currently the estimate is computed by estimating the row reduction on the PK 
> side and then multiplying the FK-side row count by that reduction. This 
> reduction estimate does not distinguish between INNER and OUTER joins and 
> could be improved to handle outer joins better.
> TPC-DS query45 is impacted by this.
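
A rough sketch of the distinction described above, under stated assumptions: the PK column is unique, every FK row has a matching PK row before filtering, and the PK-side filter is applied before the join. The function and parameter names are hypothetical and this is not the logic of the committed patch.

```python
def pkfk_join_rows(fk_rows, pk_rows_total, pk_rows_after_filter,
                   fk_side_preserved=False):
    """Estimate the output row count of a PK-FK join.

    fk_side_preserved is True when the FK side is the row-preserving
    side of an outer join (for example `fk LEFT OUTER JOIN pk`).
    """
    if fk_rows <= 0:
        return 0
    if fk_side_preserved:
        # Every FK row is emitted, and PK uniqueness means it matches
        # at most one PK row, so the PK-side reduction does not apply.
        return fk_rows
    if pk_rows_total <= 0:
        return 0
    # Inner join: scale the FK row count by the PK-side reduction.
    reduction = min(1.0, pk_rows_after_filter / pk_rows_total)
    return max(1, round(fk_rows * reduction))


inner_estimate = pkfk_join_rows(1_000_000, 73049, 363)
outer_estimate = pkfk_join_rows(1_000_000, 73049, 363, fk_side_preserved=True)
```

With the same inputs, the inner-join estimate shrinks with the PK-side filter while the outer-join estimate stays at the FK row count.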





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Patch Available  (was: Open)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Open  (was: Patch Available)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Attachment: HIVE-17465.4.patch

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Commented] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-13 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165381#comment-16165381
 ] 

Vineet Garg commented on HIVE-17465:


[~ashutoshc] New patch is uploaded and review board link is linked to the JIRA.

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17493:
---
Status: Patch Available  (was: Open)

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, 
> HIVE-17493.3.patch
>
>
> Cardinality estimation of a join, once a PK-FK relationship has been 
> ascertained, could be improved when the parent of the join operator is a LEFT 
> OUTER or RIGHT OUTER join.
> Currently the estimate is computed by estimating the row reduction on the PK 
> side and then multiplying the FK-side row count by that reduction. This 
> reduction estimate does not distinguish between INNER and OUTER joins and 
> could be improved to handle outer joins better.
> TPC-DS query45 is impacted by this.





[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17493:
---
Status: Open  (was: Patch Available)

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, 
> HIVE-17493.3.patch
>
>
> Cardinality estimation of a join, once a PK-FK relationship has been 
> ascertained, could be improved when the parent of the join operator is a LEFT 
> OUTER or RIGHT OUTER join.
> Currently the estimate is computed by estimating the row reduction on the PK 
> side and then multiplying the FK-side row count by that reduction. This 
> reduction estimate does not distinguish between INNER and OUTER joins and 
> could be improved to handle outer joins better.
> TPC-DS query45 is impacted by this.





[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17493:
---
Attachment: HIVE-17493.3.patch

> Improve PKFK cardinality estimation in Physical planning
> 
>
> Key: HIVE-17493
> URL: https://issues.apache.org/jira/browse/HIVE-17493
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, 
> HIVE-17493.3.patch
>
>
> Cardinality estimation of a join, once a PK-FK relationship has been 
> ascertained, could be improved when the parent of the join operator is a LEFT 
> OUTER or RIGHT OUTER join.
> Currently the estimate is computed by estimating the row reduction on the PK 
> side and then multiplying the FK-side row count by that reduction. This 
> reduction estimate does not distinguish between INNER and OUTER joins and 
> could be improved to handle outer joins better.
> TPC-DS query45 is impacted by this.





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Open  (was: Patch Available)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Status: Patch Available  (was: Open)

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-13 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Attachment: HIVE-17465.5.patch

Latest patch(5) addresses review comments.

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Attachment: HIVE-17535.1.patch

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Patch Available  (was: Open)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch
>
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Assigned] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17535:
--


> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> Since Hive CBO isn't able to handle queries with no table, e.g. {{select 1}}, 
> queries with SET operators fail (intersect requires CBO).





[jira] [Assigned] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17536:
--


> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be better for this method to return e.g. -1 when statistics are 
> absent instead of assuming zero.
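
The distinction asked for above can be sketched as follows. This is an illustrative helper, not Hive's `getBasicStatForTable`; the function name, the `STATS_ABSENT` constant, and the use of a plain dict of string parameters are assumptions made for the example.

```python
STATS_ABSENT = -1  # sentinel: the statistic was never recorded


def get_basic_stat(table_params, stat_name):
    """Return the stat as an int, or STATS_ABSENT if it is missing.

    A stored value of "0" is a real statistic (an empty table) and is
    returned as 0, so callers can tell it apart from "unknown".
    """
    raw = table_params.get(stat_name)
    if raw is None:
        return STATS_ABSENT
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Unparseable values are treated as missing, not as zero.
        return STATS_ABSENT
    return value if value >= 0 else STATS_ABSENT


missing = get_basic_stat({}, "numRows")
empty_table = get_basic_stat({"numRows": "0"}, "numRows")
populated = get_basic_stat({"numRows": "73049"}, "numRows")
```

A caller can then fall back to estimation only when the result is `STATS_ABSENT`, and trust a genuine zero.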





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Status: Patch Available  (was: Open)

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be better for this method to return e.g. -1 when statistics are 
> absent instead of assuming zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Component/s: Statistics

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be better for this method to return e.g. -1 when statistics are 
> absent instead of assuming zero.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Attachment: HIVE-17536.1.patch

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * The actual stats, e.g. number of rows, are zero
> It would be better for this method to return e.g. -1 when statistics are 
> absent instead of assuming zero.





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Attachment: HIVE-17538.1.patch

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 
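
One way to picture the requested behaviour is a per-column fallback: keep the stored statistics for columns that have them and estimate only the ones that do not. The sketch below is hypothetical (function names, the dict layout, and the `estimate` callback are all assumptions) and does not reflect the attached patch.

```python
def columns_needing_estimates(needed_columns, stored_stats):
    """Columns the query needs that have no stored statistics."""
    return set(needed_columns) - set(stored_stats)


def resolve_column_stats(needed_columns, stored_stats, estimate):
    """Use stored stats where available and estimate the rest.

    `estimate` is a caller-supplied function mapping a column name
    to an estimated statistics record.
    """
    missing = columns_needing_estimates(needed_columns, stored_stats)
    return {
        col: estimate(col) if col in missing else stored_stats[col]
        for col in needed_columns
    }


stored = {"d_year": {"ndv": 201}, "d_moy": {"ndv": 12}}  # made-up values
resolved = resolve_column_stats(
    ["d_year", "d_moy", "d_dom"],
    stored,
    estimate=lambda col: {"ndv": None, "estimated": True},
)
```

Here only `d_dom` is estimated; the two columns with stored statistics are passed through unchanged.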





[jira] [Assigned] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-14 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17538:
--


> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Status: Patch Available  (was: Open)

Updated the logic in latest patch(2). Running it to get failures. [~ashutoshc] 
I'll update the code to use sets in next patch.

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Status: Open  (was: Patch Available)

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-21 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Attachment: HIVE-17538.2.patch

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch
>
>
> HIVE-16811 provided support for estimating statistics in absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-22 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176734#comment-16176734
 ] 

Vineet Garg commented on HIVE-17536:


Test failures are unrelated.

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.





[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17465:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master. Thanks for reviewing [~ashutoshc]

> Statistics: Drill-down filters don't reduce row-counts progressively
> 
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Gopal V
>Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, 
> HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, 
> HIVE-17465.6.patch, HIVE-17465.7.patch
>
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001  and d_moy = 
> 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 
> and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: (d_year = 2001) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: (d_year = 2001) (type: boolean)
> Statistics: Num rows: 363 Data size: 4356 Basic stats: 
> COMPLETE Column stats: COMPLETE
>  
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9)) (type: 
> boolean)
> Statistics: Num rows: 363 Data size: 5808 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: date_dim
>   filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
>   Statistics: Num rows: 73049 Data size: 82034027 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
> predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 
> 21)) (type: boolean)
> Statistics: Num rows: 363 Data size: 7260 Basic stats: 
> COMPLETE Column stats: COMPLETE
> {code}
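For comparison, a rough sketch of how the estimate would shrink with each added predicate under the usual assumption of independent, uniformly distributed columns. The NDV figures are assumptions: 201 is chosen so that the first filter lands near the 363 rows shown in the plans, and 12 and 31 are plausible values for month and day of month. This is not Hive's estimator.

```java
// Hypothetical illustration of conjunctive selectivity; not Hive's code.
final class ConjunctionSelectivitySketch {

    // Rows surviving a chain of equality predicates, assuming each predicate
    // keeps 1/ndv of its input and the columns are independent.
    // The result is clamped to at least one row.
    static long estimateRows(long inputRows, long... ndvs) {
        double rows = inputRows;
        for (long ndv : ndvs) {
            rows /= Math.max(1L, ndv);
        }
        return Math.max(1L, Math.round(rows));
    }

    public static void main(String[] args) {
        long tableRows = 73049L; // row count of date_dim from the plans above
        System.out.println(estimateRows(tableRows, 201));         // d_year = 2001
        System.out.println(estimateRows(tableRows, 201, 12));     // and d_moy = 9
        System.out.println(estimateRows(tableRows, 201, 12, 31)); // and d_dom = 21
    }
}
```

Under these assumptions the three estimates come out strictly decreasing, which is the progressive reduction the issue title refers to.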





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Attachment: HIVE-17535.4.patch

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch, HIVE-17535.4.patch
>
>
> Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Open  (was: Patch Available)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch, HIVE-17535.4.patch
>
>
> Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} 
> queries with SET operators fail (intersect requires CBO).





[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE

2017-09-18 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17535:
---
Status: Patch Available  (was: Open)

> Select 1 EXCEPT Select 1 fails with NPE
> ---
>
> Key: HIVE-17535
> URL: https://issues.apache.org/jira/browse/HIVE-17535
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, 
> HIVE-17535.3.patch, HIVE-17535.4.patch
>
>
> Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} 
> queries with SET operators fail (intersect requires CBO).





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-15 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168751#comment-16168751
 ] 

Vineet Garg commented on HIVE-17536:


This is for queries without a FROM clause, e.g. {{select 1}}. getBasicStats returns -1 and 
then {{estimateRowSizeFromSchema}} isn't able to estimate since there is no 
data, so we end up returning -1. Previously, on encountering 0 rows, we 
were instead returning 1.
One way of handling this is to change getNumRows to return 1 instead of -1.
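A small sketch of the distinction being discussed, with hypothetical names throughout: the class, the methods and the use of a plain parameter map are illustrative, and the "numRows" key is an assumption. A missing statistic yields a -1 sentinel, a recorded zero stays zero, and a query with no table at all is treated as producing a single row.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the StatsUtils implementation.
final class NumRowsSketch {
    static final long STATS_MISSING = -1L;

    // Returns the recorded value, or STATS_MISSING when the key is absent,
    // so "no statistics" and "zero rows" stay distinguishable.
    static long basicStat(Map<String, String> tableParams, String key) {
        String value = tableParams.get(key);
        return (value == null) ? STATS_MISSING : Long.parseLong(value);
    }

    // A query without a FROM clause has no table; assume it yields one row.
    static long numRows(Map<String, String> tableParams) {
        if (tableParams == null) {
            return 1L;
        }
        return basicStat(tableParams, "numRows"); // key name is an assumption
    }

    public static void main(String[] args) {
        Map<String, String> empty = new HashMap<>();
        Map<String, String> zero = new HashMap<>();
        zero.put("numRows", "0");
        System.out.println(numRows(null));  // no table
        System.out.println(numRows(empty)); // stats missing
        System.out.println(numRows(zero));  // stats present and zero
    }
}
```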

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in absence of statistics 
> instead of assuming it to be zero.





[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int

2017-10-06 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17553:
---
Attachment: HIVE-17553.2.patch

> CBO wrongly type cast decimal literal to int
> 
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17553.1.patch, HIVE-17553.2.patch
>
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> TableScan
>   alias: f
>   Select Operator
> expressions: 100 (type: int)
> outputColumnNames: _col0
> ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.





[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int

2017-10-06 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17553:
---
Status: Open  (was: Patch Available)

> CBO wrongly type cast decimal literal to int
> 
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17553.1.patch
>
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> TableScan
>   alias: f
>   Select Operator
> expressions: 100 (type: int)
> outputColumnNames: _col0
> ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.





[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int

2017-10-06 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17553:
---
Status: Patch Available  (was: Open)

> CBO wrongly type cast decimal literal to int
> 
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17553.1.patch, HIVE-17553.2.patch
>
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> TableScan
>   alias: f
>   Select Operator
> expressions: 100 (type: int)
> outputColumnNames: _col0
> ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.





[jira] [Updated] (HIVE-17602) Explain plan not working

2017-10-02 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17602:
---
Attachment: HIVE-17602.3.patch

> Explain plan not working
> 
>
> Key: HIVE-17602
> URL: https://issues.apache.org/jira/browse/HIVE-17602
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17602.1.patch, HIVE-17602.2.patch, 
> HIVE-17602.3.patch
>
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 
> 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}
> Error stack in hive.log
> {noformat}
> 2017-09-25T21:18:59,591 ERROR [726b5e51-f470-4a79-be8c-95b82a6aa85d main] 
> exec.Task: Failed with exception wrong number of arguments
> java.lang.IllegalArgumentException: wrong number of arguments
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:896)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:774)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:797)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:635)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:968)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:569)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:954)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1052)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1197)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:275)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.execute(ExplainTask.java:368)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:204)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2190)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1832)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1549)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1304)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1294)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> {noformat}
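The top frames of the trace show {{Method.invoke}} being called from {{ExplainTask.outputPlan}}. {{IllegalArgumentException}} is what Java reflection raises when the number of arguments passed to {{invoke}} does not match the target method's parameter list. A standalone reproduction of that mechanism follows; the class and method here are hypothetical and unrelated to the ExplainTask code.

```java
import java.lang.reflect.Method;

// Hypothetical stand-alone demo of a reflective arity mismatch.
final class ReflectionArityDemo {

    // A method that expects exactly one argument.
    public static String describe(boolean verbose) {
        return verbose ? "long" : "short";
    }

    // Invokes describe() reflectively with the given arguments and reports
    // "ok" on success or the simple name of the IllegalArgumentException.
    static String tryInvoke(Object... args) {
        try {
            Method m = ReflectionArityDemo.class.getMethod("describe", boolean.class);
            m.invoke(null, args);
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getClass().getSimpleName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInvoke(true)); // matching argument count
        System.out.println(tryInvoke());     // no arguments for a one-parameter method
    }
}
```

A method whose signature changed while its reflective caller still passes the old argument list would fail in this way; whether that is the cause here is not stated in the report.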





[jira] [Updated] (HIVE-17602) Explain plan not working

2017-10-02 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17602:
---
Status: Patch Available  (was: Open)

> Explain plan not working
> 
>
> Key: HIVE-17602
> URL: https://issues.apache.org/jira/browse/HIVE-17602
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17602.1.patch, HIVE-17602.2.patch, 
> HIVE-17602.3.patch
>
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 
> 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}
> Error stack in hive.log
> {noformat}
> 2017-09-25T21:18:59,591 ERROR [726b5e51-f470-4a79-be8c-95b82a6aa85d main] 
> exec.Task: Failed with exception wrong number of arguments
> java.lang.IllegalArgumentException: wrong number of arguments
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:896)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:774)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:797)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:635)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:968)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:569)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:954)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1052)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1197)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:275)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.execute(ExplainTask.java:368)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:204)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2190)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1832)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1549)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1304)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1294)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> {noformat}





[jira] [Updated] (HIVE-17602) Explain plan not working

2017-10-02 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17602:
---
Status: Open  (was: Patch Available)

> Explain plan not working
> 
>
> Key: HIVE-17602
> URL: https://issues.apache.org/jira/browse/HIVE-17602
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17602.1.patch, HIVE-17602.2.patch, 
> HIVE-17602.3.patch
>
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 
> 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}
> Error stack in hive.log
> {noformat}
> 2017-09-25T21:18:59,591 ERROR [726b5e51-f470-4a79-be8c-95b82a6aa85d main] 
> exec.Task: Failed with exception wrong number of arguments
> java.lang.IllegalArgumentException: wrong number of arguments
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:896)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:774)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:797)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:635)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:968)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:569)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:954)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1052)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1197)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:275)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.execute(ExplainTask.java:368)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:204)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2190)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1832)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1549)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1304)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1294)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> {noformat}





[jira] [Updated] (HIVE-16511) CBO looses inner casts on constants of complex type

2017-10-03 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16511:
---
Attachment: HIVE-16511.2.patch

> CBO looses inner casts on constants of complex type
> ---
>
> Key: HIVE-16511
> URL: https://issues.apache.org/jira/browse/HIVE-16511
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
> Attachments: HIVE-16511.1.patch, HIVE-16511.2.patch
>
>
> type for map <10, cast(null as int)> becomes map 





[jira] [Updated] (HIVE-16511) CBO looses inner casts on constants of complex type

2017-10-03 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16511:
---
Status: Open  (was: Patch Available)

> CBO looses inner casts on constants of complex type
> ---
>
> Key: HIVE-16511
> URL: https://issues.apache.org/jira/browse/HIVE-16511
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
> Attachments: HIVE-16511.1.patch
>
>
> type for map <10, cast(null as int)> becomes map 





[jira] [Updated] (HIVE-16511) CBO looses inner casts on constants of complex type

2017-10-03 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-16511:
---
Status: Patch Available  (was: Open)

> CBO looses inner casts on constants of complex type
> ---
>
> Key: HIVE-16511
> URL: https://issues.apache.org/jira/browse/HIVE-16511
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Reporter: Ashutosh Chauhan
>Assignee: Vineet Garg
> Attachments: HIVE-16511.1.patch, HIVE-16511.2.patch
>
>
> type for map <10, cast(null as int)> becomes map 





[jira] [Assigned] (HIVE-17726) Using exists may lead to incorrect results

2017-10-09 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17726:
--

Assignee: Vineet Garg

> Using exists may lead to incorrect results
> --
>
> Key: HIVE-17726
> URL: https://issues.apache.org/jira/browse/HIVE-17726
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Zoltan Haindrich
>Assignee: Vineet Garg
>
> {code}
> drop table if exists tx1;
> create table tx1 (a integer,b integer);
> insert into tx1   values  (1, 1),
> (1, 2),
> (1, 3);
> select count(*) as result,3 as expected from tx1 u
> where exists (select * from tx1 v where u.a=v.a and u.b <> v.b);
> select count(*) as result,3 as expected from tx1 u
> where exists (select * from tx1 v where u.a=v.a and u.b <> v.b limit 1);
> {code}
> current results are 6 and 2.
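For reference, the expected value of 3 follows directly from EXISTS semantics: each outer row counts at most once, however many inner rows match it. The sketch below evaluates the predicate by brute force over the three inserted rows. It is an illustration in plain Java with hypothetical names, not Hive code. It also counts matching (u, v) pairs, which gives 6 for this data, the same number as the first reported result.

```java
// Hypothetical brute-force evaluation of the queries in the report.
final class ExistsSemanticsSketch {

    // EXISTS: count outer rows u that have at least one row v
    // with u.a = v.a and u.b <> v.b.
    static int countWithExists(int[][] rows) {
        int count = 0;
        for (int[] u : rows) {
            boolean exists = false;
            for (int[] v : rows) {
                if (u[0] == v[0] && u[1] != v[1]) {
                    exists = true;
                    break; // one match is enough
                }
            }
            if (exists) {
                count++;
            }
        }
        return count;
    }

    // Plain inner join on the same condition: counts every matching pair.
    static int countJoinPairs(int[][] rows) {
        int count = 0;
        for (int[] u : rows) {
            for (int[] v : rows) {
                if (u[0] == v[0] && u[1] != v[1]) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[][] tx1 = { {1, 1}, {1, 2}, {1, 3} };
        System.out.println(countWithExists(tx1));
        System.out.println(countJoinPairs(tx1));
    }
}
```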





[jira] [Assigned] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN

2017-10-10 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17767:
--


> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> Currently such queries are rewritten into group by + inner join with a value 
> generator, which is inefficient. The value generator consists of a join with 
> the outer query to fetch all correlated values. This value generator could be 
> completely eliminated if such queries were instead rewritten into LEFT SEMI 
> JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with a non-equi 
> condition (HIVE-17766).





[jira] [Commented] (HIVE-17553) CBO wrongly type cast decimal literal to int

2017-10-05 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193898#comment-16193898
 ] 

Vineet Garg commented on HIVE-17553:


This happens due to RexBuilder::makeExactLiteral creating Integer/BigInteger if 
scale happens to be zero. Hive probably needs to create the type on its own and 
pass it to makeExactLiteral to preserve the type.
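A sketch of the difference between inferring a literal's type from its value and passing the type explicitly. It uses only java.math.BigDecimal; the class and both methods are hypothetical stand-ins and do not call Calcite. It assumes the literal has already been normalized to a scale of zero, which is done here with setScale(0).

```java
import java.math.BigDecimal;

// Hypothetical stand-in for literal type handling; not Calcite's RexBuilder.
final class DecimalLiteralTypeSketch {

    // Value-based inference: a scale of zero looks like an integer,
    // so the decimal nature of the original literal is lost.
    static String inferFromValue(BigDecimal value) {
        if (value.scale() == 0) {
            return "int";
        }
        return "decimal(" + value.precision() + "," + value.scale() + ")";
    }

    // Type-preserving variant: the caller supplies the declared type and
    // inference is used only when none is given.
    static String typeOf(BigDecimal value, String declaredType) {
        return (declaredType != null) ? declaredType : inferFromValue(value);
    }

    public static void main(String[] args) {
        BigDecimal normalized = new BigDecimal("100.000").setScale(0); // exact, no rounding needed
        System.out.println(typeOf(normalized, null));           // inferred from the value
        System.out.println(typeOf(normalized, "decimal(6,3)")); // declared type preserved
    }
}
```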

> CBO wrongly type cast decimal literal to int
> 
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> TableScan
>   alias: f
>   Select Operator
> expressions: 100 (type: int)
> outputColumnNames: _col0
> ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.





[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int

2017-10-05 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17553:
---
Status: Patch Available  (was: Open)

> CBO wrongly type cast decimal literal to int
> 
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17553.1.patch
>
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> TableScan
>   alias: f
>   Select Operator
> expressions: 100 (type: int)
> outputColumnNames: _col0
> ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.





[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int

2017-10-05 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17553:
---
Attachment: HIVE-17553.1.patch

> CBO wrongly type cast decimal literal to int
> 
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17553.1.patch
>
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: f
>           Select Operator
>             expressions: 100 (type: int)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.





[jira] [Assigned] (HIVE-17553) CBO wrongly type cast decimal literal to int

2017-10-05 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17553:
--

Assignee: Vineet Garg

> CBO wrongly type cast decimal literal to int
> 
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: f
>           Select Operator
>             expressions: 100 (type: int)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.





[jira] [Comment Edited] (HIVE-17553) CBO wrongly type cast decimal literal to int

2017-10-05 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193898#comment-16193898
 ] 

Vineet Garg edited comment on HIVE-17553 at 10/5/17 11:24 PM:
--

This happens because RexBuilder::makeExactLiteral (in Calcite) creates an
Integer/BigInteger if the scale happens to be zero. Hive probably needs to
create the type on its own and pass it to makeExactLiteral to preserve the type.


was (Author: vgarg):
This happens because RexBuilder::makeExactLiteral creates an Integer/BigInteger
if the scale happens to be zero. Hive probably needs to create the type on its
own and pass it to makeExactLiteral to preserve the type.

> CBO wrongly type cast decimal literal to int
> 
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: f
>           Select Operator
>             expressions: 100 (type: int)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.





[jira] [Commented] (HIVE-17726) Using exists may lead to incorrect results

2017-10-16 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206298#comment-16206298
 ] 

Vineet Garg commented on HIVE-17726:


Hi [~pvary]. This does look like it was caused by HIVE-17726. Is there a JIRA
associated with the failure? I'll update the test and commit it.

> Using exists may lead to incorrect results
> --
>
> Key: HIVE-17726
> URL: https://issues.apache.org/jira/browse/HIVE-17726
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Zoltan Haindrich
>Assignee: Vineet Garg
> Attachments: HIVE-17726.1.patch, HIVE-17726.2.patch, 
> HIVE-17726.3.patch
>
>
> {code}
> drop table if exists tx1;
> create table tx1 (a integer,b integer);
> insert into tx1   values  (1, 1),
> (1, 2),
> (1, 3);
> select count(*) as result,3 as expected from tx1 u
> where exists (select * from tx1 v where u.a=v.a and u.b <> v.b);
> select count(*) as result,3 as expected from tx1 u
> where exists (select * from tx1 v where u.a=v.a and u.b <> v.b limit 1);
> {code}
> Current results are 6 and 2; both queries are expected to return 3.
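
For reference, the expected value of 3 follows from EXISTS acting as a
semi-join: each outer row is counted at most once, however many inner rows
match. The sketch below reproduces the three rows of tx1 in plain Java and
compares that count with a plain inner-join count. The class and method names
are hypothetical, and the observation that the inner-join count equals the
reported 6 is only a coincidence worth noting, not a diagnosis of the bug.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative model of the tx1 queries; names here are hypothetical. */
final class ExistsSemanticsSketch {
    // Rows of tx1 as {a, b}.
    static final List<int[]> TX1 = Arrays.asList(
            new int[] {1, 1},
            new int[] {1, 2},
            new int[] {1, 3});

    /** The correlated predicate: u.a = v.a and u.b <> v.b. */
    static boolean matches(int[] u, int[] v) {
        return u[0] == v[0] && u[1] != v[1];
    }

    /** Semi-join semantics: an outer row counts once if any inner row matches. */
    static long countExists(List<int[]> rows) {
        return rows.stream()
                .filter(u -> rows.stream().anyMatch(v -> matches(u, v)))
                .count();
    }

    /** Inner-join semantics: every matching (u, v) pair counts. */
    static long countInnerJoin(List<int[]> rows) {
        long pairs = 0;
        for (int[] u : rows) {
            for (int[] v : rows) {
                if (matches(u, v)) {
                    pairs++;
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(countExists(TX1));    // 3
        System.out.println(countInnerJoin(TX1)); // 6
    }
}
```

Adding "limit 1" inside the subquery should not change the semi-join count,
since EXISTS only asks whether at least one matching row exists.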





[jira] [Commented] (HIVE-17756) Enable subquery related Qtests for Hive on Spark

2017-10-16 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206531#comment-16206531
 ] 

Vineet Garg commented on HIVE-17756:


[~dapengsun] Can you open a JIRA and regenerate the failing tests? I can take a
look then.

> Enable subquery related Qtests for Hive on Spark
> 
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
>  Issue Type: Sub-task
>  Components: Logical Optimizer
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Fix For: 3.0.0
>
> Attachments: HIVE-17756.001.patch
>
>
> HIVE-15456 and HIVE-15192 use Calcite to decorrelate and plan subqueries.
> This JIRA is to introduce subquery tests and verify the subquery plans for
> Hive on Spark.




