[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Status: Patch Available (was: Open) > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
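A minimal Java sketch of the gap HIVE-17538 describes: estimation that runs only when every column lacks stats, versus estimation applied to each column that lacks them. This is not the attached patch. The class name, method names, and the placeholder value `ASSUMED_NDV` are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names come from Hive. */
final class ColumnStatsFallbackSketch {
    /** Placeholder estimate for a column without stats (an assumption, not a Hive default). */
    static final long ASSUMED_NDV = 100L;

    /** All-or-nothing: estimate only if every column is missing stats (null value). */
    static Map<String, Long> estimateOnlyIfAllMissing(Map<String, Long> ndvByColumn) {
        boolean allMissing = true;
        for (Long ndv : ndvByColumn.values()) {
            if (ndv != null) {
                allMissing = false;
                break;
            }
        }
        Map<String, Long> out = new LinkedHashMap<>(ndvByColumn);
        if (allMissing) {
            out.replaceAll((col, ndv) -> ASSUMED_NDV);
        }
        return out;
    }

    /** Per column: fill in an estimate for each column whose stats are missing. */
    static Map<String, Long> estimatePerColumn(Map<String, Long> ndvByColumn) {
        Map<String, Long> out = new LinkedHashMap<>(ndvByColumn);
        out.replaceAll((col, ndv) -> ndv == null ? ASSUMED_NDV : ndv);
        return out;
    }
}
```

With one column present and one missing, the first method leaves the missing column without any estimate, while the second fills in only that column and keeps the known value untouched.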
[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning
[ https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17493: --- Attachment: HIVE-17493.4.patch Uploading rebased patch to trigger qtests. > Improve PKFK cardinality estimation in Physical planning > > > Key: HIVE-17493 > URL: https://issues.apache.org/jira/browse/HIVE-17493 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, > HIVE-17493.3.patch, HIVE-17493.4.patch > > > Cardinality estimation of a join, after PK-FK relation has been ascertained, > could be improved if parent of the join operator is LEFT outer or RIGHT outer > join. > Currently estimation is done by estimating reduction of rows occurred on PK > side, then multiplying the reduction to FK side row count. This estimation of > reduction currently doesn't distinguish b/w INNER or OUTER joins. This could > be improved to handle outer joins better. > TPC-DS query45 is impacted by this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
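The estimation described in HIVE-17493 (take the row reduction seen on the PK side and apply it to the FK-side row count) can be sketched in Java as below. The outer-join branch is only one possible refinement, based on the fact that an outer join keeps every row of its preserved side; it is not necessarily what the patch does. All names here are hypothetical.

```java
/** Hypothetical illustration only; names and the outer-join rule are assumptions. */
final class PkFkEstimateSketch {
    enum JoinType { INNER, FK_SIDE_PRESERVED_OUTER }

    /** Fraction of PK-side rows that survive the PK-side filters, clamped to [0, 1]. */
    static double pkReduction(long pkRowsAfterFilter, long pkRowsTotal) {
        if (pkRowsTotal <= 0) {
            return 1.0; // no usable information: assume no reduction
        }
        double ratio = (double) pkRowsAfterFilter / pkRowsTotal;
        return Math.max(0.0, Math.min(1.0, ratio));
    }

    /** Estimated join output rows for a PK-FK join. */
    static long estimateJoinRows(long fkRows, long pkRowsAfterFilter,
                                 long pkRowsTotal, JoinType type) {
        if (type == JoinType.FK_SIDE_PRESERVED_OUTER) {
            // An outer join keeps unmatched rows of the preserved side,
            // so the PK-side reduction does not shrink the FK row count.
            return fkRows;
        }
        return Math.round(fkRows * pkReduction(pkRowsAfterFilter, pkRowsTotal));
    }
}
```

For example, with 1000 FK rows and a PK side filtered from 100 rows to 25, the inner-join estimate is 250 rows, while the outer-join branch keeps all 1000.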
[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning
[ https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17493: --- Status: Patch Available (was: Open) > Improve PKFK cardinality estimation in Physical planning > > > Key: HIVE-17493 > URL: https://issues.apache.org/jira/browse/HIVE-17493 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, > HIVE-17493.3.patch, HIVE-17493.4.patch > > > Cardinality estimation of a join, after PK-FK relation has been ascertained, > could be improved if parent of the join operator is LEFT outer or RIGHT outer > join. > Currently estimation is done by estimating reduction of rows occurred on PK > side, then multiplying the reduction to FK side row count. This estimation of > reduction currently doesn't distinguish b/w INNER or OUTER joins. This could > be improved to handle outer joins better. > TPC-DS query45 is impacted by this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning
[ https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17493: --- Status: Open (was: Patch Available) > Improve PKFK cardinality estimation in Physical planning > > > Key: HIVE-17493 > URL: https://issues.apache.org/jira/browse/HIVE-17493 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, > HIVE-17493.3.patch > > > Cardinality estimation of a join, after PK-FK relation has been ascertained, > could be improved if parent of the join operator is LEFT outer or RIGHT outer > join. > Currently estimation is done by estimating reduction of rows occurred on PK > side, then multiplying the reduction to FK side row count. This estimation of > reduction currently doesn't distinguish b/w INNER or OUTER joins. This could > be improved to handle outer joins better. > TPC-DS query45 is impacted by this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Open (was: Patch Available) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
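The distinction requested in HIVE-17536 can be sketched in Java as below: a lookup that returns -1 when a statistic is absent and 0 only when 0 was actually recorded. This is not Hive's implementation. The class, the method, and the use of a plain string map keyed by `numRows` are assumptions made for illustration.

```java
import java.util.Map;

/** Hypothetical illustration only; this is not Hive's stats code. */
final class BasicStatSketch {
    /** Sentinel meaning "no statistic recorded", as opposed to a recorded value of 0. */
    static final long STAT_MISSING = -1L;

    /** Returns the recorded statistic, or STAT_MISSING if it is absent or unusable. */
    static long getBasicStat(Map<String, String> tableParams, String statKey) {
        String raw = tableParams.get(statKey);
        if (raw == null) {
            return STAT_MISSING;
        }
        try {
            long value = Long.parseLong(raw.trim());
            return value < 0 ? STAT_MISSING : value;
        } catch (NumberFormatException e) {
            return STAT_MISSING; // an unparseable value is treated as missing
        }
    }
}
```

A caller can then branch on `STAT_MISSING` to decide whether to fall back to an estimate, instead of treating an unanalyzed table as empty.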
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Attachment: HIVE-17465.6.patch For some reason ptests are not being scheduled for this issue, so rebasing and re-uploading new patch. Hopefully ptests will be triggered now.
> Statistics: Drill-down filters don't reduce row-counts progressively
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer, Statistics
> Reporter: Gopal V
> Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, HIVE-17465.6.patch
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: (d_year = 2001) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: (d_year = 2001) (type: boolean)
>         Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
>   Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
> Map 1
>   Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
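The behavior the HIVE-17465 report expects can be sketched in Java as below, assuming independent, uniformly distributed columns, so that each extra equality predicate keeps roughly 1/NDV of the rows. This is not Hive's estimator. The class name and the NDV values used in the usage note (201, 12, 31) are illustrative assumptions and are not taken from the issue.

```java
/** Hypothetical illustration only; not Hive's filter-statistics code. */
final class DrillDownEstimateSketch {
    /**
     * Rows expected after AND-ing equality predicates on independent columns.
     * ndvPerPredicate holds the assumed number of distinct values of each filtered column.
     */
    static long estimateRows(long tableRows, long... ndvPerPredicate) {
        double rows = tableRows;
        for (long ndv : ndvPerPredicate) {
            rows /= Math.max(1L, ndv); // each equality keeps about 1/NDV of the rows
        }
        return Math.round(rows);
    }
}
```

Calling `estimateRows(73049, 201)`, then adding 12, then 31 as further NDV arguments, yields a strictly smaller estimate at each step, in contrast to the three plans in the report, which all show 363 rows.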
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Status: Patch Available (was: Open)
> Statistics: Drill-down filters don't reduce row-counts progressively
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer, Statistics
> Reporter: Gopal V
> Assignee: Vineet Garg
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, HIVE-17465.6.patch
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: (d_year = 2001) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: (d_year = 2001) (type: boolean)
>         Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
>   Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
> Map 1
>   Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Patch Available (was: Open) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Attachment: HIVE-17535.3.patch > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Open (was: Patch Available) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Fix Version/s: 3.0.0
> Statistics: Drill-down filters don't reduce row-counts progressively
>
> Key: HIVE-17465
> URL: https://issues.apache.org/jira/browse/HIVE-17465
> Project: Hive
> Issue Type: Bug
> Components: Physical Optimizer, Statistics
> Reporter: Gopal V
> Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, HIVE-17465.6.patch, HIVE-17465.7.patch
>
> {code}
> explain select count(d_date_sk) from date_dim where d_year=2001 ;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9;
> explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 and d_dom = 21;
> {code}
> All 3 queries end up with the same row-count estimates after the filter.
> {code}
> Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: (d_year = 2001) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: (d_year = 2001) (type: boolean)
>         Statistics: Num rows: 363 Data size: 4356 Basic stats: COMPLETE Column stats: COMPLETE
>
> Map 1
>   Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 5808 Basic stats: COMPLETE Column stats: COMPLETE
> Map 1
>   Map Operator Tree:
>     TableScan
>       alias: date_dim
>       filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>       Statistics: Num rows: 73049 Data size: 82034027 Basic stats: COMPLETE Column stats: COMPLETE
>       Filter Operator
>         predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = 21)) (type: boolean)
>         Statistics: Num rows: 363 Data size: 7260 Basic stats: COMPLETE Column stats: COMPLETE
> {code}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170749#comment-16170749 ] Vineet Garg commented on HIVE-17535: The latest patch (3) has a known failure, {{min_structvalue}}, which is a bug exposed by the patch. Queries such as {code:sql} select max(a), min(a) FROM (select named_struct("field",1) as a union all select named_struct("field",2) as a union all select named_struct("field",cast(null as int)) as a) tmp{code} fail with CBO because CBO ends up losing the {{CAST}} operation, reducing {{named_struct("field",cast(null as int))}} to just {{named_struct("field",null)}}. This results in a different schema structure b/w the union branches, which is semantically incorrect. This can be reproduced with a simple {code:sql}select named_struct("field",cast(null as int)) as a{code}. If we dump the new AST after CBO, we will notice the missing CAST operation. > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170764#comment-16170764 ] Vineet Garg commented on HIVE-17535: Good to know. I'll disable the test for now and will update HIVE-16511. > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Attachment: HIVE-17535.3.patch > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Attachment: (was: HIVE-17535.3.patch) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16511) CBO looses inner casts on constants of complex type
[ https://issues.apache.org/jira/browse/HIVE-16511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170770#comment-16170770 ] Vineet Garg commented on HIVE-16511: Test {{min_structvalue.q}} needs to be enabled once this issue is fixed. > CBO looses inner casts on constants of complex type > --- > > Key: HIVE-16511 > URL: https://issues.apache.org/jira/browse/HIVE-16511 > Project: Hive > Issue Type: Bug > Components: CBO, Query Planning >Reporter: Ashutosh Chauhan > > type for map <10, cast(null as int)> becomes map-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Open (was: Patch Available) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Patch Available (was: Open) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Open (was: Patch Available) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Attachment: HIVE-17535.5.patch > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16172535#comment-16172535 ] Vineet Garg commented on HIVE-17538: Unfortunately, the test report is no longer available, so I will have to re-run the tests to see whether the failures are related. > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Patch Available (was: Open) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Attachment: HIVE-17536.4.patch > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Open (was: Patch Available) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Patch Available (was: Open) Latest patch(5) should fix test failures > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Attachment: HIVE-17536.6.patch > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Open (was: Patch Available) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Patch Available (was: Open) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Status: Open (was: Patch Available) > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, > HIVE-17538.3.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Attachment: HIVE-17538.3.patch > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, > HIVE-17538.3.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177976#comment-16177976 ] Vineet Garg commented on HIVE-17538: Review board link attached > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, > HIVE-17538.3.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Status: Patch Available (was: Open) > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, > HIVE-17538.3.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Attachment: HIVE-17536.5.patch > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Open (was: Patch Available) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Patch Available (was: Open) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master. Thanks for reviewing [~ashutoshc] > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17308) Improvement in join cardinality estimation
[ https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17308: --- Fix Version/s: 3.0.0 > Improvement in join cardinality estimation > -- > > Key: HIVE-17308 > URL: https://issues.apache.org/jira/browse/HIVE-17308 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Fix For: 3.0.0 > > Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, > HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch, > HIVE-17308.6.patch, HIVE-17308.7.patch, HIVE-17308.8.patch > > > Currently, during logical planning, join cardinality is estimated assuming no > correlation among join keys (this estimation is done using exponential > backoff). Physical planning, on the other hand, considers correlation for multiple > keys and uses a different estimation. We should consider correlation during > logical planning as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17308) Improvement in join cardinality estimation
[ https://issues.apache.org/jira/browse/HIVE-17308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16174206#comment-16174206 ] Vineet Garg commented on HIVE-17308: [~leftylev] Done. > Improvement in join cardinality estimation > -- > > Key: HIVE-17308 > URL: https://issues.apache.org/jira/browse/HIVE-17308 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Fix For: 3.0.0 > > Attachments: HIVE-17308.1.patch, HIVE-17308.2.patch, > HIVE-17308.3.patch, HIVE-17308.4.patch, HIVE-17308.5.patch, > HIVE-17308.6.patch, HIVE-17308.7.patch, HIVE-17308.8.patch > > > Currently, during logical planning, join cardinality is estimated assuming no > correlation among join keys (this estimation is done using exponential > backoff). Physical planning, on the other hand, considers correlation for multiple > keys and uses a different estimation. We should consider correlation during > logical planning as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
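Editorial note on HIVE-17308: the description mentions exponential backoff for combining multiple join keys. The sketch below shows exponential backoff in its commonly described form (largest NDV at full weight, then exponents 1/2, 1/4, ...) next to a fully correlated estimate that uses only the largest NDV. The class and method names are hypothetical, and this is not taken from the attached patches or from Hive's source; it only illustrates how the two choices change the denominator of a join-cardinality estimate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not Hive code: two ways to combine per-key NDVs
// (number of distinct values) into one denominator for a multi-key join.
public class JoinNdvSketch {

    // Exponential backoff: sort NDVs descending, weight them with
    // exponents 1, 1/2, 1/4, ... and multiply.
    static double exponentialBackoff(List<Double> ndvs) {
        List<Double> sorted = new ArrayList<>(ndvs);
        sorted.sort(Collections.reverseOrder());
        double combined = 1.0;
        for (int i = 0; i < sorted.size(); i++) {
            combined *= Math.pow(sorted.get(i), Math.pow(0.5, i));
        }
        return combined;
    }

    // Full correlation: the other keys add no new distinct combinations,
    // so only the largest NDV counts.
    static double fullyCorrelated(List<Double> ndvs) {
        return Collections.max(ndvs);
    }

    // Textbook join estimate: |L| * |R| / (combined NDV of the join keys).
    static double joinRows(double leftRows, double rightRows, double combinedNdv) {
        return leftRows * rightRows / combinedNdv;
    }

    public static void main(String[] args) {
        List<Double> ndvs = Arrays.asList(16.0, 100.0, 16.0); // made-up NDVs
        System.out.println(joinRows(10_000, 5_000, exponentialBackoff(ndvs)));
        System.out.println(joinRows(10_000, 5_000, fullyCorrelated(ndvs)));
    }
}
```

With these made-up NDVs the backoff denominator is larger than the fully correlated one, so the backoff estimate is the smaller of the two; which is closer to reality depends on how correlated the keys actually are.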
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Fix Version/s: 3.0.0 > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Fix For: 3.0.0 > > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch, HIVE-17535.4.patch, HIVE-17535.5.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165300#comment-16165300 ] Vineet Garg commented on HIVE-17465: [~ashutoshc] I am investigating a few suspicious test failures. I'll create an RB as soon as I am done with the investigation. > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE >
Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
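Editorial note on HIVE-17465: the report shows 363 rows after every filter, however many equality predicates are added. As a rough illustration of what "progressive" reduction could look like, the sketch below multiplies per-column equality selectivities of 1/NDV under a uniformity and independence assumption. The NDV values are assumptions, not figures from the issue: 201 for d_year is chosen only because 73049 / 201 rounds to the reported 363, while 12 and 31 are the calendar counts of months and days. The class and method names are hypothetical and this is not the logic of the attached patches.

```java
// Hypothetical sketch, not Hive code: conjunctive equality filters under
// the independence assumption, each with selectivity 1 / NDV.
public class DrillDownSketch {

    // Selectivity of "col = constant" when values are uniformly distributed.
    static double eqSelectivity(long ndv) {
        return 1.0 / ndv;
    }

    // Applies one equality filter per NDV supplied; never estimates below 1 row.
    static long estimate(long rows, long... ndvs) {
        double est = rows;
        for (long ndv : ndvs) {
            est *= eqSelectivity(ndv);
        }
        return Math.max(1L, Math.round(est));
    }

    public static void main(String[] args) {
        long rows = 73049; // table row count from the explain output
        System.out.println(estimate(rows, 201));         // d_year = 2001
        System.out.println(estimate(rows, 201, 12));     // ... and d_moy = 9
        System.out.println(estimate(rows, 201, 12, 31)); // ... and d_dom = 21
    }
}
```

Under these assumptions the three estimates come out as 363, 30 and 1, so each additional predicate narrows the estimate instead of leaving it at 363.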
[jira] [Commented] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning
[ https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165298#comment-16165298 ] Vineet Garg commented on HIVE-17493: Yeah, looks like it. I'll rebase and re-upload the patch. > Improve PKFK cardinality estimation in Physical planning > > > Key: HIVE-17493 > URL: https://issues.apache.org/jira/browse/HIVE-17493 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch > > > Cardinality estimation of a join, after a PK-FK relation has been ascertained, > could be improved if the parent of the join operator is a LEFT outer or RIGHT outer > join. > Currently the estimate is computed by estimating the reduction in rows on the PK > side, then multiplying that reduction by the FK-side row count. This estimate of the > reduction currently doesn't distinguish between INNER and OUTER joins. This could > be improved to handle outer joins better. > TPC-DS query45 is impacted by this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
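Editorial note on HIVE-17493: the description says the join estimate is the FK-side row count multiplied by the row reduction observed on the PK side. The sketch below writes that formula out and adds one plausible outer-join refinement: when the outer join preserves the FK side, every FK row appears in the output, and since each FK row matches at most one PK row, the estimate stays at the FK row count. This refinement is an assumption for illustration and may differ from what the attached patches do; all names are hypothetical.

```java
// Hypothetical sketch, not Hive code: PK-FK join cardinality estimation.
public class PkFkJoinSketch {

    // Inner join: scale the FK-side row count by the fraction of PK rows
    // that survived the filters on the PK side.
    static long innerJoinRows(long fkRows, long pkRowsBefore, long pkRowsAfter) {
        if (pkRowsBefore == 0) {
            return 0L;
        }
        double reduction = (double) pkRowsAfter / pkRowsBefore;
        return Math.round(fkRows * reduction);
    }

    // Outer join that preserves the FK side: unmatched FK rows are kept
    // (padded with NULLs), so the PK-side reduction does not shrink the output.
    static long fkPreservingOuterJoinRows(long fkRows, long pkRowsBefore, long pkRowsAfter) {
        return Math.max(fkRows, innerJoinRows(fkRows, pkRowsBefore, pkRowsAfter));
    }

    public static void main(String[] args) {
        // Made-up numbers: 1000 FK rows, PK side filtered from 100 rows to 25.
        System.out.println(innerJoinRows(1000, 100, 25));
        System.out.println(fkPreservingOuterJoinRows(1000, 100, 25));
    }
}
```

With the made-up numbers in `main`, the inner-join estimate is 250 rows and the FK-preserving outer-join estimate is 1000 rows, which is the distinction the issue asks the planner to make.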
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Attachment: HIVE-17536.2.patch > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Open (was: Patch Available) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Status: Patch Available (was: Open) > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 
21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168219#comment-16168219 ] Vineet Garg commented on HIVE-17536: [~ashutoshc] If the stat key is missing, the following code will throw a NumberFormatException: {code}Long.parseLong(params.get(statType)){code} because {code}params.get(statType){code} returns null for a missing stat key. So this case is already accounted for. > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
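Editorial note on HIVE-17536: the comment above relies on the fact that Long.parseLong(null) throws NumberFormatException. A minimal sketch of how that can be used to separate "statistic missing" from "statistic is zero" follows. The class name, method name, and the STAT_MISSING constant are hypothetical (the value -1 is the one suggested in the issue description), the key names "numRows" and "totalSize" are used only as example keys, and this is not the code from the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Hive code: return a sentinel for a missing
// statistic instead of silently treating it as zero.
public class BasicStatSketch {

    // Sentinel meaning "no statistic recorded".
    static final long STAT_MISSING = -1L;

    static long getBasicStat(Map<String, String> params, String statType) {
        try {
            // params.get(statType) is null when the key is absent, and
            // Long.parseLong(null) throws NumberFormatException, so a missing
            // key and an unparsable value both end up in the catch block.
            return Long.parseLong(params.get(statType));
        } catch (NumberFormatException e) {
            return STAT_MISSING;
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("numRows", "0");
        System.out.println(getBasicStat(params, "numRows"));   // prints 0: a real zero
        System.out.println(getBasicStat(params, "totalSize")); // prints -1: key is missing
    }
}
```

Callers can then compare against the sentinel to decide whether to fall back to an estimate, rather than planning as if the table were empty.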
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Patch Available (was: Open) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Open (was: Patch Available) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Patch Available (was: Open) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Attachment: HIVE-17535.2.patch > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169469#comment-16169469 ] Vineet Garg commented on HIVE-17465: [~ashutoshc] can you take a look at updated review? > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, > HIVE-17465.6.patch, HIVE-17465.7.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. 
> {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Status: Patch Available (was: Open) > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, > HIVE-17465.6.patch, HIVE-17465.7.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year 
= 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Status: Open (was: Patch Available) > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, > HIVE-17465.6.patch, HIVE-17465.7.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year 
= 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Attachment: HIVE-17465.7.patch > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, > HIVE-17465.6.patch, HIVE-17465.7.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 
2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning
[ https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17493: --- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master. > Improve PKFK cardinality estimation in Physical planning > > > Key: HIVE-17493 > URL: https://issues.apache.org/jira/browse/HIVE-17493 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, > HIVE-17493.3.patch, HIVE-17493.4.patch > > > Cardinality estimation of a join, after a PK-FK relation has been ascertained, > could be improved if the parent of the join operator is a LEFT outer or RIGHT outer > join. > Currently the estimate is computed by estimating the reduction in rows on the PK > side, then multiplying that reduction by the FK-side row count. This estimate of the > reduction currently doesn't distinguish between INNER and OUTER joins. This could > be improved to handle outer joins better. > TPC-DS query45 is impacted by this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Status: Patch Available (was: Open) > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > 
Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
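A minimal sketch of the behaviour the report asks for, in which each additional equality predicate in a conjunction lowers the row-count estimate. It assumes uniformly distributed values and independent columns, so each predicate keeps 1/NDV of the rows and the selectivities multiply. The function name and the NDV figures are hypothetical and are not taken from the issue or from Hive's code.

```python
import math


def estimate_conjunction_rows(num_rows, ndvs):
    """Estimate rows surviving col1 = c1 AND col2 = c2 AND ...

    Assumes uniform values and independent columns, so every equality
    predicate keeps 1/NDV of the rows and the selectivities multiply.
    """
    if num_rows <= 0:
        return 0
    selectivity = 1.0
    for ndv in ndvs:
        # Guard against a zero or negative NDV coming from bad stats.
        selectivity *= 1.0 / max(ndv, 1)
    # Never estimate fewer than one row for a non-empty input.
    return max(1, math.ceil(num_rows * selectivity))


# Hypothetical NDVs for d_year, d_moy and d_dom (illustrative only).
NUM_ROWS = 73049
year_only = estimate_conjunction_rows(NUM_ROWS, [201])
year_month = estimate_conjunction_rows(NUM_ROWS, [201, 12])
year_month_day = estimate_conjunction_rows(NUM_ROWS, [201, 12, 31])
```

Under these assumptions the three estimates shrink with every added predicate, rather than staying at the same value as in the plans quoted above.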
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Status: Open (was: Patch Available) > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > 
Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Attachment: HIVE-17465.4.patch > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > 
Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16165381#comment-16165381 ] Vineet Garg commented on HIVE-17465: [~ashutoshc] New patch is uploaded and review board link is linked to the JIRA. > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > 
predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning
[ https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17493: --- Status: Patch Available (was: Open) > Improve PKFK cardinality estimation in Physical planning > > > Key: HIVE-17493 > URL: https://issues.apache.org/jira/browse/HIVE-17493 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, > HIVE-17493.3.patch > > > Cardinality estimation of a join, after PK-FK relation has been ascertained, > could be improved if parent of the join operator is LEFT outer or RIGHT outer > join. > Currently estimation is done by estimating reduction of rows occurred on PK > side, then multiplying the reduction to FK side row count. This estimation of > reduction currently doesn't distinguish b/w INNER or OUTER joins. This could > be improved to handle outer joins better. > TPC-DS query45 is impacted by this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
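The estimation described above can be sketched as follows: compute the fraction of rows that survived on the PK side and scale the FK-side row count by it. The `fk_side_preserved` flag is an assumption added for illustration; it shows one possible way an outer join could be treated differently, and it is not a description of what the attached patches do. All names here are hypothetical.

```python
def estimate_pkfk_join_rows(fk_rows, pk_rows_before, pk_rows_after,
                            fk_side_preserved=False):
    """Estimate the output rows of a PK-FK join.

    fk_rows           -- row count on the foreign-key side
    pk_rows_before    -- PK-side row count before its filters were applied
    pk_rows_after     -- PK-side row count after its filters were applied
    fk_side_preserved -- assumed flag: True when an outer join keeps every
                         FK-side row regardless of a match
    """
    if pk_rows_before <= 0:
        # No usable PK-side information; leave the FK count untouched.
        return fk_rows
    reduction = min(1.0, pk_rows_after / pk_rows_before)
    if fk_side_preserved:
        # Unmatched FK rows are null-padded rather than dropped, so the
        # PK-side reduction does not shrink the output.
        return fk_rows
    return int(round(fk_rows * reduction))
```

For example, with 1,000,000 FK rows and a PK side filtered from 1,000 to 100 rows, the inner-join estimate is 100,000 rows, while the preserved-side variant stays at 1,000,000.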
[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning
[ https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17493: --- Status: Open (was: Patch Available) > Improve PKFK cardinality estimation in Physical planning > > > Key: HIVE-17493 > URL: https://issues.apache.org/jira/browse/HIVE-17493 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, > HIVE-17493.3.patch > > > Cardinality estimation of a join, after PK-FK relation has been ascertained, > could be improved if parent of the join operator is LEFT outer or RIGHT outer > join. > Currently estimation is done by estimating reduction of rows occurred on PK > side, then multiplying the reduction to FK side row count. This estimation of > reduction currently doesn't distinguish b/w INNER or OUTER joins. This could > be improved to handle outer joins better. > TPC-DS query45 is impacted by this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17493) Improve PKFK cardinality estimation in Physical planning
[ https://issues.apache.org/jira/browse/HIVE-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17493: --- Attachment: HIVE-17493.3.patch > Improve PKFK cardinality estimation in Physical planning > > > Key: HIVE-17493 > URL: https://issues.apache.org/jira/browse/HIVE-17493 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17493.1.patch, HIVE-17493.2.patch, > HIVE-17493.3.patch > > > Cardinality estimation of a join, after PK-FK relation has been ascertained, > could be improved if parent of the join operator is LEFT outer or RIGHT outer > join. > Currently estimation is done by estimating reduction of rows occurred on PK > side, then multiplying the reduction to FK side row count. This estimation of > reduction currently doesn't distinguish b/w INNER or OUTER joins. This could > be improved to handle outer joins better. > TPC-DS query45 is impacted by this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Status: Open (was: Patch Available) > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > 
Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Status: Patch Available (was: Open) > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 
21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Attachment: HIVE-17465.5.patch Latest patch(5) addresses review comments. > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. > {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 
2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Attachment: HIVE-17535.1.patch > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Patch Available (was: Open) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-17535: -- > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-17536: -- > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
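A small sketch of the proposed distinction, returning -1 when a statistic is absent and the stored value (possibly 0) otherwise. The function, the sentinel name and the dictionary standing in for table parameters are all hypothetical; this is not the signature of the Hive method named in the issue.

```python
STATS_MISSING = -1  # hypothetical sentinel meaning "no statistics recorded"


def get_basic_stat(table_params, stat_name):
    """Return the stored statistic, or STATS_MISSING if it is absent.

    table_params is assumed to be a plain dict of string parameters,
    standing in for whatever the metastore returns for a table.
    """
    raw = table_params.get(stat_name)
    if raw is None:
        return STATS_MISSING
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # An unparsable value is treated the same as a missing one.
        return STATS_MISSING
    # A genuine zero is returned as zero, not as "missing".
    return value if value >= 0 else STATS_MISSING
```

Callers can then tell an empty table (0) apart from a table with no statistics (-1) and fall back to estimation only in the second case.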
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Status: Patch Available (was: Open) > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Component/s: Statistics > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Attachment: HIVE-17536.1.patch > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Attachment: HIVE-17538.1.patch > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
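The enhancement described above can be sketched as a per-column decision: keep real statistics where they exist and estimate only for the columns that lack them, instead of estimating only when every column is missing stats. The helper name, the dictionary layout and the fallback guess (NDV equal to the row count, no nulls) are assumptions made for illustration and are not taken from the patch.

```python
def fill_missing_column_stats(columns, known_stats, num_rows):
    """Return stats for every column, estimating only the missing ones.

    columns     -- list of column names referenced by the query
    known_stats -- dict mapping column name to its recorded stats
    num_rows    -- table row count used for the fallback guess
    """
    stats = {}
    for col in columns:
        if col in known_stats:
            # Real statistics are kept untouched.
            stats[col] = known_stats[col]
        else:
            # Assumed fallback: treat every value as distinct, no nulls.
            stats[col] = {
                "ndv": max(1, num_rows),
                "num_nulls": 0,
                "estimated": True,
            }
    return stats
```

Marking guessed entries with an `estimated` flag keeps them distinguishable from recorded statistics later in planning.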
[jira] [Assigned] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-17538: -- > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Status: Patch Available (was: Open) Updated the logic in latest patch(2). Running it to get failures. [~ashutoshc] I'll update the code to use sets in next patch. > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Status: Open (was: Patch Available) > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Attachment: HIVE-17538.2.patch > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176734#comment-16176734 ] Vineet Garg commented on HIVE-17536: Test failures are unrelated. > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17465) Statistics: Drill-down filters don't reduce row-counts progressively
[ https://issues.apache.org/jira/browse/HIVE-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17465: --- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master. Thanks for reviewing [~ashutoshc] > Statistics: Drill-down filters don't reduce row-counts progressively > > > Key: HIVE-17465 > URL: https://issues.apache.org/jira/browse/HIVE-17465 > Project: Hive > Issue Type: Bug > Components: Physical Optimizer, Statistics >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-17465.1.patch, HIVE-17465.2.patch, > HIVE-17465.3.patch, HIVE-17465.4.patch, HIVE-17465.5.patch, > HIVE-17465.6.patch, HIVE-17465.7.patch > > > {code} > explain select count(d_date_sk) from date_dim where d_year=2001 ; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = > 9; > explain select count(d_date_sk) from date_dim where d_year=2001 and d_moy = 9 > and d_dom = 21; > {code} > All 3 queries end up with the same row-count estimates after the filter. 
> {code} > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: (d_year = 2001) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (d_year = 2001) (type: boolean) > Statistics: Num rows: 363 Data size: 4356 Basic stats: > COMPLETE Column stats: COMPLETE > > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9)) (type: > boolean) > Statistics: Num rows: 363 Data size: 5808 Basic stats: > COMPLETE Column stats: COMPLETE > Map 1 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 73049 Data size: 82034027 Basic > stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 2001) and (d_moy = 9) and (d_dom = > 21)) (type: boolean) > Statistics: Num rows: 363 Data size: 7260 Basic stats: > COMPLETE Column stats: COMPLETE > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Attachment: HIVE-17535.4.patch > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch, HIVE-17535.4.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17535: --- Status: Open (was: Patch Available) > Select 1 EXCEPT Select 1 fails with NPE > --- > > Key: HIVE-17535 > URL: https://issues.apache.org/jira/browse/HIVE-17535 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17535.1.patch, HIVE-17535.2.patch, > HIVE-17535.3.patch, HIVE-17535.4.patch > > > Since Hive CBO isn't able to handle queries with no table e.g. {{select 1}} > queries with SET operators fail (intersect requires CBO). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17535) Select 1 EXCEPT Select 1 fails with NPE
[ https://issues.apache.org/jira/browse/HIVE-17535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17535:
---
    Status: Patch Available (was: Open)

> Select 1 EXCEPT Select 1 fails with NPE
[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16168751#comment-16168751 ]

Vineet Garg commented on HIVE-17536:
---

This is for queries without a FROM clause, e.g. {{select 1}}. getBasicStats returns -1, and {{estimateRowSizeFromSchema}} then isn't able to produce an estimate since there is no data, so we end up returning -1. Previously, on encountering 0 rows, we instead returned 1. One way of handling this is to change getNumRows to return 1 instead of -1.

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
> Issue Type: Improvement
> Components: Statistics
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch
>
>
> This method returns zero in both of the following cases:
> * Statistics are missing in the metastore
> * Actual stats, e.g. number of rows, are zero
> It would be good for this method to return e.g. -1 in the absence of statistics instead of assuming zero.
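The distinction requested in HIVE-17536 (absent statistics versus a genuine zero) and the fallback suggested in the comment can be sketched as follows. This is a minimal Python illustration, not Hive's code: the helper names `get_basic_stat` and `num_rows_for_planning`, the `numRows` key, and the `estimated_from_schema` parameter are assumptions made for the example.

```python
from typing import Mapping, Optional

STATS_MISSING = -1        # sentinel meaning "no statistic recorded"
NUM_ROWS_KEY = "numRows"  # assumed parameter name, for illustration only


def get_basic_stat(params: Mapping[str, str], key: str = NUM_ROWS_KEY) -> int:
    """Return the stored statistic, or STATS_MISSING when it is absent or unusable.

    A stored "0" is returned as 0, so callers can tell an empty table
    apart from a table that has no statistics at all.
    """
    raw = params.get(key)
    if raw is None:
        return STATS_MISSING
    try:
        value = int(raw)
    except ValueError:
        return STATS_MISSING
    return value if value >= 0 else STATS_MISSING


def num_rows_for_planning(params: Mapping[str, str],
                          estimated_from_schema: Optional[int] = None) -> int:
    """Pick a row count for planning (hypothetical policy for this sketch).

    Known statistic -> use it; otherwise use an estimate if one exists;
    otherwise fall back to 1, e.g. for a table-less query such as `select 1`.
    """
    stat = get_basic_stat(params)
    if stat != STATS_MISSING:
        return stat
    if estimated_from_schema is not None and estimated_from_schema > 0:
        return estimated_from_schema
    return 1
```

With this shape, `get_basic_stat({})` and `get_basic_stat({"numRows": "0"})` return different values, and only the planning helper decides what to do when nothing is known.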
[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int
[ https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17553: --- Attachment: HIVE-17553.2.patch > CBO wrongly type cast decimal literal to int > > > Key: HIVE-17553 > URL: https://issues.apache.org/jira/browse/HIVE-17553 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17553.1.patch, HIVE-17553.2.patch > > > {code:sql}explain select 100.000BD from f{code} > {noformat} > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > TableScan > alias: f > Select Operator > expressions: 100 (type: int) > outputColumnNames: _col0 > ListSink > {noformat} > Notice that the expression 100.000BD is of type int instead of decimal. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int
[ https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17553: --- Status: Open (was: Patch Available) > CBO wrongly type cast decimal literal to int > > > Key: HIVE-17553 > URL: https://issues.apache.org/jira/browse/HIVE-17553 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17553.1.patch > > > {code:sql}explain select 100.000BD from f{code} > {noformat} > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > TableScan > alias: f > Select Operator > expressions: 100 (type: int) > outputColumnNames: _col0 > ListSink > {noformat} > Notice that the expression 100.000BD is of type int instead of decimal. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int
[ https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17553: --- Status: Patch Available (was: Open) > CBO wrongly type cast decimal literal to int > > > Key: HIVE-17553 > URL: https://issues.apache.org/jira/browse/HIVE-17553 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17553.1.patch, HIVE-17553.2.patch > > > {code:sql}explain select 100.000BD from f{code} > {noformat} > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > TableScan > alias: f > Select Operator > expressions: 100 (type: int) > outputColumnNames: _col0 > ListSink > {noformat} > Notice that the expression 100.000BD is of type int instead of decimal. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17602) Explain plan not working
[ https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17602:
---
    Attachment: HIVE-17602.3.patch

> Explain plan not working
> ---
>
> Key: HIVE-17602
> URL: https://issues.apache.org/jira/browse/HIVE-17602
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 3.0.0
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17602.1.patch, HIVE-17602.2.patch, HIVE-17602.3.patch
>
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}
> Error stack in hive.log
> {noformat}
> 2017-09-25T21:18:59,591 ERROR [726b5e51-f470-4a79-be8c-95b82a6aa85d main] exec.Task: Failed with exception wrong number of arguments
> java.lang.IllegalArgumentException: wrong number of arguments
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:896)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:774)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:797)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:635)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:968)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:569)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:954)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1052)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1197)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:275)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:220)
>     at org.apache.hadoop.hive.ql.exec.ExplainTask.execute(ExplainTask.java:368)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:204)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2190)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1832)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1549)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1304)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1294)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
>     at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
>     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> {noformat}
[jira] [Updated] (HIVE-17602) Explain plan not working
[ https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17602:
---
    Status: Patch Available (was: Open)

> Explain plan not working
[jira] [Updated] (HIVE-17602) Explain plan not working
[ https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17602:
---
    Status: Open (was: Patch Available)

> Explain plan not working
[jira] [Updated] (HIVE-16511) CBO looses inner casts on constants of complex type
[ https://issues.apache.org/jira/browse/HIVE-16511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16511: --- Attachment: HIVE-16511.2.patch > CBO looses inner casts on constants of complex type > --- > > Key: HIVE-16511 > URL: https://issues.apache.org/jira/browse/HIVE-16511 > Project: Hive > Issue Type: Bug > Components: CBO, Query Planning >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg > Attachments: HIVE-16511.1.patch, HIVE-16511.2.patch > > > type for map <10, cast(null as int)> becomes map-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16511) CBO looses inner casts on constants of complex type
[ https://issues.apache.org/jira/browse/HIVE-16511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16511: --- Status: Open (was: Patch Available) > CBO looses inner casts on constants of complex type > --- > > Key: HIVE-16511 > URL: https://issues.apache.org/jira/browse/HIVE-16511 > Project: Hive > Issue Type: Bug > Components: CBO, Query Planning >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg > Attachments: HIVE-16511.1.patch > > > type for map <10, cast(null as int)> becomes map-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16511) CBO looses inner casts on constants of complex type
[ https://issues.apache.org/jira/browse/HIVE-16511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16511: --- Status: Patch Available (was: Open) > CBO looses inner casts on constants of complex type > --- > > Key: HIVE-16511 > URL: https://issues.apache.org/jira/browse/HIVE-16511 > Project: Hive > Issue Type: Bug > Components: CBO, Query Planning >Reporter: Ashutosh Chauhan >Assignee: Vineet Garg > Attachments: HIVE-16511.1.patch, HIVE-16511.2.patch > > > type for map <10, cast(null as int)> becomes map-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17726) Using exists may lead to incorrect results
[ https://issues.apache.org/jira/browse/HIVE-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-17726: -- Assignee: Vineet Garg > Using exists may lead to incorrect results > -- > > Key: HIVE-17726 > URL: https://issues.apache.org/jira/browse/HIVE-17726 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Zoltan Haindrich >Assignee: Vineet Garg > > {code} > drop table if exists tx1; > create table tx1 (a integer,b integer); > insert into tx1 values (1, 1), > (1, 2), > (1, 3); > select count(*) as result,3 as expected from tx1 u > where exists (select * from tx1 v where u.a=v.a and u.b <> v.b); > select count(*) as result,3 as expected from tx1 u > where exists (select * from tx1 v where u.a=v.a and u.b <> v.b limit 1); > {code} > current results are 6 and 2. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
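To see why the expected result for the HIVE-17726 reproduction is 3, the EXISTS predicate can be evaluated by hand. The sketch below is plain Python over the three rows inserted into tx1, not anything Hive executes; the function names are made up for the illustration. It also computes the count a plain inner join on the same condition would give, for comparison.

```python
from typing import List, Tuple

Row = Tuple[int, int]  # (a, b)

# The rows inserted into tx1 in the reproduction script.
TX1: List[Row] = [(1, 1), (1, 2), (1, 3)]


def count_with_exists(rows: List[Row]) -> int:
    """Count outer rows u for which at least one v has u.a = v.a and u.b <> v.b.

    EXISTS is a yes/no test per outer row, so each u counts at most once
    no matter how many matching v rows there are.
    """
    return sum(
        1
        for (ua, ub) in rows
        if any(ua == va and ub != vb for (va, vb) in rows)
    )


def count_with_inner_join(rows: List[Row]) -> int:
    """Count (u, v) pairs satisfying the same condition, as an inner join would."""
    return sum(
        1
        for (ua, ub) in rows
        for (va, vb) in rows
        if ua == va and ub != vb
    )


exists_count = count_with_exists(TX1)
join_count = count_with_inner_join(TX1)
```

Here `exists_count` is 3, matching the `expected` column in the script, while `join_count` is 6. That the pair count equals the first reported wrong result is consistent with matches not being collapsed per outer row, though this sketch alone does not establish the cause.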
[jira] [Assigned] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
[ https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg reassigned HIVE-17767:
--

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> ---
>
> Key: HIVE-17767
> URL: https://issues.apache.org/jira/browse/HIVE-17767
> Project: Hive
> Issue Type: Improvement
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
>
> Currently such queries are rewritten into a group by + inner join with a value generator, which is inefficient. The value generator consists of a join with the outer query to fetch all correlated values. It could be eliminated completely if such queries were instead rewritten into a LEFT SEMI JOIN.
> Note that to do this, Hive first needs to support LEFT SEMI JOIN with non-equi conditions (HIVE-17766).
[jira] [Commented] (HIVE-17553) CBO wrongly type cast decimal literal to int
[ https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193898#comment-16193898 ]

Vineet Garg commented on HIVE-17553:
---

This happens because RexBuilder::makeExactLiteral creates an Integer/BigInteger literal if the scale happens to be zero. Hive probably needs to create the type on its own and pass it to makeExactLiteral to preserve the type.

> CBO wrongly type cast decimal literal to int
> ---
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Vineet Garg
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: f
>           Select Operator
>             expressions: 100 (type: int)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.
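The effect described in the HIVE-17553 comment, a literal's type being derived from its value so that a zero scale yields an integer type, can be imitated outside Hive and Calcite. The Python sketch below is only an analogy: `infer_type_from_value` and `make_literal` are hypothetical helpers, and the step that strips trailing zeros before the type is derived is an assumption about how `100.000BD` ends up with scale zero, not something stated in the issue.

```python
from decimal import Decimal
from typing import Optional, Tuple

INT_MIN, INT_MAX = -2**31, 2**31 - 1


def scale_of(value: Decimal) -> int:
    """Number of digits after the decimal point (never negative)."""
    return max(0, -value.as_tuple().exponent)


def infer_type_from_value(value: Decimal) -> str:
    """Hypothetical value-driven inference: scale 0 means an integer type.

    Assumes trailing zeros are stripped first, so 100.000 is seen as 100.
    """
    normalized = value.normalize()
    scale = scale_of(normalized)
    if scale == 0:
        return "int" if INT_MIN <= int(normalized) <= INT_MAX else "bigint"
    precision = max(len(normalized.as_tuple().digits), scale)
    return f"decimal({precision},{scale})"


def make_literal(value: Decimal,
                 declared_type: Optional[str] = None) -> Tuple[Decimal, str]:
    """Build a (value, type) pair; an explicitly supplied type always wins."""
    return value, declared_type or infer_type_from_value(value)
```

Calling `make_literal(Decimal("100.000"))` yields an `int`-typed literal in this sketch, whereas `make_literal(Decimal("100.000"), "decimal(6,3)")` keeps the declared type. The second call mirrors the fix proposed in the comment: the caller constructs the type itself and passes it along instead of letting it be inferred.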
[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int
[ https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17553: --- Status: Patch Available (was: Open) > CBO wrongly type cast decimal literal to int > > > Key: HIVE-17553 > URL: https://issues.apache.org/jira/browse/HIVE-17553 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17553.1.patch > > > {code:sql}explain select 100.000BD from f{code} > {noformat} > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > TableScan > alias: f > Select Operator > expressions: 100 (type: int) > outputColumnNames: _col0 > ListSink > {noformat} > Notice that the expression 100.000BD is of type int instead of decimal. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17553) CBO wrongly type cast decimal literal to int
[ https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17553: --- Attachment: HIVE-17553.1.patch > CBO wrongly type cast decimal literal to int > > > Key: HIVE-17553 > URL: https://issues.apache.org/jira/browse/HIVE-17553 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17553.1.patch > > > {code:sql}explain select 100.000BD from f{code} > {noformat} > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > TableScan > alias: f > Select Operator > expressions: 100 (type: int) > outputColumnNames: _col0 > ListSink > {noformat} > Notice that the expression 100.000BD is of type int instead of decimal. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17553) CBO wrongly type cast decimal literal to int
[ https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg reassigned HIVE-17553: -- Assignee: Vineet Garg > CBO wrongly type cast decimal literal to int > > > Key: HIVE-17553 > URL: https://issues.apache.org/jira/browse/HIVE-17553 > Project: Hive > Issue Type: Bug > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > > {code:sql}explain select 100.000BD from f{code} > {noformat} > STAGE PLANS: > Stage: Stage-0 > Fetch Operator > limit: -1 > Processor Tree: > TableScan > alias: f > Select Operator > expressions: 100 (type: int) > outputColumnNames: _col0 > ListSink > {noformat} > Notice that the expression 100.000BD is of type int instead of decimal. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17553) CBO wrongly type cast decimal literal to int
[ https://issues.apache.org/jira/browse/HIVE-17553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16193898#comment-16193898 ]

Vineet Garg edited comment on HIVE-17553 at 10/5/17 11:24 PM:
--

This happens because RexBuilder::makeExactLiteral (in Calcite) creates an Integer/BigInteger literal if the scale happens to be zero. Hive probably needs to create the type on its own and pass it to makeExactLiteral to preserve the type.

was (Author: vgarg):
This happens because RexBuilder::makeExactLiteral creates an Integer/BigInteger literal if the scale happens to be zero. Hive probably needs to create the type on its own and pass it to makeExactLiteral to preserve the type.

> CBO wrongly type cast decimal literal to int
> ---
>
> Key: HIVE-17553
> URL: https://issues.apache.org/jira/browse/HIVE-17553
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Vineet Garg
> Assignee: Vineet Garg
>
> {code:sql}explain select 100.000BD from f{code}
> {noformat}
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: f
>           Select Operator
>             expressions: 100 (type: int)
>             outputColumnNames: _col0
>             ListSink
> {noformat}
> Notice that the expression 100.000BD is of type int instead of decimal.
[jira] [Commented] (HIVE-17726) Using exists may lead to incorrect results
[ https://issues.apache.org/jira/browse/HIVE-17726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206298#comment-16206298 ]

Vineet Garg commented on HIVE-17726:
---

Hi [~pvary]. This does look like it was caused by HIVE-17726. Is there a JIRA associated with the failure? I'll update the test and commit it.

> Using exists may lead to incorrect results
> ---
>
> Key: HIVE-17726
> URL: https://issues.apache.org/jira/browse/HIVE-17726
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Zoltan Haindrich
> Assignee: Vineet Garg
> Attachments: HIVE-17726.1.patch, HIVE-17726.2.patch, HIVE-17726.3.patch
>
>
> {code}
> drop table if exists tx1;
> create table tx1 (a integer,b integer);
> insert into tx1 values (1, 1),
> (1, 2),
> (1, 3);
> select count(*) as result,3 as expected from tx1 u
> where exists (select * from tx1 v where u.a=v.a and u.b <> v.b);
> select count(*) as result,3 as expected from tx1 u
> where exists (select * from tx1 v where u.a=v.a and u.b <> v.b limit 1);
> {code}
> current results are 6 and 2.
[jira] [Commented] (HIVE-17756) Enable subquery related Qtests for Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-17756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206531#comment-16206531 ]

Vineet Garg commented on HIVE-17756:
---

[~dapengsun] Can you open a JIRA and regenerate the failing tests? I can take a look then.

> Enable subquery related Qtests for Hive on Spark
> ---
>
> Key: HIVE-17756
> URL: https://issues.apache.org/jira/browse/HIVE-17756
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer
> Reporter: Dapeng Sun
> Assignee: Dapeng Sun
> Fix For: 3.0.0
>
> Attachments: HIVE-17756.001.patch
>
>
> HIVE-15456 and HIVE-15192 use Calcite to decorrelate and plan subqueries.
> This JIRA is to introduce subquery tests and verify the subquery plans for Hive on Spark.