[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560457#comment-14560457 ]

Ashutosh Chauhan commented on HIVE-10716:
-----------------------------------------

[~gopalv] I need to verify, but my guess is that https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java#L80 is coming into play here.

> Fold case/when udf for expression involving nulls in filter operator.
> ---------------------------------------------------------------------
>
>                 Key: HIVE-10716
>                 URL: https://issues.apache.org/jira/browse/HIVE-10716
>             Project: Hive
>          Issue Type: New Feature
>          Components: Logical Optimizer
>    Affects Versions: 1.2.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>             Fix For: 1.2.1
>
>         Attachments: HIVE-10716.patch
>
>
> From HIVE-10636 comments, more folding is possible.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560403#comment-14560403 ]

Gopal V commented on HIVE-10716:
--------------------------------

The easiest fix to the problem seems to be an additional filter expr to produce an AND().

{code}
hive> explain select avg(ss_sold_date_sk) from store_sales where (case ss_sold_date when '1998-01-02' then 1 else null end)=1;

      Map Operator Tree:
          TableScan
            alias: store_sales
            filterExpr: CASE (ss_sold_date) WHEN ('1998-01-02') THEN (true) ELSE (null) END (type: int)
            Statistics: Num rows: 2474913 Data size: 9899654 Basic stats: COMPLETE Column stats: COMPLETE
{code}

vs

{code}
hive> explain select avg(ss_sold_date_sk) from store_sales where (case ss_sold_date when '1998-01-02' then 1 else null end)=1 and ss_sold_time_Sk > 0;

      Map Operator Tree:
          TableScan
            alias: store_sales
            filterExpr: ((ss_sold_date = '1998-01-02') and (ss_sold_time_sk > 0)) (type: boolean)
            Statistics: Num rows: 1237456 Data size: 9899654 Basic stats: COMPLETE Column stats: COMPLETE
            Filter Operator
              predicate: (ss_sold_time_sk > 0) (type: boolean)
{code}

[~ashutoshc]: any idea why the extra filter helps in fixing the PPD case?

> Fold case/when udf for expression involving nulls in filter operator.
> ---------------------------------------------------------------------
>
>                 Key: HIVE-10716
>                 URL: https://issues.apache.org/jira/browse/HIVE-10716
>             Project: Hive
>          Issue Type: New Feature
>          Components: Logical Optimizer
>    Affects Versions: 1.3.0
>            Reporter: Ashutosh Chauhan
>            Assignee: Ashutosh Chauhan
>         Attachments: HIVE-10716.patch
>
>
> From HIVE-10636 comments, more folding is possible.
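Editorial note: the two plans above treat `(case ss_sold_date when '1998-01-02' then 1 else null end)=1` and `ss_sold_date = '1998-01-02'` as interchangeable inside a filter. The sketch below is a toy model of SQL three-valued logic, not Hive code. All function names are hypothetical, and it assumes the usual rule that a WHERE clause keeps a row only when the predicate is TRUE. It illustrates why the rewrite is safe in a filter even though the two expressions can differ in value (NULL vs FALSE).

```python
from typing import Optional

def sql_eq(a, b) -> Optional[bool]:
    """SQL '=': yields NULL (None) when either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

def case_when(subject, match, then_value, else_value):
    """Simple form: CASE subject WHEN match THEN then_value ELSE else_value END."""
    return then_value if sql_eq(subject, match) is True else else_value

def passes_filter(predicate: Optional[bool]) -> bool:
    """A filter keeps a row only when the predicate is TRUE (NULL is rejected)."""
    return predicate is True

def original(ss_sold_date):
    # (case ss_sold_date when '1998-01-02' then 1 else null end) = 1
    return sql_eq(case_when(ss_sold_date, '1998-01-02', 1, None), 1)

def folded(ss_sold_date):
    # ss_sold_date = '1998-01-02'
    return sql_eq(ss_sold_date, '1998-01-02')

# A matching value, a non-matching value, and NULL.
samples = ['1998-01-02', '1998-01-03', None]
results = [(passes_filter(original(d)), passes_filter(folded(d))) for d in samples]
```

For a non-matching date, `original` evaluates to NULL while `folded` evaluates to FALSE, so the two forms are only equivalent where NULL and FALSE are treated alike, which is the case for a filter predicate.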
[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560400#comment-14560400 ]

Gopal V commented on HIVE-10716:
--------------------------------

[~ashutoshc]: LGTM - +1 for the count(1) case, but it looks really odd that the {{TableScan::filterExpr}} is not getting folded for this.

TableScan FilterExpr is populated before this folding happens, so it might just be an optimization ordering issue?

{code}
hive> explain select count(1) from store_sales where (case ss_sold_date when 'x' then 1 else null end)=1;

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
      DagName: gopal_20150526214205_80c41d84-1694-47e9-ab24-144f8007b187:13
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: CASE (ss_sold_date) WHEN ('x') THEN (true) ELSE (null) END (type: int)
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                  Filter Operator
                    predicate: (ss_sold_date = 'x') (type: boolean)
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                    Select Operator
                      Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                      Group By Operator
                        aggregations: count(1)
                        mode: hash
                        outputColumnNames: _col0
                        Statistics: Num rows: 1 Data size: 93 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          sort order:
                          Statistics: Num rows: 1 Data size: 93 Basic stats: COMPLETE Column stats: COMPLETE
                          value expressions: _col0 (type: bigint)
            Execution mode: vectorized
        Reducer 2
            Reduce Operator Tree:
              Group By Operator
                aggregations: count(VALUE._col0)
{code}
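Editorial note: the comment above guesses that the TableScan filterExpr is populated before the folding runs. The sketch below is a toy model of that hypothesis only. The pass names and the plan layout are hypothetical and do not correspond to Hive's actual optimizer classes. It shows how running a pushdown step before a folding step would leave an unfolded expression on the scan while the Filter Operator gets the folded one.

```python
import copy

UNFOLDED = "CASE (ss_sold_date) WHEN ('x') THEN (true) ELSE (null) END"
FOLDED = "(ss_sold_date = 'x')"

def push_predicate_to_scan(plan):
    """Hypothetical pass: copy the current filter predicate into the scan's filterExpr."""
    plan["table_scan"]["filterExpr"] = plan["filter"]["predicate"]

def fold_case_when(plan):
    """Hypothetical pass: fold the CASE/WHEN, touching only the Filter Operator."""
    if plan["filter"]["predicate"] == UNFOLDED:
        plan["filter"]["predicate"] = FOLDED

def optimize(plan, passes):
    """Apply the passes in order to a copy of the plan and return the result."""
    plan = copy.deepcopy(plan)
    for apply_pass in passes:
        apply_pass(plan)
    return plan

initial = {"table_scan": {"filterExpr": None}, "filter": {"predicate": UNFOLDED}}

# Pushdown first: the scan keeps a snapshot of the unfolded predicate.
pushdown_first = optimize(initial, [push_predicate_to_scan, fold_case_when])

# Folding first: the scan receives the already folded predicate.
fold_first = optimize(initial, [fold_case_when, push_predicate_to_scan])
```

In this model, `pushdown_first` reproduces the shape of the plan quoted above (unfolded filterExpr, folded predicate), while `fold_first` folds both. Whether Hive's real pass ordering behaves this way is exactly what the thread leaves to be verified.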
[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545016#comment-14545016 ]

Ashutosh Chauhan commented on HIVE-10716:
-----------------------------------------

[~gopalv] Can you take a look?
[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.
[ https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544965#comment-14544965 ]

Hive QA commented on HIVE-10716:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12733023/HIVE-10716.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8939 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3902/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3902/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3902/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12733023 - PreCommit-HIVE-TRUNK-Build