[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-26 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560457#comment-14560457
 ] 

Ashutosh Chauhan commented on HIVE-10716:
-

[~gopalv] I need to verify, but my guess is that 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java#L80
 is coming into play here.

> Fold case/when udf for expression involving nulls in filter operator.
> -
>
> Key: HIVE-10716
> URL: https://issues.apache.org/jira/browse/HIVE-10716
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Affects Versions: 1.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 1.2.1
>
> Attachments: HIVE-10716.patch
>
>
> From HIVE-10636 comments, more folding is possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560403#comment-14560403
 ] 

Gopal V commented on HIVE-10716:


The easiest fix for the problem seems to be adding an extra filter expression, 
so that the predicate becomes an AND():
{code}
hive> explain select avg(ss_sold_date_sk) from store_sales where (case ss_sold_date when '1998-01-02' then 1 else null end)=1;

Map Operator Tree:
    TableScan
      alias: store_sales
      filterExpr: CASE (ss_sold_date) WHEN ('1998-01-02') THEN (true) ELSE (null) END (type: int)
      Statistics: Num rows: 2474913 Data size: 9899654 Basic stats: COMPLETE Column stats: COMPLETE
{code}

vs

{code}
hive> explain select avg(ss_sold_date_sk) from store_sales where (case ss_sold_date when '1998-01-02' then 1 else null end)=1 and ss_sold_time_Sk > 0;

Map Operator Tree:
    TableScan
      alias: store_sales
      filterExpr: ((ss_sold_date = '1998-01-02') and (ss_sold_time_sk > 0)) (type: boolean)
      Statistics: Num rows: 1237456 Data size: 9899654 Basic stats: COMPLETE Column stats: COMPLETE
      Filter Operator
        predicate: (ss_sold_time_sk > 0) (type: boolean)
{code}

[~ashutoshc]: any idea why the extra filter helps fix the PPD (predicate pushdown) case?
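
As a side note on why the folded form in the second plan is safe: a filter keeps a row only when its predicate evaluates to TRUE, so a NULL result drops the row just as FALSE does. The sketch below is a minimal illustration of that equivalence, not Hive's folding code. The class and method names are hypothetical, and it does not attempt to explain why the extra AND changes the plan.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical helper names, not Hive's constant-folding code. */
final class CaseWhenFoldSketch {

    /** CASE col WHEN whenLit THEN thenVal ELSE NULL END; a NULL col never matches. */
    static Integer caseWhen(String col, String whenLit, Integer thenVal) {
        return (col != null && col.equals(whenLit)) ? thenVal : null;
    }

    /** SQL '=' under three-valued logic: the result is NULL if either side is NULL. */
    static Boolean sqlEquals(Object left, Object right) {
        if (left == null || right == null) {
            return null;
        }
        return left.equals(right);
    }

    /** A filter keeps a row only when the predicate is TRUE; NULL and FALSE both drop it. */
    static boolean passesFilter(Boolean predicate) {
        return Boolean.TRUE.equals(predicate);
    }

    /** (case ss_sold_date when '1998-01-02' then 1 else null end) = 1 */
    static boolean original(String ssSoldDate) {
        return passesFilter(sqlEquals(caseWhen(ssSoldDate, "1998-01-02", 1), 1));
    }

    /** ss_sold_date = '1998-01-02' */
    static boolean folded(String ssSoldDate) {
        return passesFilter(sqlEquals(ssSoldDate, "1998-01-02"));
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("1998-01-02", "1998-01-03", null);
        for (String s : samples) {
            System.out.println(s + " -> original=" + original(s) + ", folded=" + folded(s));
        }
    }
}
```

Both predicates accept exactly the same rows for a matching value, a non-matching value, and a NULL column, which is the property the fold depends on when the expression sits in a filter.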

> Fold case/when udf for expression involving nulls in filter operator.
> -
>
> Key: HIVE-10716
> URL: https://issues.apache.org/jira/browse/HIVE-10716
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Affects Versions: 1.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-10716.patch
>
>
> From HIVE-10636 comments, more folding is possible.





[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14560400#comment-14560400
 ] 

Gopal V commented on HIVE-10716:


[~ashutoshc]: LGTM, +1 for the count(1) case, but it looks really odd that the 
{{TableScan::filterExpr}} is not getting folded here.

The TableScan filterExpr is populated before this folding happens, so it might just 
be an optimization-ordering issue?

{code}
hive> explain select count(1) from store_sales where (case ss_sold_date when 'x' then 1 else null end)=1;

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE)
      DagName: gopal_20150526214205_80c41d84-1694-47e9-ab24-144f8007b187:13
      Vertices:
        Map 1 
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: CASE (ss_sold_date) WHEN ('x') THEN (true) ELSE (null) END (type: int)
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                  Filter Operator
                    predicate: (ss_sold_date = 'x') (type: boolean)
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                    Select Operator
                      Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
                      Group By Operator
                        aggregations: count(1)
                        mode: hash
                        outputColumnNames: _col0
                        Statistics: Num rows: 1 Data size: 93 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          sort order: 
                          Statistics: Num rows: 1 Data size: 93 Basic stats: COMPLETE Column stats: COMPLETE
                          value expressions: _col0 (type: bigint)
            Execution mode: vectorized
        Reducer 2 
            Reduce Operator Tree:
              Group By Operator
                aggregations: count(VALUE._col0)
{code}
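
To make the ordering hypothesis concrete, here is a toy model of it, assuming (as suggested above, but not verified against the Hive source) that the TableScan filterExpr is a snapshot of the predicate taken by one pass, and that folding later rewrites only the Filter Operator predicate. All class, field, and method names are hypothetical.

```java
/** Illustrative only: hypothetical names, not Hive's Optimizer or operator classes. */
final class PassOrderingSketch {

    static final String UNFOLDED = "CASE (ss_sold_date) WHEN ('x') THEN (1) ELSE (null) END = 1";
    static final String FOLDED = "(ss_sold_date = 'x')";

    /** A toy plan: the Filter Operator predicate plus the copy attached to the TableScan. */
    static final class Plan {
        String filterPredicate = UNFOLDED;
        String tableScanFilterExpr; // stays null until the pushdown pass runs
    }

    /** Toy pushdown pass: snapshots the current predicate onto the scan. */
    static void pushDownToTableScan(Plan plan) {
        plan.tableScanFilterExpr = plan.filterPredicate;
    }

    /** Toy folding pass: rewrites only the Filter Operator predicate. */
    static void foldCaseWhen(Plan plan) {
        if (UNFOLDED.equals(plan.filterPredicate)) {
            plan.filterPredicate = FOLDED;
        }
    }

    /** Runs the two toy passes in the requested order and returns the resulting plan. */
    static Plan optimize(boolean foldBeforePushDown) {
        Plan plan = new Plan();
        if (foldBeforePushDown) {
            foldCaseWhen(plan);
            pushDownToTableScan(plan);
        } else {
            pushDownToTableScan(plan);
            foldCaseWhen(plan);
        }
        return plan;
    }

    public static void main(String[] args) {
        for (boolean foldFirst : new boolean[] {false, true}) {
            Plan plan = optimize(foldFirst);
            System.out.println("foldFirst=" + foldFirst
                + " filterExpr=" + plan.tableScanFilterExpr
                + " predicate=" + plan.filterPredicate);
        }
    }
}
```

With pushdown first, the scan keeps the unfolded CASE expression while the Filter Operator predicate is folded, which matches the shape of the plan above. With folding first, both carry the folded predicate.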



[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14545016#comment-14545016
 ] 

Ashutosh Chauhan commented on HIVE-10716:
-

[~gopalv] Can you take a look?



[jira] [Commented] (HIVE-10716) Fold case/when udf for expression involving nulls in filter operator.

2015-05-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544965#comment-14544965
 ] 

Hive QA commented on HIVE-10716:




{color:red}Overall{color}: -1, at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12733023/HIVE-10716.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8939 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3902/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3902/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3902/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12733023 - PreCommit-HIVE-TRUNK-Build
