xinzhang created HIVE-18351:
-------------------------------
Summary: hive vectorize bug ' set
hive.vectorized.execution.enabled=true'
Key: HIVE-18351
URL: https://issues.apache.org/jira/browse/HIVE-18351
Project: Hive
Issue Type: Bug
Components: hpl/sql
Affects Versions: 1.1.0
Environment: centos6.X cdh5.7 hive-1.1.0-cdh5.7.1
Reporter: xinzhang
hive shell
>set hive.vectorized.execution.enabled=true;
>select * from (SELECT a.created_day send_date, a.task_id, count(*) send_cnt
>FROM dwd_edm_mail_send_logger_d a WHERE a.pt <='20171228' GROUP BY
>a.created_day, a.task_id) p1 LEFT OUTER JOIN(SELECT a.task_id, count(CASE WHEN
>a.behavior=8 THEN 1 ELSE NULL END) hard_bounce_cnt, count(CASE WHEN
>a.behavior=7 THEN 1 ELSE NULL END) soft_bounce_cnt FROM
>dwd_edm_mail_bounce_logger_d a WHERE a.pt <='20171228' GROUP BY a.task_id) p2
>ON p1.task_id=p2.task_id;
---------------------------------------------------------------------------
OK
2017-12-21 612 3052 612 43 43
2017-12-22 614 3274 614 56 56
2017-12-23 615 2267 615 30 30
2017-12-24 616 2126 616 47 47
2017-12-28 617 3919 617 60 60
---------------------------------------------------------------------------
>set hive.vectorized.execution.enabled=false;
>select * from (SELECT a.created_day send_date, a.task_id, count(*) send_cnt
>FROM dwd_edm_mail_send_logger_d a WHERE a.pt <='20171228' GROUP BY
>a.created_day, a.task_id) p1 LEFT OUTER JOIN(SELECT a.task_id, count(CASE WHEN
>a.behavior=8 THEN 1 ELSE NULL END) hard_bounce_cnt, count(CASE WHEN
>a.behavior=7 THEN 1 ELSE NULL END) soft_bounce_cnt FROM
>dwd_edm_mail_bounce_logger_d a WHERE a.pt <='20171228' GROUP BY a.task_id) p2
>ON p1.task_id=p2.task_id;
---------------------------------------------------------------------------
OK
2017-12-21 612 3052 612 35 8
2017-12-22 614 3274 614 44 12
2017-12-23 615 2267 615 21 9
2017-12-24 616 2126 616 25 22
2017-12-28 617 3919 617 42 18
---------------------------------------------------------------------------
sum: the second result is right . set hive.vectorized.execution.enabled=true;
make the result error.
-------------------
another try.
>SELECT a.created_day send_date, a.task_id, count(*) send_cnt FROM
>dwd_edm_mail_send_logger_d a WHERE a.pt <='20171228' GROUP BY a.created_day,
>a.task_id) p1 LEFT OUTER JOIN(SELECT a.task_id, count(CASE WHEN a.behavior=8
>THEN 1 END) hard_bounce_cnt, count(CASE WHEN a.behavior=7 THEN 1 END)
>soft_bounce_cnt FROM dwd_edm_mail_bounce_logger_d a WHERE a.pt <='20171228'
>GROUP BY a.task_id) p2 ON p1.task_id=p2.task_id;
>set hive.vectorized.execution.enabled=false;
--------------------------------------------------------------------
OK
2017-12-21 612 3052 612 35 8
2017-12-22 614 3274 614 44 12
2017-12-23 615 2267 615 21 9
2017-12-24 616 2126 616 25 22
2017-12-28 617 3919 617 42 18
>set hive.vectorized.execution.enabled=true;
--------------------------------------------------------------------
OK
2017-12-21 612 3052 612 35 8
2017-12-22 614 3274 614 44 12
2017-12-23 615 2267 615 21 9
2017-12-24 616 2126 616 25 22
2017-12-28 617 3919 617 42 18
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)