qingfa zhou created HIVE-23165:
----------------------------------
Summary: Hive On Spark left join and right join generated
inconsistent data
Key: HIVE-23165
URL: https://issues.apache.org/jira/browse/HIVE-23165
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 2.2.0
Environment: hive :2.3.0
spark:2.2.0
hadoop:2.7.3
Reporter: qingfa zhou
Assignee: Xuefu Zhang
*1)This is my sql.*
with delivery_day as (
select * from (
select dt,warehouse_code,b.sku_main_code,b.out_warehouse_code,b.is_pici_order
from data_smartorder.dm_ordering_information_system_order_detail_parse t
lateral view
json_tuple(t.information_info,'warehouse_code','sku_main_code','调出仓','是否预付商品')b
as warehouse_code,sku_main_code,out_warehouse_code,is_pici_order
where dt=date_format(date_sub(current_date,1),'yyyyMMdd')
and l1_category_name='策略配置'
and l2_category_name='pb仓库补货仓品维度新'
and b.is_pici_order='1'
)t
),
avg_sale_7 as (
select *,sku_sale_quantity+first_dilivery_quantity as avg_sale_7
from (
select t1.warehouse_code,t1.warehouse_name,t1.sku_main_code,t1.sku_name
sku_main_name,
sum(t1.warehouse_dispatch_quantity) as warehouse_dispatch_quantity,
sum(t1.sku_sale_quantity) as sku_sale_quantity,
sum(t1.first_dilivery_quantity) as first_dilivery_quantity
from data_smartorder.dw_ordering_warehouse_sku_cargo_delivery_data_di t1
where t1.dt=date_format(date_sub(current_date,1),'yyyyMMdd')
group by t1.warehouse_code,t1.warehouse_name,t1.sku_main_code,t1.sku_name
)t
)
select t1.warehouse_code,t1.sku_main_code,t1.out_warehouse_code,
t2.avg_sale_7
from delivery_day t1
left join avg_sale_7 t2
on t1.warehouse_code=t2.warehouse_code
and t1.sku_main_code=t2.sku_main_code
where t1.sku_main_code='37010832'
and t1.out_warehouse_code='1011';
left join and right join generated inconsistent data.
2) result in the left join
7001 37010832 1011 26.8572
1011 37010832 1011 130.2858
2002 37010832 1011 40
1701 37010832 1011 NULL
3) result in the right join
1011 37010832 1011 65.1429
2002 37010832 1011 20
7001 37010832 1011 13.4286
Inconsistent results in last column,'right join' 's result is right.But the
results of hive on tez and sparksql are consistent and is true.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)