qingfa zhou created HIVE-23165:
----------------------------------

             Summary: Hive On Spark left join and right join generated inconsistent data
                 Key: HIVE-23165
                 URL: https://issues.apache.org/jira/browse/HIVE-23165
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 2.2.0
         Environment: hive: 2.3.0
                      spark: 2.2.0
                      hadoop: 2.7.3
            Reporter: qingfa zhou
            Assignee: Xuefu Zhang


*1) This is my SQL.*

    with delivery_day as (
      select *
      from (
        select dt, warehouse_code, b.sku_main_code, b.out_warehouse_code, b.is_pici_order
        from data_smartorder.dm_ordering_information_system_order_detail_parse t
        lateral view json_tuple(t.information_info, 'warehouse_code', 'sku_main_code', '调出仓', '是否预付商品') b
          as warehouse_code, sku_main_code, out_warehouse_code, is_pici_order
        where dt = date_format(date_sub(current_date, 1), 'yyyyMMdd')
          and l1_category_name = '策略配置'
          and l2_category_name = 'pb仓库补货仓品维度新'
          and b.is_pici_order = '1'
      ) t
    ),
    avg_sale_7 as (
      select *, sku_sale_quantity + first_dilivery_quantity as avg_sale_7
      from (
        select t1.warehouse_code, t1.warehouse_name, t1.sku_main_code, t1.sku_name sku_main_name,
               sum(t1.warehouse_dispatch_quantity) as warehouse_dispatch_quantity,
               sum(t1.sku_sale_quantity) as sku_sale_quantity,
               sum(t1.first_dilivery_quantity) as first_dilivery_quantity
        from data_smartorder.dw_ordering_warehouse_sku_cargo_delivery_data_di t1
        where t1.dt = date_format(date_sub(current_date, 1), 'yyyyMMdd')
        group by t1.warehouse_code, t1.warehouse_name, t1.sku_main_code, t1.sku_name
      ) t
    )
    select t1.warehouse_code, t1.sku_main_code, t1.out_warehouse_code, t2.avg_sale_7
    from delivery_day t1
    left join avg_sale_7 t2
      on t1.warehouse_code = t2.warehouse_code
     and t1.sku_main_code = t2.sku_main_code
    where t1.sku_main_code = '37010832'
      and t1.out_warehouse_code = '1011';

The left join and the right join generate inconsistent data.

2) Result of the left join

    7001    37010832    1011    26.8572
    1011    37010832    1011    130.2858
    2002    37010832    1011    40
    1701    37010832    1011    NULL

3) Result of the right join

    1011    37010832    1011    65.1429
    2002    37010832    1011    20
    7001    37010832    1011    13.4286

The results differ in the last column (avg_sale_7); the right join result is the correct one. By contrast, Hive on Tez and Spark SQL return consistent and correct results.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
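
To quantify the discrepancy between the two result sets, the reported rows can be compared by warehouse_code. This is only a minimal sketch: the values are copied from sections 2) and 3) of the report, the variable names are illustrative, and it says nothing about the cause of the bug.

```python
# Rows copied from the report: warehouse_code -> avg_sale_7 (None stands for NULL).
left_join = {"7001": 26.8572, "1011": 130.2858, "2002": 40.0, "1701": None}
right_join = {"1011": 65.1429, "2002": 20.0, "7001": 13.4286}

# Ratio of left join value to right join value for warehouses present in both outputs.
ratios = {
    code: round(left_join[code] / right_join[code], 4)
    for code in right_join
    if left_join.get(code) is not None
}

# Warehouses that appear only in the left join output.
left_only = sorted(set(left_join) - set(right_join))

print(ratios)
print(left_only)
```

For the figures in this report, each matched left join value comes out at twice the corresponding right join value, and warehouse 1701 appears only in the left join output, where its NULL is consistent with an unmatched row in a left outer join.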