Yes. We have not changed our script, and this only started appearing after we upgraded to the new version on the 24th.
Previously we were using HIVE 1.2.0 + TEZ 0.7.0.

Thanks!
Chi

From: Stephen Sprague [mailto:[email protected]]
Sent: Tuesday, March 1, 2016 12:31 PM
To: [email protected]
Subject: Re: Wrong column is picked in HIVE 2.0.0 + TEZ 0.8.2 left join

Very interesting. So this did work correctly on your previous distribution of these two products? May I ask what they were?

On Mon, Feb 29, 2016 at 8:24 PM, GAO Chi <[email protected]> wrote:

Hi all,

We encountered strange behavior after upgrading to HIVE 2.0.0 + TEZ 0.8.2.

I simplified our query to this:

SELECT
  a.key,
  a.a_one,
  b.b_one,
  a.a_zero,
  b.b_zero
FROM
  ( SELECT 11 key, 0 confuse_you, 1 a_one, 0 a_zero ) a
LEFT JOIN
  ( SELECT 11 key, 0 confuse_you, 1 b_one, 0 b_zero ) b
ON a.key = b.key
;

The above query generates this unexpected result:

INFO : Status: Running (Executing on YARN cluster with App id application_1456723490535_3653)
INFO : Map 1: 0/1 Map 2: 0/1
INFO : Map 1: 0/1 Map 2: 0(+1)/1
INFO : Map 1: 0(+1)/1 Map 2: 0(+1)/1
INFO : Map 1: 0(+1)/1 Map 2: 1/1
INFO : Map 1: 1/1 Map 2: 1/1
INFO : Completed executing command(queryId=hive_20160301115630_0a0dbee5-ba4b-45e7-b027-085f655640fd); Time taken: 10.225 seconds
INFO : OK
+--------+----------+----------+-----------+-----------+--+
| a.key  | a.a_one  | b.b_one  | a.a_zero  | b.b_zero  |
+--------+----------+----------+-----------+-----------+--+
| 11     | 1        | 0        | 0         | 1         |
+--------+----------+----------+-----------+-----------+--+

Note that b.b_one and b.b_zero come back swapped (0 and 1 instead of 1 and 0).

If you change the constant value of subquery b’s confuse_you column from 0 to 2, the problem disappears. The plan returned by EXPLAIN shows that the incorrect query picks _col1 and _col2 from subquery b, while the correct one picks _col2 and _col3.

It seems it cannot distinguish two columns with the same constant value. Has anyone encountered a similar problem?

Thanks!
Chi
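For anyone who wants to cross-check the expected result outside Hive, here is a minimal sketch that runs the same simplified query in an in-memory SQLite database using Python's standard sqlite3 module. SQLite is only a stand-in engine here (an assumption of this sketch, not something used in the original setup), the function name run_reference_query is hypothetical, and the aliases are written with explicit AS and a quoted "key" so the statement parses in SQLite. It says nothing about why the Hive planner picks the wrong column; it only confirms what the row should look like.

```python
import sqlite3

# The simplified query from the report. Differences from the Hive version:
# explicit AS aliases and a double-quoted "key" identifier, purely so that
# SQLite accepts it. SQLite is a stand-in engine (assumption of this sketch).
QUERY = """
SELECT a."key", a.a_one, b.b_one, a.a_zero, b.b_zero
FROM
  (SELECT 11 AS "key", 0 AS confuse_you, 1 AS a_one, 0 AS a_zero) a
LEFT JOIN
  (SELECT 11 AS "key", 0 AS confuse_you, 1 AS b_one, 0 AS b_zero) b
ON a."key" = b."key"
"""


def run_reference_query():
    """Run the query in an in-memory SQLite database and return all rows."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Expected: [(11, 1, 1, 0, 0)], i.e. b_one = 1 and b_zero = 0.
    print(run_reference_query())
```

The reference row is (11, 1, 1, 0, 0), whereas the HIVE 2.0.0 + TEZ 0.8.2 output above shows (11, 1, 0, 0, 1).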
