Hello Abdelrhman,

I am using Hive 0.8.1, which already includes the fix for HIVE-2095. I think this is an integer overflow bug in AliasFileSizePair, an inner class of ConditionalResolverCommonJoin. Its compareTo() method may overflow:

    public int compareTo(AliasFileSizePair o) {
      if (o == null) {
        return 1;
      }
      return (int)(size - o.size);
    }

because size and o.size are long integers, so casting their difference to int can truncate it and flip its sign. Can anyone confirm this?

Zhong

On Thu, Feb 7, 2013 at 4:40 AM, Abdelrhman Shettia <ashet...@hortonworks.com> wrote:

> Hi Zhong,
>
> It is possible that you are facing the following hive bug? You may want to
> upgrade the current hive client.
>
> https://issues.apache.org/jira/browse/HIVE-2095
>
> Thanks
> -Abdelrhman
>
> Hortonworks, Inc.
> Technical Support Engineer
> Abdelrahman Shettia
> ashet...@hortonworks.com
> Office phone: (708) 689-9609
> How am I doing? Please feel free to provide feedback to my manager Rick
> Morris at r...@hortonworks.com
>
> On Feb 6, 2013, at 5:28 AM, Zhong Wang <wangzhong....@gmail.com> wrote:
>
> Hi all,
>
> I am running tests on Hive auto convert join. From the source code, it
> seems the conditional task will consider the intermediate table size and
> run the local task for generating hashtable on the intermediate table if it
> is smaller than the threshold of hive.mapjoin.smalltable.filesize. However,
> I ran a very simple query based on TPC-H:
>
> set hive.auto.convert.join=true;
>
> insert overwrite table q3_tmp
> select c_custkey, o_orderkey, o_orderdate
> from orders o join customer c on c.c_mktsegment = 'BUILDING' and
> c.c_custkey = o.o_custkey
> join lineitem l on l.l_orderkey = o.o_orderkey
> where c.c_custkey < 1000;
>
> The intermediate table of c join o is very small (50KB), which is much
> less than the threshold. However, both the map joins of the intermediate
> table and lineitem are filtered by conditional task. Is this a bug of auto
> convert join or something wrong with my usage/analysis?
>
> Zhong
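
To make the suspected overflow in the compareTo() quoted at the top of this message concrete, here is a minimal standalone sketch. It is not Hive code and not the actual patch: SizePair, narrowingCompare and CompareOverflowSketch are hypothetical names made up for illustration, and the 3 GB size is an assumed value for a large table. It reproduces the same (int)(a - b) arithmetic and shows one possible alternative using Long.compare from the JDK.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for AliasFileSizePair; only the fields needed here.
class SizePair implements Comparable<SizePair> {
    final String alias;
    final long size;

    SizePair(String alias, long size) {
        this.alias = alias;
        this.size = size;
    }

    // Same arithmetic as the quoted compareTo(): the long difference is
    // narrowed to int, which keeps only the low 32 bits.
    static int narrowingCompare(long a, long b) {
        return (int) (a - b);
    }

    // One possible overflow-safe ordering (an assumption, not Hive's fix).
    @Override
    public int compareTo(SizePair o) {
        if (o == null) {
            return 1;
        }
        return Long.compare(size, o.size);
    }
}

class CompareOverflowSketch {
    public static void main(String[] args) {
        long small = 50L * 1024;               // 50 KB intermediate table
        long large = 3L * 1024 * 1024 * 1024;  // assumed 3 GB table

        // Prints 1073793024: positive, so the 50 KB input looks "larger".
        System.out.println(SizePair.narrowingCompare(small, large));

        // Prints 0: sizes exactly 4 GiB apart look equal.
        System.out.println(SizePair.narrowingCompare(0L, 4L << 30));

        List<SizePair> pairs = new ArrayList<>();
        pairs.add(new SizePair("lineitem", large));
        pairs.add(new SizePair("c_join_o", small));
        Collections.sort(pairs); // uses the Long.compare-based ordering
        System.out.println(pairs.get(0).alias); // smallest input first
    }
}
```

If the resolver sorts inputs with the narrowing version, a small intermediate table could end up ordered as if it were the big one, which would be consistent with the map joins being filtered out. I have not verified that this is the only cause.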