Hi Zhong, 

Is it possible that you are hitting the following Hive bug? You may want to 
upgrade your current Hive client.


https://issues.apache.org/jira/browse/HIVE-2095


Thanks
-Abdelrahman


Hortonworks, Inc.
Technical Support Engineer
Abdelrahman Shettia
ashet...@hortonworks.com
Office phone: (708) 689-9609
How am I doing?   Please feel free to provide feedback to my manager Rick 
Morris at r...@hortonworks.com


On Feb 6, 2013, at 5:28 AM, Zhong Wang <wangzhong....@gmail.com> wrote:

> Hi all,
> 
> I am running tests on Hive's auto convert join. From the source code, it seems 
> the conditional task considers the intermediate table size and runs the 
> local task that generates the hashtable for the intermediate table if it is 
> smaller than the hive.mapjoin.smalltable.filesize threshold. However, I 
> ran a very simple query based on TPC-H:
> 
> set hive.auto.convert.join=true;
> 
> insert overwrite table q3_tmp
> select c_custkey, o_orderkey, o_orderdate
> from orders o join customer c on c.c_mktsegment = 'BUILDING' and
> c.c_custkey = o.o_custkey
> join lineitem l on l.l_orderkey = o.o_orderkey
> where c.c_custkey < 1000;
> 
> The intermediate table of c join o is very small (50KB), which is much less 
> than the threshold. However, both map joins between the intermediate table and 
> lineitem are filtered out by the conditional task. Is this a bug in auto 
> convert join, or is something wrong with my usage/analysis?
> 
> Zhong
