Re: Re: sql mapjoin very slow

2015-08-30 Thread r7raul1...@163.com
) 4 UDFToDouble(attr_detl_id) (type: double)
r7raul1...@163.com
From: Gopal Vijayaraghavan
Date: 2015-08-29 01:45
To: user
Subject: Re: sql mapjoin very slow
I have a question. I use Hive 1.1.0, so the default value of hive.stats.dbclass is fs. That means statistics are stored in the local filesystem

Re: Re: sql mapjoin very slow

2015-08-28 Thread r7raul1...@163.com
statistics in local filesystem. Can anyone tell me the file path where the statistics are stored?
r7raul1...@163.com
From: r7raul1...@163.com
Date: 2015-08-28 13:03
To: user
Subject: Re: Re: sql mapjoin very slow
I increased hive.hashtable.initialCapacity to 100 and decreased

Re: sql mapjoin very slow

2015-08-28 Thread Sergey Shelukhin
@hive.apache.org
Subject: Re: Re: sql mapjoin very slow
I found a method in the HashMapWrapper class. I think Hive will use statistics to adjust the threshold automatically.
public static int calculateTableSize( float keyCountAdj, int threshold, float loadFactor, long keyCount) { if (keyCount >= 0

Re: sql mapjoin very slow

2015-08-28 Thread Gopal Vijayaraghavan
I have a question. I use Hive 1.1.0, so the default value of hive.stats.dbclass is fs. That means statistics are stored in the local filesystem. Can anyone tell me the file path where the statistics are stored?
The statistics aren't stored in the file system long term - the final destination for stats is the metastore.
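As a hedged illustration of the reply above, the sketch below gathers statistics for a table and then reads them back from the metastore rather than from a file path. The table name small_side_table is hypothetical, and the statements are ordinary HiveQL, not something quoted from this thread.

```sql
-- Assumption: small_side_table is a placeholder name, not a table from this thread.
-- hive.stats.dbclass is set explicitly only to mirror the setting discussed above.
SET hive.stats.dbclass=fs;

-- Compute basic table statistics and publish them to the metastore.
ANALYZE TABLE small_side_table COMPUTE STATISTICS;

-- Compute column-level statistics as well.
ANALYZE TABLE small_side_table COMPUTE STATISTICS FOR COLUMNS;

-- Read the stored statistics back: look for numRows, rawDataSize and totalSize
-- under "Table Parameters" in the output.
DESCRIBE FORMATTED small_side_table;
```

If numRows is missing or zero after the ANALYZE statements, the statistics probably did not reach the metastore, which would be worth checking before tuning the hash table settings.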

Re: Re: sql mapjoin very slow

2015-08-27 Thread r7raul1...@163.com
I increased hive.hashtable.initialCapacity to 100 and decreased hive.hashtable.loadfactor to 0.5. The query now runs faster.
r7raul1...@163.com
From: Sergey Shelukhin
Date: 2015-08-28 09:56
To: user
Subject: Re: sql mapjoin very slow
Is the small-side table large, does it have a lot of rows

Re: sql mapjoin very slow

2015-08-27 Thread Sergey Shelukhin
@hive.apache.org
Date: Thursday, August 27, 2015 at 18:51
To: user <user@hive.apache.org>
Subject: Re: Re: sql mapjoin very slow
I use MR. My mapjoin config is shown in the following picture: [inline images not preserved]

Re: sql mapjoin very slow

2015-08-27 Thread Sergey Shelukhin
Are you using MR or Tez? You could try the optimized hash table in the case of Tez, although it’s supposed to improve memory use, not necessarily performance. Can you also share the characteristics of the query and data? It is surprising to see so much time spent in HashMap.get. From:

Re: Re: sql mapjoin very slow

2015-08-27 Thread r7raul1...@163.com
I use MR. My mapjoin config is shown in the following picture:
r7raul1...@163.com
From: Sergey Shelukhin
Date: 2015-08-28 09:21
To: user
Subject: Re: sql mapjoin very slow
Are you using MR or Tez? You could try the optimized hash table in the case of Tez, although it’s supposed to improve memory