> It has innumerable no of joins. Since its client specific query, u understand I cannot share. Sorry about that
Like I said, joins are slow and, if not done correctly, can have terrible performance. Which techniques help depends on exactly how you are trying to perform the join. For instance, if you are joining a smaller table to a larger one, a map join could work well for you: the smaller table is kept in memory while the join is performed. Also, if you are able to break your tables down into smaller buckets, you might be able to use a bucketed map join. The following links should be helpful [1][2].

Hope this helps.

[1] https://cwiki.apache.org/confluence/display/Hive/LanguageManual+JoinOptimization
[2] http://stackoverflow.com/questions/20199077/hive-efficient-join-of-two-tables

On Fri, May 30, 2014 at 5:38 PM, <shouvanik.hal...@accenture.com> wrote:

> Pls find the answers
>
> *From:* kulkarni.swar...@gmail.com [mailto:kulkarni.swar...@gmail.com]
> *Sent:* Friday, May 30, 2014 3:34 PM
> *To:* user@hive.apache.org
> *Subject:* Re: Need urgent help on hive query performance
>
> I feel it's pretty hard to answer this without understanding the following:
>
> 1. What exactly are you trying to query? CSV? Avro? ....
>
> HIVE table
>
> 2. Where is your data? HDFS? HBase? Local filesystem?
>
> Data is in s3
>
> 3. What version of hive are you using?
>
> Hive 0.12
>
> 4. What is an example of a query that is slow? Some queries like joins and stuff would be inherently slower than other simpler ones(though can be optimized).
>
> It has innumerable no of joins. Since its client specific query, u understand I cannot share. Sorry about that
>
> Thanks,
>
> --
> Swarnim
>
> On Fri, May 30, 2014 at 5:32 PM, <shouvanik.hal...@accenture.com> wrote:
>
> Can you please give a specific example or blog to refer to. I did not understand
>
> *From:* Ashish Garg [mailto:gargcreation1...@gmail.com]
> *Sent:* Friday, May 30, 2014 3:31 PM
> *To:* user@hive.apache.org
> *Subject:* Re: Need urgent help on hive query performance
>
> try partitioning the table and run the queries which are partition specific. Hope this helps.
>
> Thanks and Regards,
> Ashish Garg.
>
> On Fri, May 30, 2014 at 6:05 PM, <shouvanik.hal...@accenture.com> wrote:
>
> Hi,
>
> Does anybody help urgently on optimizing hive query performance? I am looking more Hadoop tuning point of view. Currently, small amount of table takes much time to query?
>
> We are running EMR cluster with 1 MASTER node, 2 Core Nodes and Task Nodes.
>
> Quick help is much appreciated.
>
> Thanks,
> Shouvanik
>
> --
> Swarnim

--
Swarnim
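
P.S. In case it helps to see what the map join mentioned above does conceptually, here is a minimal Python sketch. It is not Hive code and not a description of how Hive implements the join internally; the function name `map_join` and the two sample tables are made up purely for illustration. The idea is only that the small table is loaded into an in-memory hash table once, and the large table is then streamed past it.

```python
from collections import defaultdict


def map_join(small_rows, large_rows, key):
    """Inner-join two lists of dicts on `key`, hashing the small side in memory."""
    # Build phase: load the whole small table into an in-memory hash table.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    # Probe phase: stream the large table once and look each row up by key.
    for big in large_rows:
        for small in lookup.get(big[key], []):
            yield {**small, **big}


# Hypothetical sample data: a small dimension table and a larger fact table.
countries = [
    {"code": "IN", "country": "India"},
    {"code": "US", "country": "United States"},
]
orders = [
    {"order_id": 1, "code": "US"},
    {"order_id": 2, "code": "IN"},
    {"order_id": 3, "code": "FR"},  # no matching country, dropped by the inner join
]

joined = list(map_join(countries, orders, "code"))
for row in joined:
    print(row)
```

This only pays off when the small side really fits in memory, which is why it matters which of your tables is the smaller one. In Hive itself you would still write the join in HiveQL; link [1] above covers how Hive decides when to use this strategy.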