Using ORC is a very good idea. Also, since you are using Hive 0.13, you might want to try Tez as the execution engine and enable vectorization. The following link is a good guide for that:

http://hortonworks.com/hadoop-tutorial/supercharging-interactive-queries-hive-tez/

On Jul 31, 2014 10:26 AM, "Hussain Jamali" wrote:
> I think the first thing you should consider is changing the file format
> from RC to ORC.
>
> With newer Hadoop versions, ORC is a much more optimized format than RC.
> It will give you far better compression and performance than RC files.
>
> Go for Zlib compression + ORC file format.
>
> Regards,
> Hussain Jamali
> AMDOCS
>
> From: Natarajan, Prabakaran 1. (NSN - IN/Bangalore)
> Sent: Thursday, July 31, 2014 6:20 PM
> Subject: Hadoop and Hive Performance Tuning
>
> Hi
>
> I am running Hive queries on structured RC files.
>
> Can you please let me know the key performance parameters that I have to
> tune for better query performance (for Hadoop 2.3/YARN and Hive 0.13)?
>
> Thanks and Regards
> Prabakaran.N aka NP
> NSN, Bangalore
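
For reference, the suggestions in this thread (ORC with Zlib compression, Tez, and vectorization) could look roughly like the following in a Hive 0.13 session. This is only a sketch: the table names `events_rc` and `events_orc` and their columns are made up for illustration, and it assumes Tez is already installed and configured on the cluster.

```sql
-- Sketch only: table and column names below are hypothetical.

-- Run queries on Tez instead of MapReduce (assumes Tez is set up on the cluster).
SET hive.execution.engine=tez;

-- Process rows in batches; vectorized execution works on ORC-backed tables.
SET hive.vectorized.execution.enabled=true;

-- Create an ORC table compressed with Zlib to hold a copy of the RCFile data.
CREATE TABLE events_orc (
  event_id   BIGINT,
  event_time STRING,
  payload    STRING
)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB");

-- Rewrite the data from the existing RCFile table into the ORC table.
INSERT OVERWRITE TABLE events_orc
SELECT event_id, event_time, payload
FROM events_rc;
```

The two `SET` statements only affect the current session; whether they help depends on the queries and the data, so it is worth comparing run times before and after on a representative query.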
