Using ORC is a very good idea. Also, since you are using Hive 0.13, you
might want to try Tez as the execution engine and enable vectorization. The
following link is a good guide for that:
http://hortonworks.com/hadoop-tutorial/supercharging-interactive-queries-hive-tez/
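
As a rough sketch of both suggestions in HiveQL (the table names events_rc
and events_orc are hypothetical placeholders, not taken from this thread,
and the SET lines assume Tez is already installed on the cluster):

```sql
-- Copy an existing RCFile table into a new ORC table with ZLIB compression.
-- events_rc and events_orc are hypothetical names; substitute your own.
CREATE TABLE events_orc
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB")
AS SELECT * FROM events_rc;

-- Per-session settings: run queries on Tez and enable vectorized execution.
-- In Hive 0.13, vectorization only applies to ORC-backed tables.
SET hive.execution.engine=tez;
SET hive.vectorized.execution.enabled=true;
```

The settings can be tried per session first and compared against the
default MapReduce engine before changing anything in hive-site.xml.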
On Jul 31, 2014 10:26 AM, "Hussain Jamali" <[email protected]>
wrote:

> I think the first thing you should consider is changing the file format
> from RC to ORC.
>
> With newer Hadoop versions, ORC is a much better optimized format than
> RC. It will give you far better compression and performance than RC
> files.
>
>
>
> Go for ZLIB compression with the ORC file format.
>
>
>
> *Regards,*
>
>
>
> *Hussain Jamali*
>
>  *|** T: +91.20.4135-1138 |  M: +91.89.56119707 |  E:
> [email protected] <[email protected]> *
>
> *AMDOCS |* *EMBRACE CHALLENGE* *EXPERIENCE SUCCESS*
>
>
>
> *From:* Natarajan, Prabakaran 1. (NSN - IN/Bangalore) [mailto:
> [email protected]]
> *Sent:* Thursday, July 31, 2014 6:20 PM
> *To:* [email protected]; [email protected]
> *Subject:* Hadoop and Hive Performance Tuning
>
>
>
> Hi
>
>
>
> I am running Hive queries on structured RC files.
>
>
>
> Can you please let me know the key performance parameters that I have to
> tune for better query performance (for Hadoop 2.3/YARN and Hive 0.13)?
>
>
>
> *Thanks and Regards*
>
> Prabakaran.N  aka NP
>
> nsn, Bangalore
>
> *When "I" is replaced by "We" - even Illness becomes "Wellness"*
