Thanks, Thejas, for your input! These suggestions are interesting and very specific, which is exactly what is required for a master's thesis.
Are there any publications on Hive and the evaluation of its performance that I can use for comparison?

Regards,
Sarfraz Rasheed Ramay (DIT)
Dublin, Ireland.

On Sat, May 3, 2014 at 3:07 AM, Thejas Nair <the...@hortonworks.com> wrote:

> The primary difference between Hive and Pig is the language. There are
> implementation differences that will result in performance
> differences, but it will be hard to figure out which aspect of the
> implementation is responsible for which improvement.
>
> I think a more interesting project would be to compare the impact of
> various performance improvements in Hive. There are many features that
> you can turn on and off.
>
> Examples:
> - Hive vectorization
> - file format: text vs RCFile vs ORC
> - compressed vs uncompressed
> - MapReduce vs Tez execution engine
> - stats-optimized queries
>
> On Thu, May 1, 2014 at 5:47 AM, Sarfraz Ramay <sarfraz.ra...@gmail.com>
> wrote:
>>
>> Hi,
>>
>> It seems that both Hive and Pig are used for managing large data sets.
>> Hive is more SQL-oriented, whereas Pig is more for data flows. I am doing
>> a master's thesis on the performance evaluation of both. Can someone please
>> provide a list of tasks that would make for an interesting comparison?
>>
>> What is Hive good at?
>>
>> What is Pig good at?
>>
>> Ideally, I would like to take what Hive is good at and test it in Pig and
>> vice versa. The competitive characteristics would make for an interesting
>> comparison.
>>
>> Regards,
>> Sarfraz Rasheed Ramay (DIT)
>> Dublin, Ireland.
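
For anyone picking up the list of Hive features suggested in the reply above, one way to organize the comparison is as a matrix of settings, with the same query run once per combination. The following is only a rough sketch in Python. The four property names are standard Hive configuration settings, but the value lists, the `lineitem_<format>` table naming, and the helper functions are assumptions made up for illustration. It only generates the HiveQL text and does not run anything.

```python
from itertools import product

# Session-level switches corresponding to the features in the list.
# Property names are standard Hive settings; the values chosen here
# are an assumption about what a thesis benchmark might vary.
SESSION_TOGGLES = {
    "hive.vectorized.execution.enabled": ["false", "true"],
    "hive.execution.engine": ["mr", "tez"],
    "hive.exec.compress.output": ["false", "true"],
    "hive.compute.query.using.stats": ["false", "true"],
}

# File format is chosen per table (STORED AS ...), not per session,
# so it is modelled as one pre-loaded copy of the table per format.
FILE_FORMATS = ["TEXTFILE", "RCFILE", "ORC"]


def benchmark_configs():
    """Yield (file_format, settings) for every combination of toggles."""
    names = list(SESSION_TOGGLES)
    for fmt in FILE_FORMATS:
        for values in product(*(SESSION_TOGGLES[n] for n in names)):
            yield fmt, dict(zip(names, values))


def render_script(fmt, settings, query_template):
    """Build the HiveQL text for one run: SET statements, then the query.

    The table name lineitem_<format> is hypothetical; it assumes one
    copy of the test table was created per file format beforehand.
    """
    lines = [f"SET {name}={value};" for name, value in settings.items()]
    lines.append(query_template.format(table=f"lineitem_{fmt.lower()}"))
    return "\n".join(lines)


if __name__ == "__main__":
    configs = list(benchmark_configs())
    print(len(configs), "configurations")
    fmt, settings = configs[0]
    print(render_script(fmt, settings, "SELECT COUNT(*) FROM {table};"))
```

Not every combination is necessarily meaningful. As far as I know, vectorized execution was initially tied to ORC tables, so the matrix should be filtered against the Hive version in use. Changing one setting at a time from a fixed baseline is also easier to interpret than the full cross product.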