Hi all,

After benchmarking Hive and Pig, I found that the Group By operator in Pig
is drastically slower than Hive's. I was wondering whether anybody has
experienced the same, and whether people have any tips for improving
the performance of this operation. (Adding a DISTINCT, as suggested in an
earlier post here, doesn't help. I am currently re-running the benchmark
with LZO compression enabled.)

Regards,
Ben
