Hi all,

Has anyone tried the hash aggregation feature in Pig 0.10 and seen any performance improvement? I've recently been benchmarking HashAgg against the combiner to see whether we should use HashAgg more aggressively, given that it has lower overhead than the combiner and is more flexible: it can auto-disable itself, whereas the combiner can't.
Some of my benchmark results can be found at https://cwiki.apache.org/confluence/display/PIG/Pig+Performance+Optimization#PigPerformanceOptimization-HashAggvs.Combiner

Any comments are appreciated!

Jie
