Hi Qing,

We did think about using the combiner when we started Hive. However, earlier discussions led us to believe that hash-based aggregation inside the mapper would be competitive with using the combiner in most cases.
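To illustrate the idea, here is a minimal Python sketch of hash-based aggregation inside a mapper for a COUNT ... GROUP BY style query. It is not Hive's actual implementation; the function names and the `max_entries` flush threshold are hypothetical and exist only for this example.

```python
from collections import defaultdict


def map_with_hash_aggregation(rows, max_entries=100_000):
    """Hypothetical mapper: keep partial counts per key in an in-memory
    hash table and emit them only when the table fills up or input ends."""
    partial = defaultdict(int)
    for key in rows:
        partial[key] += 1
        # Flush when the table reaches the (assumed) size limit.
        if len(partial) >= max_entries:
            yield from partial.items()
            partial.clear()
    # Emit whatever is left at the end of the input split.
    yield from partial.items()


def reduce_counts(pairs):
    """Hypothetical reducer: merge partial counts from all mappers."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


rows = ["a", "b", "a", "c", "a", "b"]
emitted = list(map_with_hash_aggregation(rows))
totals = reduce_counts(emitted)
```

With a large enough table, the mapper emits one partial aggregate per distinct key instead of one record per input row, which is the saving a combiner would otherwise provide. When the table is flushed early, the same key can be emitted more than once, and that is the case where a combiner could still merge the partial aggregates further.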
To enable map-side aggregation, we just need to run the following before running the Hive query:

set hive.map.aggr=true;

Zheng

On Thu, Feb 26, 2009 at 6:03 AM, Raghu Murthy <[email protected]> wrote:
> Right now Hive does not exploit the combiner. But hash-based map-side
> aggregation in Hive (controlled by hints) provides a similar optimization.
> Using the combiner in addition to map-side aggregation should improve
> performance even more if the combiner can further aggregate the partial
> aggregates generated by the mapper.
>
> On 2/26/09 5:57 AM, "Qing Yan" <[email protected]> wrote:
>
> > Is there any way/plan for Hive to take advantage of M/R's combine()
> > phase? There could be either rules embedded in the query optimizer or
> > hints passed by the user...
> > GROUP BY should benefit from this a lot.
> >
> > Any comments?

--
Yours,
Zheng
