Thanks Bejoy.

Regards
Abhi
Sent from my iPhone

On Sep 26, 2012, at 1:42 PM, Bejoy KS <[email protected]> wrote:

> Hi Abhishek
>
> From the MapReduce logs you can see whether the data processed by one
> reducer is much more than that of the other reducers. Or, in short, whether
> one reducer takes much longer to complete than the others.
>
> Also, adding to my previous mail, one more optimization is possible for
> GROUP BY if your table is bucketed, or sorted and bucketed. It applies when
> the GROUP BY columns are the same as the bucketed columns, or when the
> GROUP BY columns are a subset of the sorted bucketed columns. It is enabled
> using 'hive.optimize.groupby', which is true by default.
>
> Regards,
> Bejoy KS
>
> From: Abhishek <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Sent: Wednesday, September 26, 2012 10:59 PM
> Subject: Re: How to optimize a group by query
>
> Hi Bejoy,
>
> Thanks for the reply. How can I tell whether there is data skew among the
> reducers?
>
> Regards
> Abhi
>
> Sent from my iPhone
>
> On Sep 26, 2012, at 1:20 PM, Bejoy KS <[email protected]> wrote:
>
>> Hi Abhishek
>>
>> GROUP BY performance can be improved by the following:
>>
>> 1) Enabling map-side aggregation. In recent versions it is enabled by
>> default.
>> SET hive.map.aggr = true;
>>
>> 2) Is there data skew in some of the reducers? If so, better performance
>> can be obtained by setting the following property:
>> SET hive.groupby.skewindata=true;
>>
>> Regards,
>> Bejoy KS
>>
>> From: Abhishek <[email protected]>
>> To: Hive <[email protected]>
>> Sent: Wednesday, September 26, 2012 10:31 PM
>> Subject: How to optimize a group by query
>>
>> Hi all,
>>
>> I have written a query with a GROUP BY clause and it is taking a lot of
>> time. Is there any way to optimize it, for example with a configuration
>> property?
>>
>> Regards
>> Abhi
>>
>> Sent from my iPhone
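
For reference, the settings discussed in this thread can be collected into one HiveQL session, as in the rough sketch below. The table name `events` and the column `user_id` are hypothetical placeholders, not names from the thread. The first query is only one possible way to look for skewed keys and is an assumption on top of the advice above, which was to compare reducer run times and input sizes in the MapReduce logs.

```sql
-- Hypothetical table and column names; substitute your own.
-- Optional skew check: list the GROUP BY keys with the most rows.
-- A few keys with far larger counts than the rest suggest skew.
SELECT user_id, COUNT(*) AS cnt
FROM events
GROUP BY user_id
ORDER BY cnt DESC
LIMIT 20;

-- Settings mentioned in the thread.
SET hive.map.aggr = true;            -- map-side aggregation
SET hive.groupby.skewindata = true;  -- only worth trying if skew is observed
SET hive.optimize.groupby = true;    -- bucketed GROUP BY optimization (default: true)

-- The GROUP BY query being tuned (illustrative only).
SELECT user_id, COUNT(*) AS cnt
FROM events
GROUP BY user_id;
```

Whether any of these settings helps depends on the data and the Hive version, so it is worth comparing job times with and without each one.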
