Avram: see https://issues.apache.org/jira/browse/HIVE-474.
On Mon, Jul 6, 2009 at 9:17 AM, Avram Aelony <[email protected]>wrote: > > The documentation appears to state the following: > > "Multiple aggregations can be done at the same time, however, no two > aggregations can have different DISTINCT columns .e.g while the following is > possible " > > " INSERT OVERWRITE TABLE pv_gender_agg > SELECT pv_users.gender, count(DISTINCT pv_users.userid), count(1), > sum(DISTINCT pv_users.userid) > FROM pv_users > GROUP BY pv_users.gender; > However, the following query is not allowed. We don't allow multiple > DISTINCT expressions in the same query. > > INSERT OVERWRITE TABLE pv_gender_agg > SELECT pv_users.gender, count(DISTINCT pv_users.userid), count(DISTINCT > pv_users.ip) > FROM pv_users > GROUP BY pv_users.gender;" > > > http://wiki.apache.org/hadoop/Hive/LanguageManual/GroupBy > > > Is there an effort underway to allow multiple DISTINCT expressions in the > same query in the (near) future as well? > > Thanks & regards, > Avram > > > > > -----Original Message----- > From: Amr Awadallah [mailto:[email protected]] > Sent: Saturday, July 04, 2009 12:02 AM > To: [email protected] > Subject: Re: aggregations over multiple columns? > > Mike, > > This is a valid query, group by over multiple columns works in hive. > > -- amr > > Michael E. Driscoll wrote: > > Hi HIVErs, > > > > I'm trying to perform the following aggregation query in HIVE, which > > finds the largest purchase for all combinations of customer and store: > > > > SELECT customer, store, max(purchasePrice) > > FROM transactions > > GROUP BY customer, store > > > > If aggregation over multiple columns is not currently supported, how > > might I reformulate this to work in HIVE, possibly via a simpler > > series of queries? > > > > (I will post the exact error and reproducible code if it turns out > > this query is valid). > > > > regards, > > > > Mike > > > > b: www.dataspora.com/blog > > t: www.twitter.com/dataspora > > >
