Re: aggregations over multiple columns?

Jeff Hammerbacher Mon, 06 Jul 2009 19:51:57 -0700

Avram: see https://issues.apache.org/jira/browse/HIVE-474.
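
In the meantime, one possible workaround for the restriction quoted below is to compute each DISTINCT aggregate in its own subquery and join the results on the grouping key. This is only a sketch: the table and column names come from the documentation example, the aliases u, i, distinct_users and distinct_ips are made up for illustration, the target table is assumed to have three matching columns, and it has not been checked against any particular Hive version.

```sql
-- Sketch only: one subquery per DISTINCT column, joined on the grouping key.
-- Aliases u, i, distinct_users and distinct_ips are illustrative names.
INSERT OVERWRITE TABLE pv_gender_agg
SELECT u.gender, u.distinct_users, i.distinct_ips
FROM (
  -- first DISTINCT aggregate, grouped by gender
  SELECT pv_users.gender AS gender,
         count(DISTINCT pv_users.userid) AS distinct_users
  FROM pv_users
  GROUP BY pv_users.gender
) u
JOIN (
  -- second DISTINCT aggregate, grouped by gender
  SELECT pv_users.gender AS gender,
         count(DISTINCT pv_users.ip) AS distinct_ips
  FROM pv_users
  GROUP BY pv_users.gender
) i
ON (u.gender = i.gender);
```

Each subquery contains a single DISTINCT column, so neither one hits the restriction. Note that an inner join on gender drops any group whose gender is NULL.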


On Mon, Jul 6, 2009 at 9:17 AM, Avram Aelony <[email protected]> wrote:

>
> The documentation appears to state the following:
>
> "Multiple aggregations can be done at the same time; however, no two
> aggregations can have different DISTINCT columns. E.g., the following is
> possible:"
>
>   INSERT OVERWRITE TABLE pv_gender_agg
>   SELECT pv_users.gender, count(DISTINCT pv_users.userid), count(1),
>          sum(DISTINCT pv_users.userid)
>   FROM pv_users
>   GROUP BY pv_users.gender;
>
> "However, the following query is not allowed. We don't allow multiple
> DISTINCT expressions in the same query."
>
>   INSERT OVERWRITE TABLE pv_gender_agg
>   SELECT pv_users.gender, count(DISTINCT pv_users.userid),
>          count(DISTINCT pv_users.ip)
>   FROM pv_users
>   GROUP BY pv_users.gender;
>
>
> http://wiki.apache.org/hadoop/Hive/LanguageManual/GroupBy
>
>
> Is there an effort underway to allow multiple DISTINCT expressions in the
> same query in the (near) future as well?
>
> Thanks & regards,
> Avram
>
>
>
>
> -----Original Message-----
> From: Amr Awadallah [mailto:[email protected]]
> Sent: Saturday, July 04, 2009 12:02 AM
> To: [email protected]
> Subject: Re: aggregations over multiple columns?
>
> Mike,
>
>  This is a valid query; GROUP BY over multiple columns works in Hive.
>
> -- amr
>
> Michael E. Driscoll wrote:
> > Hi HIVErs,
> >
> > I'm trying to perform the following aggregation query in HIVE, which
> > finds the largest purchase for all combinations of customer and store:
> >
> >   SELECT customer, store, max(purchasePrice)
> >   FROM transactions
> >   GROUP BY customer, store
> >
> > If aggregation over multiple columns is not currently supported, how
> > might I reformulate this to work in HIVE, possibly via a simpler
> > series of queries?
> >
> > (I will post the exact error and reproducible code if it turns out
> > this query is valid).
> >
> > regards,
> >
> > Mike
> >
> > b: www.dataspora.com/blog
> > t: www.twitter.com/dataspora
> >
>
