Can you do an explain <query> and send us the plan. 

Another thing that you may try is to put the entire subquery in the from clause 
and then do an aggregation on it..

i.e.

select ..., count(1)
from (select ...
         from clickstream_output o join .....) t
group by t.description, t.num limit 40;

Also are you using the 0.3.0 release candidate?

Ashish

________________________________________
From: Matt Pestritto [m...@pestritto.com]
Sent: Wednesday, April 22, 2009 3:53 PM
To: hive-user@hadoop.apache.org
Subject: Re: Aggregrate Query Fails.

So the only change was to remove the column alias correct?
Still no luck.  Same result.

On Wed, Apr 22, 2009 at 6:49 PM, Prasad Chakka 
<pcha...@facebook.com<mailto:pcha...@facebook.com>> wrote:
That is strange... Does below also fail?

select  m.description, o_buyers.num , count(1) as total
from clickstream_output o
  join merchandise m on (o.merchandise_id = m.merchandise_id)
  left outer join ( select o1.merchandise_id, count(distinct o1.consumer_id) as 
num from clickstream_output o1
         where o1.file_date >= '20090216' and o1.file_date <= '20090217' and 
o1.event_id = 'buy' group by o1.merchandise_id ) o_buyers
     on (o_buyers.merchandise_id = o.merchandise_id)
where o.file_date >= '20090216' and o.file_date <= '20090217'
group by m.description, o_buyers.num limit 40 ;


________________________________
From: Matt Pestritto <m...@pestritto.com<http://m...@pestritto.com>>
Reply-To: <hive-user@hadoop.apache.org<http://hive-user@hadoop.apache.org>>
Date: Wed, 22 Apr 2009 15:43:40 -0700
To: <hive-user@hadoop.apache.org<http://hive-user@hadoop.apache.org>>
Subject: Re: Aggregrate Query Fails.


Thanks.
I tried that also earlier:
FAILED: Error in semantic analysis: line 9:24 Invalid Column Reference buyers

On Wed, Apr 22, 2009 at 6:40 PM, Amr Awadallah 
<a...@cloudera.com<http://a...@cloudera.com>> wrote:


in the group by, try this instead:

group by m.description, buyers limit 40 ;

Matt Pestritto wrote:
Hi - I'm having a problem with a query below.  When I try to run any aggregate 
function on a column from the sub-query, the job fails.
The queries and output messages are below.

Suggestions?

thanks in advance.

-- works:  2 map-reduces jobs.
select m.description, o_buyers.num as buyers
from clickstream_output o
  join merchandise m on (o.merchandise_id = m.merchandise_id)
  left outer join ( select o1.merchandise_id, count(distinct o1.consumer_id) as 
num from clickstream_output o1
         where o1.file_date >= '20090216' and o1.file_date <= '20090217' and 
o1.event_id = 'buy' group by o1.merchandise_id ) o_buyers
     on (o_buyers.merchandise_id = o.merchandise_id)
where o.file_date >= '20090216' and o.file_date <= '20090217' limit 40;

Successful output: PROD7362, 3

-- fails: 3 map-reduce jobs - 2nd reduce fails.
select  m.description, o_buyers.num as buyers, count(1) as total  -- sum or 
max(o_buyers.num) and removing from group by also fails.
from clickstream_output o
  join merchandise m on (o.merchandise_id = m.merchandise_id)
  left outer join ( select o1.merchandise_id, count(distinct o1.consumer_id) as 
num from clickstream_output o1
         where o1.file_date >= '20090216' and o1.file_date <= '20090217' and 
o1.event_id = 'buy' group by o1.merchandise_id ) o_buyers
     on (o_buyers.merchandise_id = o.merchandise_id)
where o.file_date >= '20090216' and o.file_date <= '20090217'
 group by m.description, o_buyers.num limit 40 ;



java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast 
to org.apache.hadoop.io.Text

 at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:172)
 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391)
 at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2122)

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast 
to org.apache.hadoop.io.Text
 at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:489)

 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:353)
 at 
org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:69)
 at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:353)

 at 
org.apache.hadoop.hive.ql.exec.JoinOperator.createForwardJoinObject(JoinOperator.java:273)
 at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:516)
 at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:505)

 at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:505)
 at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:505)
 at 
org.apache.hadoop.hive.ql.exec.JoinOperator.checkAndGenObject(JoinOperator.java:546)

 at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:528)
 at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:143)





Reply via email to