Can you do an explain <query> and send us the plan. Another thing that you may try is to put the entire subquery in the from clause and then do an aggregation on it..
i.e. select ..., count(1) from (select ... from clickstream_output o join .....) t group by t.description, t.num limit 40; Also are you using the 0.3.0 release candidate? Ashish ________________________________________ From: Matt Pestritto [m...@pestritto.com] Sent: Wednesday, April 22, 2009 3:53 PM To: hive-user@hadoop.apache.org Subject: Re: Aggregrate Query Fails. So the only change was to remove the column alias correct? Still no luck. Same result. On Wed, Apr 22, 2009 at 6:49 PM, Prasad Chakka <pcha...@facebook.com<mailto:pcha...@facebook.com>> wrote: That is strange... Does below also fail? select m.description, o_buyers.num , count(1) as total from clickstream_output o join merchandise m on (o.merchandise_id = m.merchandise_id) left outer join ( select o1.merchandise_id, count(distinct o1.consumer_id) as num from clickstream_output o1 where o1.file_date >= '20090216' and o1.file_date <= '20090217' and o1.event_id = 'buy' group by o1.merchandise_id ) o_buyers on (o_buyers.merchandise_id = o.merchandise_id) where o.file_date >= '20090216' and o.file_date <= '20090217' group by m.description, o_buyers.num limit 40 ; ________________________________ From: Matt Pestritto <m...@pestritto.com<http://m...@pestritto.com>> Reply-To: <hive-user@hadoop.apache.org<http://hive-user@hadoop.apache.org>> Date: Wed, 22 Apr 2009 15:43:40 -0700 To: <hive-user@hadoop.apache.org<http://hive-user@hadoop.apache.org>> Subject: Re: Aggregrate Query Fails. Thanks. I tried that also earlier: FAILED: Error in semantic analysis: line 9:24 Invalid Column Reference buyers On Wed, Apr 22, 2009 at 6:40 PM, Amr Awadallah <a...@cloudera.com<http://a...@cloudera.com>> wrote: in the group by, try this instead: group by m.description, buyers limit 40 ; Matt Pestritto wrote: Hi - I'm having a problem with a query below. When I try to run any aggregate function on a column from the sub-query, the job fails. The queries and output messages are below. Suggestions? thanks in advance. -- works: 2 map-reduces jobs. select m.description, o_buyers.num as buyers from clickstream_output o join merchandise m on (o.merchandise_id = m.merchandise_id) left outer join ( select o1.merchandise_id, count(distinct o1.consumer_id) as num from clickstream_output o1 where o1.file_date >= '20090216' and o1.file_date <= '20090217' and o1.event_id = 'buy' group by o1.merchandise_id ) o_buyers on (o_buyers.merchandise_id = o.merchandise_id) where o.file_date >= '20090216' and o.file_date <= '20090217' limit 40; Successful output: PROD7362, 3 -- fails: 3 map-reduce jobs - 2nd reduce fails. select m.description, o_buyers.num as buyers, count(1) as total -- sum or max(o_buyers.num) and removing from group by also fails. from clickstream_output o join merchandise m on (o.merchandise_id = m.merchandise_id) left outer join ( select o1.merchandise_id, count(distinct o1.consumer_id) as num from clickstream_output o1 where o1.file_date >= '20090216' and o1.file_date <= '20090217' and o1.event_id = 'buy' group by o1.merchandise_id ) o_buyers on (o_buyers.merchandise_id = o.merchandise_id) where o.file_date >= '20090216' and o.file_date <= '20090217' group by m.description, o_buyers.num limit 40 ; java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:172) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391) at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2122) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:489) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:353) at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:69) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:353) at org.apache.hadoop.hive.ql.exec.JoinOperator.createForwardJoinObject(JoinOperator.java:273) at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:516) at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:505) at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:505) at org.apache.hadoop.hive.ql.exec.JoinOperator.genObject(JoinOperator.java:505) at org.apache.hadoop.hive.ql.exec.JoinOperator.checkAndGenObject(JoinOperator.java:546) at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:528) at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:143)