One more thing to try: http://www.karmasphere.com/Karmasphere-Analyst/hive-queries-on-table-data.html#multi_group_by_inserts
Look for this text:
"*hive.map.aggr* controls how we do aggregations"

Let me know if this hint helps.

Cheers,
Ajo.

On Mon, Jan 24, 2011 at 2:01 PM, Jonathan Coveney <jcove...@gmail.com> wrote:

> Yes, I tried that, it looks like it forces it to 1 if there are no groups.
>
> 2011/1/24 Ajo Fod <ajo....@gmail.com>
>
>> oh ... sorry you say you already tried that.
>>
>> On Mon, Jan 24, 2011 at 1:54 PM, Ajo Fod <ajo....@gmail.com> wrote:
>> > you could try to set the number of reducers e.g:
>> > set mapred.reduce.tasks=4;
>> >
>> > set this before doing the select.
>> >
>> > -Ajo
>> >
>> > On Mon, Jan 24, 2011 at 1:13 PM, Jonathan Coveney <jcove...@gmail.com> wrote:
>> >> I have a 10 node server or so, and have been mainly using pig on it, but
>> >> would like to try out Hive.
>> >> I am running this query, which doesn't take too long in Pig, but is taking
>> >> quite a long time in Hive.
>> >>
>> >> hive -e "select count(1) as ct from my_table where v1='02' and v2 = 11112222;" > thecount
>> >>
>> >> One thing is that this job only uses 1 reducer, but it is taking most of its
>> >> time in its reduce step. I tried manually setting more reducers, but I think
>> >> that for a job without groups, it forces 1 reducer?
>> >> Either way, would love to know why this is dragging? It's worth noting that
>> >> my_table is not saved in the Hive format, but rather as a flat file. I
>> >> realize that this can influence performance, but shouldn't it at least
>> >> perform on par with pig?
>> >> Thanks for your help
>> >> Jon
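
For reference, a minimal sketch of how the two settings mentioned in this thread could be applied before the query. The property names hive.map.aggr and mapred.reduce.tasks and the query itself come from the messages in this thread; the value true for hive.map.aggr is an assumption and was not confirmed by anyone here. Since Hive seems to force a single reducer for an aggregate with no groups, the map-side aggregation setting is probably the one more likely to matter.

```sql
-- Assumption: turn on map-side partial aggregation, so each mapper emits
-- a partial count instead of sending every matching row to the reducer.
set hive.map.aggr=true;

-- From the thread: request more reducers. Hive appears to fall back to 1
-- for an aggregate without groups, so this may have no effect here.
set mapred.reduce.tasks=4;

-- The original query, unchanged.
select count(1) as ct from my_table where v1='02' and v2 = 11112222;
```

Both set statements only affect the current session, so they need to go in the same hive -e string (or script) as the select.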