One more thing to try:
http://www.karmasphere.com/Karmasphere-Analyst/hive-queries-on-table-data.html#multi_group_by_inserts


Look for this text:
"hive.map.aggr controls how we do aggregations"
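
For reference, a rough sketch of what I have in mind, reusing the table and column names from your query. I haven't run this against your data, so treat it as a starting point rather than a confirmed fix. With map-side aggregation on, each mapper should emit one partial count, so the single reducer only has to add up a handful of numbers instead of receiving every matching row. You can check the current value by typing `set hive.map.aggr;` at the Hive prompt.

```sql
-- Enable map-side (partial) aggregation: each mapper pre-counts its own rows,
-- and the single reducer only sums one partial count per mapper.
set hive.map.aggr=true;

-- Same query as in the original message, unchanged apart from layout.
select count(1) as ct
from my_table
where v1 = '02' and v2 = 11112222;
```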

Let me know if this hint helps.

Cheers,
Ajo.

On Mon, Jan 24, 2011 at 2:01 PM, Jonathan Coveney <jcove...@gmail.com> wrote:

> Yes, I tried that, it looks like it forces it to 1 if there are no groups.
>
> 2011/1/24 Ajo Fod <ajo....@gmail.com>
>
>> Oh ... sorry, you say you already tried that.
>>
>>
>>
>> On Mon, Jan 24, 2011 at 1:54 PM, Ajo Fod <ajo....@gmail.com> wrote:
>> > You could try setting the number of reducers, e.g.:
>> > set mapred.reduce.tasks=4;
>> >
>> > Set this before doing the select.
>> >
>> > -Ajo
>> >
>> > On Mon, Jan 24, 2011 at 1:13 PM, Jonathan Coveney <jcove...@gmail.com> wrote:
>> >> I have a 10 node server or so, and have been mainly using Pig on it, but
>> >> would like to try out Hive.
>> >> I am running this query, which doesn't take too long in Pig, but is taking
>> >> quite a long time in Hive.
>> >>
>> >> hive -e "select count(1) as ct from my_table where v1='02' and v2 = 11112222;" > thecount
>> >>
>> >> One thing is that this job only uses 1 reducer, but it is taking most of its
>> >> time in its reduce step. I tried manually setting more reducers, but I think
>> >> that for a job without groups, it forces 1 reducer?
>> >> Either way, would love to know why this is dragging? It's worth noting that
>> >> my_table is not saved in the Hive format, but rather as a flat file. I
>> >> realize that this can influence performance, but shouldn't it at least
>> >> perform on par with Pig?
>> >> Thanks for your help
>> >> Jon
>> >
>>
>
>
