That is the way to do it. Some operations, such as ORDER BY, are forced
into a single reducer, so depending on the query you are running you may
not be able to control the number of reducers.
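
As a rough sketch of the difference (the column names and the use of the
employees table below are only assumed from the query in this thread,
not checked against the real schema):

```sql
-- Request 2 reducers for the queries that follow in this session.
SET mapred.reduce.tasks=2;

-- ORDER BY needs a total order over all rows, so Hive sends everything
-- through a single reducer regardless of the setting above.
SELECT name, store FROM employees ORDER BY name;

-- DISTRIBUTE BY ... SORT BY only sorts within each reducer, so the
-- requested reducer count can take effect here.
SELECT name, store FROM employees DISTRIBUTE BY store SORT BY name;
```

The second form gives up the global ordering, so it only helps when a
per-reducer sort is enough for what you need.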


On Mon, Apr 30, 2012 at 10:06 AM, Ryabin, Thomas
<[email protected]> wrote:
> I tried using this to set the number of reduce tasks to 2, but it doesn’t
> work for me. In my case the Hive query always creates 8 map tasks and 1
> reduce task. Could the number of reduce tasks be limited by the number of
> map tasks, so that if I wanted 2 reduce tasks I would need to increase the
> number of map tasks to 16 in my case?
>
>
>
> -Thomas
>
>
>
> From: Bejoy KS [mailto:[email protected]]
> Sent: Saturday, April 28, 2012 1:43 AM
> To: [email protected]
> Subject: Re: How to make the query compiler not determine the number of
> reducers?
>
>
>
> Hi Thomas
> Hive automatically sets the number of reducers for you, but you can easily
> override it at the CLI. Before executing your query, run:
> hive> SET mapred.reduce.tasks=n;
>
> where n is the required number of reducers.
>
> Regards
> Bejoy KS
>
> Sent from handheld, please excuse typos.
>
> ________________________________
>
> From: "Ryabin, Thomas" <[email protected]>
>
> Date: Fri, 27 Apr 2012 16:48:25 -0400
>
> To: <[email protected]>
>
> ReplyTo: [email protected]
>
> Subject: How to make the query compiler not determine the number of
> reducers?
>
>
>
> Hi,
>
>
>
> When I run a query that uses a custom UDF I made, one of the lines it prints
> out is:
>
> Number of reduce tasks determined at compile time: 1
>
>
>
> And this causes the MapReduce job to have only 1 reducer. Is there a way to
> make it so the compiler does not determine the number of reduce tasks to
> create, so I can specify the number myself?
>
>
>
> The query in question is:
>
> select test_udf(name, store) from employees join stores;
>
>
>
> Thanks,
>
> Thomas
