For a query with no grouping, Hive always uses 1 reducer.
Since map-side aggregation reduces the number of rows that reach that reducer, this is not a problem.
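
To illustrate, here is a minimal sketch, not output from a real cluster. The table name hits is taken from the quoted message; the column name referrer is hypothetical and only stands in for some column to group on. hive.map.aggr is the property that controls map-side aggregation.

```sql
-- With map-side aggregation on, each mapper emits a single partial count,
-- so the lone reducer only has to sum one row per mapper.
set hive.map.aggr=true;
select count(1) from hits;

-- With a GROUP BY, rows are partitioned by key across reducers,
-- so the requested reducer count can take effect.
-- NOTE: "referrer" is a hypothetical column used only for illustration.
set mapred.reduce.tasks=17;
select referrer, count(1) from hits group by referrer;
```

Prefixing the first query with explain should show a group-by operator in the map stage, which is the partial aggregation that keeps the single reducer cheap.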



On 7/14/09 10:11 PM, "Saurabh Nanda" <[email protected]> wrote:

Hi,

I have a table with 15,000,000+ rows sitting on a 4-node hadoop cluster with 
dfs.replication=4. Hive seems to be ignoring my settings for mapred.reduce.tasks & 
hive.exec.reducers.max. Given below is a snippet of what I'm trying. What am I 
doing wrong?

hive> set mapred.reduce.tasks=17;
hive> set hive.exec.reducers.max=17;
hive> select count(1) from hits;
Total MapReduce jobs = 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>

Saurabh.
