Hi, I have a table with 15,000,000+ rows sitting on a 4-node Hadoop cluster with dfs.replication=4. Hive seems to be ignoring my settings for mapred.reduce.tasks & hive.exec.reducers.max. Below is a snippet of what I'm trying. What am I doing wrong?
hive> set mapred.reduce.tasks=17;
hive> set hive.exec.reducers.max=17;
hive> select count(1) from hits;
Total MapReduce jobs = 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>

Saurabh.
--
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
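P.S. I may be partly answering my own question here, but could it be that a bare count(1) with no GROUP BY always compiles to a single reducer, since the final total has to be merged in one place? A sketch of the contrast I have in mind (the `site` column is hypothetical, purely for illustration):

```sql
-- Global aggregate, no GROUP BY: planned with exactly one reducer
-- no matter what mapred.reduce.tasks is set to, because a single
-- task must combine the partial counts into the final total.
SELECT count(1) FROM hits;

-- With a GROUP BY the reduce phase can be split across keys, so
-- the reducer settings should be honored here (capped by
-- hive.exec.reducers.max).
SET mapred.reduce.tasks=17;
SELECT site, count(1) FROM hits GROUP BY site;
```

If that's right, then "Number of reduce tasks determined at compile time: 1" is expected for my query rather than a sign of a misconfiguration.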
