I couldn't decide that whether it is an HBase question or Hadoop/Yarn. In the utility class for MR jobs integerated with HBase, *org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil, *
in the method: *public static void initTableReducerJob(String table,* * Class<? extends TableReducer> reducer, Job job,* * Class partitioner, String quorumAddress, String serverClass,* * String serverImpl, boolean addDependencyJars) throws IOException;* Im the above method the following check is added, while setting the number of reducers that: *...* *int regions = outputTable.getRegionsInfo().size();* *...* *if (job.getNumReduceTasks() > regions) {* * job.setNumReduceTasks(outputTable.getRegionsInfo().size());* * }* *... * What is the reason for doing this? And what are the negative effects we don't follow this? I can think one that, in case of more than one reducer writing/reading a same region can cause hot-spotting and performance issues. Are there any other reasons to add this check as well? Thanks a lot. Regards, Shahab