configure total number of mappers
---------------------------------

                 Key: HIVE-1199
                 URL: https://issues.apache.org/jira/browse/HIVE-1199
             Project: Hadoop Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Namit Jain
             Fix For: 0.6.0


For users, it can be very difficult to control the number of mappers. There 
are many parameters, which confuses users, and 
CombineHiveInputFormat requires a different set of parameters to 
control the number of mappers.

In general, users should have a way to specify the total number of mappers, 
and that number should be obeyed. This will be very difficult
to guarantee, since the query might be reading from a large number of 
partitions, and a mapper can only span one partition.
What should happen if the number of mappers that the user wants is less than 
the total number of partitions?
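
To make the partition constraint concrete, here is a minimal sketch of how a
requested mapper count might be spread over partitions. It is not Hive code:
the function name allocate_mappers and the proportional-by-size policy are
assumptions made for illustration. It only shows that when a mapper cannot
span partitions, the achievable count can never drop below the number of
non-empty partitions.

```python
def allocate_mappers(partition_sizes, requested_mappers):
    """Hypothetical helper: assign a mapper count to each partition.

    partition_sizes: bytes per partition (assumed non-empty partitions).
    requested_mappers: the total the user asked for.
    Returns a list of per-partition mapper counts.
    """
    if requested_mappers < 1:
        raise ValueError("requested_mappers must be at least 1")
    n = len(partition_sizes)
    if n == 0:
        return []

    # A mapper can only span one partition, so each partition gets one.
    alloc = [1] * n
    extra = max(requested_mappers - n, 0)
    total = sum(partition_sizes)
    if extra == 0 or total == 0:
        return alloc

    # Share the remaining mappers in proportion to partition size,
    # using integer arithmetic and the largest-remainder rule.
    floors = [extra * size // total for size in partition_sizes]
    remainders = [extra * size % total for size in partition_sizes]
    alloc = [a + f for a, f in zip(alloc, floors)]
    leftover = extra - sum(floors)
    order = sorted(range(n), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc
```

With three partitions and a request for two mappers, the sketch returns one
mapper per partition, so the request is exceeded rather than obeyed. That is
exactly the case the question above raises.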

It would be a very useful heuristic to have. A simple use case that Joy had 
is as follows:

A query needs to be run on one table which has a lot of small files. It would 
be easier for him to specify the total number of mappers
rather than the various rack-local/node-local CombineFileInputFormat parameters.
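
As a rough sketch of this use case, a desired mapper count could be turned
into a target split size, which is the kind of value the existing split-size
parameters (for example mapred.max.split.size) expect. The helper names
target_split_size and pack_files are hypothetical, and the greedy packing
below ignores node and rack locality, so it only approximates what
CombineFileInputFormat would actually do.

```python
def target_split_size(file_sizes, desired_mappers):
    """Hypothetical helper: bytes per split needed to hit desired_mappers."""
    if desired_mappers < 1:
        raise ValueError("desired_mappers must be at least 1")
    total = sum(file_sizes)
    # Ceiling division, so the splits together cover all the input bytes.
    return max(1, -(-total // desired_mappers))


def pack_files(file_sizes, split_size):
    """Greedily combine small files into splits of at most split_size bytes.

    Locality is ignored (an assumption of this sketch). A single file
    larger than split_size still gets a split of its own.
    """
    splits = []
    current = []
    current_bytes = 0
    for size in file_sizes:
        if current and current_bytes + size > split_size:
            splits.append(current)
            current = []
            current_bytes = 0
        current.append(size)
        current_bytes += size
    if current:
        splits.append(current)
    return splits
```

For 100 files of 10 bytes each and 4 desired mappers, the target split size
is 250 bytes and the packing yields 4 splits. With uneven file sizes the
greedy pass can produce more splits than requested, which is one reason the
count is hard to guarantee.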
