configure total number of mappers
---------------------------------
Key: HIVE-1199
URL: https://issues.apache.org/jira/browse/HIVE-1199
Project: Hadoop Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Namit Jain
Fix For: 0.6.0
For users, it can be very difficult to control the number of mappers. There
are many parameters, which confuse users. For example, CombineHiveInputFormat
requires a different set of parameters to control the number of mappers.
In general, users should have a way to specify the total number of mappers,
which should be obeyed. This will be very difficult
to guarantee, since the query might be reading from a large number of
partitions, where a mapper can only span one partition.
What if the number of mappers that the user wants is less than the total number
of partitions?
It would be a very useful heuristic to have. A simple use case that Joy had is
as follows:
A query needs to be run on one table that has a lot of small files. It would
be easier for him to specify the total number of mappers than the various
rack-local/node-local CombineFileInputFormat parameters.
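
As a rough illustration of why a requested mapper count can only be a hint,
here is a minimal sketch in Python. It is not Hive's implementation: the
function name plan_mappers and its arguments are hypothetical, and it assumes
that a single maximum split size is derived from the total input size and that
a mapper never spans partitions. Rack/node locality and block boundaries are
ignored.

```python
import math


def plan_mappers(partition_bytes, target_mappers):
    """Derive a split size from a desired mapper count (hypothetical helper).

    partition_bytes: input size in bytes of each partition read by the query.
    target_mappers:  total number of mappers the user asked for.
    Returns (max_split_bytes, estimated_mappers).
    """
    if target_mappers < 1:
        raise ValueError("target_mappers must be at least 1")

    # Empty partitions produce no splits, so they need no mapper.
    nonempty = [size for size in partition_bytes if size > 0]
    if not nonempty:
        return 0, 0

    # Split size that would hit the target if data could be divided freely.
    max_split_bytes = math.ceil(sum(nonempty) / target_mappers)

    # A mapper cannot span partitions, so each partition is rounded up
    # on its own; the estimate can therefore exceed the target.
    estimated = sum(math.ceil(size / max_split_bytes) for size in nonempty)
    return max_split_bytes, estimated
```

With four partitions of 100 bytes each and a target of 2 mappers, the sketch
returns a split size of 200 bytes but still estimates 4 mappers, one per
partition. This is the case raised above, where the requested number is less
than the number of partitions.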