speculation should normalize progress rates based on amount of input data
-------------------------------------------------------------------------
Key: MAPREDUCE-2216
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2216
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Joydeep Sen Sarma
We frequently see skews in data distribution on both the map and reduce sides.
The tasks with small inputs finish quickly, and the longer-running tasks with
large inputs immediately get speculated. We should normalize the progress rates
used by speculation with some metric correlated to the amount of data processed
by the task (such as bytes read or rows processed). That would prevent these
unnecessary speculations.
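A minimal sketch of the idea, assuming per-task counters for bytes processed
are available (the method and class names here are hypothetical, not the actual
Hadoop speculator API): comparing tasks by raw progress-per-second penalizes a
task merely for having a larger split, whereas comparing bytes-per-second does
not.

```java
/**
 * Illustrative sketch of data-normalized speculation scoring.
 * Names are hypothetical; this is not the real Hadoop speculator code.
 */
public class NormalizedSpeculation {

    /** Raw progress fraction per second: penalizes tasks with bigger inputs. */
    static double rawRate(double progressFraction, double elapsedSecs) {
        return progressFraction / elapsedSecs;
    }

    /**
     * Data-normalized rate: bytes (or rows) processed per second, so a task
     * working through a large split is not flagged just because its progress
     * fraction advances slowly.
     */
    static double normalizedRate(long bytesProcessed, double elapsedSecs) {
        return bytesProcessed / elapsedSecs;
    }

    public static void main(String[] args) {
        // Skewed inputs: task A's split is 10x the size of task B's.
        // Both have run for 60 seconds and processed 100 MB of data.
        double rawA = rawRate(0.10, 60);   // 1 GB split, 10% done
        double rawB = rawRate(1.00, 60);   // 100 MB split, finished

        // By raw progress rate, A looks ~10x slower and would be speculated.
        System.out.println("raw rate ratio B/A: " + (rawB / rawA));

        double normA = normalizedRate(100_000_000L, 60); // 100 MB in 60 s
        double normB = normalizedRate(100_000_000L, 60); // 100 MB in 60 s

        // Normalized by bytes, both tasks process data at the same rate,
        // so neither triggers an unnecessary speculative attempt.
        System.out.println("normalized rates equal: " + (normA == normB));
    }
}
```

Under this scoring, a task is a speculation candidate only when its
bytes-per-second rate falls well below its peers', rather than whenever its
progress fraction lags behind tasks with smaller inputs.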