Avoid priority inversion that could result due to scheduling running jobs in an 
order sorted by priority
--------------------------------------------------------------------------------------------------------

                 Key: HADOOP-4557
                 URL: https://issues.apache.org/jira/browse/HADOOP-4557
             Project: Hadoop Core
          Issue Type: Improvement
          Components: mapred
            Reporter: Hemanth Yamijala


- Consider a job, J1, with priority NORMAL that is running reduce tasks 
occupying all reduce slots and has running and pending map tasks. 
- At this point, suppose a job, J2, is submitted with priority HIGH, or an 
already-submitted J2 has its priority raised from NORMAL to HIGH.
- Schedulers will typically start scheduling tasks from J2 in the map slots 
freed as J1's running maps complete. The default scheduler in Hadoop does 
this, and with HADOOP-4471, so will the capacity scheduler.
- However, as there are still pending maps in J1, the reduce tasks of J1 
cannot finish (a reduce needs the output of all of its job's maps). They keep 
holding every reduce slot, so no reduce tasks of J2 can run. 
- So, all map tasks of J2 will complete, followed by all map tasks of J1. 
Only then will J1's reduce tasks finish and free reduce slots, allowing J2's 
reduces to run and J2 to complete. 
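
The sequence above can be reproduced with a toy simulation. This is only a 
sketch under stated assumptions, not Hadoop code: the Job class, simulate() 
and run() are hypothetical names, the cluster is assumed to have 2 map slots 
and 2 reduce slots, every task is assumed to take one tick, and a reduce is 
assumed to finish only once all maps of its job had completed before the tick 
began. The second ordering (J1 considered first) is included purely for 
comparison and is not a proposed fix.

```java
import java.util.Arrays;
import java.util.List;

public class PriorityInversionSketch {

    /** Hypothetical, simplified job model; not Hadoop's own job classes. */
    static final class Job {
        int pendingMaps;
        int pendingReduces;
        int runningReduces;   // reduces currently holding a reduce slot
        int finishTick = -1;  // tick at which the job completed

        Job(int pendingMaps, int pendingReduces, int runningReduces) {
            this.pendingMaps = pendingMaps;
            this.pendingReduces = pendingReduces;
            this.runningReduces = runningReduces;
        }

        boolean done() {
            return pendingMaps == 0 && pendingReduces == 0 && runningReduces == 0;
        }
    }

    /** Runs the toy cluster; 'order' is the order in which jobs are offered slots. */
    static void simulate(List<Job> order, int mapSlots, int reduceSlots) {
        for (int tick = 1; tick <= 1000; tick++) {
            // Record which jobs had finished all maps before this tick began,
            // and count the reduce slots that are still free.
            boolean[] mapsDone = new boolean[order.size()];
            int freeReduce = reduceSlots;
            for (int i = 0; i < order.size(); i++) {
                Job j = order.get(i);
                mapsDone[i] = j.pendingMaps == 0;
                freeReduce -= j.runningReduces;
            }
            // Offer free reduce and map slots to jobs in scheduling order.
            int freeMap = mapSlots;
            for (Job j : order) {
                int r = Math.min(freeReduce, j.pendingReduces);
                j.pendingReduces -= r;
                j.runningReduces += r;
                freeReduce -= r;
                int m = Math.min(freeMap, j.pendingMaps);
                j.pendingMaps -= m;   // maps started now finish within this tick
                freeMap -= m;
            }
            // Running reduces complete only if their job's maps were already done.
            boolean allDone = true;
            for (int i = 0; i < order.size(); i++) {
                Job j = order.get(i);
                if (mapsDone[i]) {
                    j.runningReduces = 0;
                }
                if (j.done() && j.finishTick < 0) {
                    j.finishTick = tick;
                }
                allDone &= j.done();
            }
            if (allDone) {
                return;
            }
        }
    }

    /** Returns {J1 finish tick, J2 finish tick} for the scenario in this issue. */
    public static int[] run(boolean highPriorityFirst) {
        Job j1 = new Job(4, 0, 2); // NORMAL: 4 pending maps, holds both reduce slots
        Job j2 = new Job(4, 2, 0); // HIGH: 4 pending maps, 2 pending reduces
        List<Job> order = highPriorityFirst
                ? Arrays.asList(j2, j1)
                : Arrays.asList(j1, j2);
        simulate(order, 2, 2);
        return new int[] { j1.finishTick, j2.finishTick };
    }

    public static void main(String[] args) {
        System.out.println("J2 first (priority order): " + Arrays.toString(run(true)));
        System.out.println("J1 first (comparison):     " + Arrays.toString(run(false)));
    }
}
```

With these assumed numbers, considering J2 first makes both jobs finish later 
than considering J1 first, and the HIGH priority J2 still finishes after the 
NORMAL priority J1. The exact ticks depend entirely on the toy model.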

This could result in jobs completing slowly. Also, if enough higher-priority 
jobs keep arriving, low-priority jobs could be starved. At the same time, 
more and more resources (such as intermediate disk space) will get consumed 
without jobs completing.

This jira is to discuss and implement a solution for the above problem.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
