[ 
https://issues.apache.org/jira/browse/TEZ-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280801#comment-16280801
 ] 

Eric Wohlstadter commented on TEZ-3872:
---------------------------------------

Just to add to the description:

It looks like allocateTask does not set the host in some cases (the 
affinitized case). As a result, the fallback from container locality to node 
locality finally misses here:

https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1475
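
To make the miss concrete, here is a minimal Java sketch of the fallback order 
as I understand it. It is not the Tez code: LocalityFallbackSketch, 
HeldContainer, AffinityRequest and pick are hypothetical names made up for 
illustration, and the real scheduler tracks far more state. The only point is 
that a request carrying a container ID but no host has nothing left to match 
once that container has been released.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: none of these types or methods are part of Tez.
final class LocalityFallbackSketch {

    /** An idle container the scheduler is currently holding. */
    static final class HeldContainer {
        final String containerId;
        final String host;
        final String rack;

        HeldContainer(String containerId, String host, String rack) {
            this.containerId = containerId;
            this.host = host;
            this.rack = rack;
        }
    }

    /** A task request; host and rack are null when the caller never set them. */
    static final class AffinityRequest {
        final String containerId;
        final String host;
        final String rack;

        AffinityRequest(String containerId, String host, String rack) {
            this.containerId = containerId;
            this.host = host;
            this.rack = rack;
        }
    }

    /** Tries container locality first, then node locality, then rack locality. */
    static Optional<HeldContainer> pick(AffinityRequest req, List<HeldContainer> held) {
        // 1. Container locality: only matches while the container is still held.
        for (HeldContainer c : held) {
            if (c.containerId.equals(req.containerId)) {
                return Optional.of(c);
            }
        }
        // 2. Node locality: skipped entirely when the request has no host.
        if (req.host != null) {
            for (HeldContainer c : held) {
                if (req.host.equals(c.host)) {
                    return Optional.of(c);
                }
            }
        }
        // 3. Rack locality: same problem when the rack was never recorded.
        if (req.rack != null) {
            for (HeldContainer c : held) {
                if (req.rack.equals(c.rack)) {
                    return Optional.of(c);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Container c1 has been released; only c2 on the same host is still held.
        List<HeldContainer> held = List.of(new HeldContainer("c2", "host-a", "rack-1"));

        AffinityRequest withoutHost = new AffinityRequest("c1", null, null);
        AffinityRequest withHost = new AffinityRequest("c1", "host-a", "rack-1");

        System.out.println(pick(withoutHost, held).isPresent()); // false
        System.out.println(
            pick(withHost, held).map(c -> c.containerId).orElse("none")); // c2
    }
}
```

With the host filled in, the same request still lands on host-a through c2, 
which is the behaviour the container, host, rack ordering in the description 
asks for.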



> OneToOne Edge: Scheduling misses due to released containers
> -----------------------------------------------------------
>
>                 Key: TEZ-3872
>                 URL: https://issues.apache.org/jira/browse/TEZ-3872
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Gopal V
>         Attachments: tpcds_q69_1000_after.txt, tpcds_q69_1000_before.txt
>
>
> https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/TaskSchedulerManager.java#L477
> That's where it decides between using the container or nodes/racks. It does 
> not record the hosts/racks for the container, so container affinity ignores 
> node affinity fallbacks.
> https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L986
> Inside the YARN task scheduling impl, this only picks up the host if the 
> container is being held at the moment, not if it has been released. It also 
> has no checks for in-use containers.
> TaskSchedulerManager can grab ta.containerNodeId directly off the attempt 
> information to get the host info as well as the container info.
> This needs a new allocateTask API which has container, host, rack in the 
> order of scheduling preference.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
