[
https://issues.apache.org/jira/browse/TEZ-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279203#comment-16279203
]
Eric Wohlstadter commented on TEZ-3872:
---------------------------------------
[~gopalv]
Is there a way to get Hive 3 to generate more ONE_TO_ONE edges? I found 3
queries which use them. Or is their usage mostly ripped out of the code? If so,
was there a previous version of Hive which generated more ONE_TO_ONE edges?
Looking for more examples to test against.
> OneToOne Edge: Scheduling misses due to released containers
> -----------------------------------------------------------
>
> Key: TEZ-3872
> URL: https://issues.apache.org/jira/browse/TEZ-3872
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Gopal V
>
> https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/TaskSchedulerManager.java#L477
> That's where it decides between using the container or the node/racks. It does
> not record the hosts/racks for the container, so container affinity ignores
> the node-affinity fallbacks.
> https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L986
> Inside the YARN task scheduling impl, this only picks up the host if the
> container is being held at the moment, not if it has been released. It
> also has no checks for in-use containers.
> TaskSchedulerManager can grab ta.containerNodeId directly off the attempt
> information to get the host info as well as the container info.
> This needs a new allocateTask API that takes container, host, and rack, in
> the order of scheduling preference.
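
The fallback order described above (container, then host, then rack) can be sketched as follows. This is a minimal illustration and not the Tez API: AffinitySketch, Container, Request, and allocate are hypothetical names, and the list of held containers stands in for whatever state the real scheduler tracks. A released container simply does not appear in the held list, so the request falls through to the host and rack recorded with it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch only; none of these types exist in Tez.
class AffinitySketch {

    // A container currently held by the scheduler.
    static final class Container {
        final String id, host, rack;
        final boolean inUse;

        Container(String id, String host, String rack, boolean inUse) {
            this.id = id;
            this.host = host;
            this.rack = rack;
            this.inUse = inUse;
        }
    }

    // A request that carries all three locality hints, most preferred first.
    static final class Request {
        final String containerId, host, rack;

        Request(String containerId, String host, String rack) {
            this.containerId = containerId;
            this.host = host;
            this.rack = rack;
        }
    }

    // Try the exact container, then any idle container on the same host,
    // then any idle container on the same rack. In-use containers are skipped.
    static Optional<Container> allocate(Request req, List<Container> held) {
        for (Container c : held) {
            if (!c.inUse && c.id.equals(req.containerId)) {
                return Optional.of(c);
            }
        }
        for (Container c : held) {
            if (!c.inUse && c.host.equals(req.host)) {
                return Optional.of(c);
            }
        }
        for (Container c : held) {
            if (!c.inUse && c.rack.equals(req.rack)) {
                return Optional.of(c);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Container> held = Arrays.asList(
            new Container("c1", "h1", "r1", true),
            new Container("c2", "h1", "r1", false));
        // c1 is in use, so the request falls back to its host.
        Optional<Container> picked = allocate(new Request("c1", "h1", "r1"), held);
        System.out.println(picked.map(c -> c.id).orElse("none"));
    }
}
```

Carrying the host and rack in the same request as the container id is what lets the fallback work after the container has been released; with only a container id, there is nothing left to match against.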