[
https://issues.apache.org/jira/browse/TEZ-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ming Ma updated TEZ-3270:
-------------------------
Description:
One of the scheduling factors is data locality. For a given completed task in
A, it is better to run its dependent tasks in B on the same host/container to
reduce network data transfer between the two. In addition, it might be better
to schedule a task that reads a larger partition ahead of a task that reads a
smaller one. For example, in the above fair routing diagram, after task A1 has
completed, task B1 and/or task B2 can be scheduled on the same host/container
as task A1, and B2 has higher priority than B1 given that P2 is larger than P1.
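The ordering described above can be sketched in Java as follows. This is only
an illustration under stated assumptions, not code from Tez: the class
LocalityAwareOrdering, the Candidate type, and the byte sizes used for P1 and
P2 are all hypothetical names and values chosen for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Tez code base. */
final class LocalityAwareOrdering {

    /** A destination task that reads one partition of a completed source task. */
    static final class Candidate {
        final String taskName;      // e.g. "B1"
        final long partitionBytes;  // size of the partition it would fetch, e.g. P1

        Candidate(String taskName, long partitionBytes) {
            this.taskName = taskName;
            this.partitionBytes = partitionBytes;
        }
    }

    /**
     * Orders the dependents of a just-completed source task so that the task
     * reading the largest partition is offered the source's host/container first.
     */
    static List<String> orderForHost(List<Candidate> dependents) {
        List<Candidate> sorted = new ArrayList<>(dependents);
        // Largest partition first.
        sorted.sort(Comparator.comparingLong((Candidate c) -> c.partitionBytes).reversed());
        List<String> names = new ArrayList<>();
        for (Candidate c : sorted) {
            names.add(c.taskName);
        }
        return names;
    }

    public static void main(String[] args) {
        // A1 has completed; assume B1 reads P1 (100 bytes) and B2 reads P2 (300 bytes).
        List<Candidate> dependents = new ArrayList<>();
        dependents.add(new Candidate("B1", 100L));
        dependents.add(new Candidate("B2", 300L));
        System.out.println(orderForHost(dependents)); // prints [B2, B1]
    }
}
```

A real scheduler would also have to decide what to do when the preferred
host/container is busy; the sketch covers only the priority order.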
was:
The scheduling considers the following factors:
* Destination tasks’ dependency on source tasks, as defined by the routing
policy.
* Data locality.
In the regular scatter-gather routing policy, each destination task depends on
all source tasks. If slowstart is configured to be less than 1.0, destination
tasks can be started as soon as a portion of the source tasks have completed,
and can fetch data from all those completed source tasks.
In fair routing, a destination task might depend on only a subset of the source
tasks, so there is no point in scheduling a destination task if none of the
source tasks it depends on have completed.
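The difference between the two readiness rules can be sketched in Java. This
is a minimal illustration, not the Tez implementation: RoutingReadiness, its
method names, and the task names in the usage example are hypothetical, and
slowstart is modeled here as a plain fraction of completed source tasks.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; these names are not part of the Tez API. */
final class RoutingReadiness {

    /**
     * Scatter-gather: every destination task depends on all source tasks, so a
     * single global fraction of completed source tasks decides readiness.
     */
    static boolean scatterGatherReady(int completedSources, int totalSources, double slowStart) {
        if (totalSources == 0) {
            return true; // nothing to wait for
        }
        return (double) completedSources / totalSources >= slowStart;
    }

    /**
     * Fair routing: a destination task is worth scheduling only if at least one
     * of the source tasks it depends on has completed.
     */
    static boolean fairRoutingReady(Set<String> sourcesOfDestination, Set<String> completedSources) {
        for (String source : sourcesOfDestination) {
            if (completedSources.contains(source)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed dependency for illustration: one destination task reads from A1 and A2.
        Set<String> deps = new HashSet<>(Arrays.asList("A1", "A2"));
        Set<String> onlyA3Done = new HashSet<>(Arrays.asList("A3"));
        Set<String> a1Done = new HashSet<>(Arrays.asList("A1"));

        System.out.println(scatterGatherReady(1, 4, 0.25)); // prints true
        System.out.println(fairRoutingReady(deps, onlyA3Done)); // prints false
        System.out.println(fairRoutingReady(deps, a1Done)); // prints true
    }
}
```

With only A3 completed, the global fraction may already satisfy slowstart, yet
the destination task has nothing to fetch, which is why the fair routing check
is made per destination task.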
> Scheduling policy in fair routing
> ---------------------------------
>
> Key: TEZ-3270
> URL: https://issues.apache.org/jira/browse/TEZ-3270
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Ming Ma