Hi Fabio, 

Regarding the second container assignment, the critical aspect is 
"reusedContainer=true". It is re-using the container used for the parent 
vertex's task, so the container keeps the priority it was originally allocated 
with (2) and the priority of the new request is not relevant. In such cases, 
the priority 4 container will eventually be released without being used.
If you set tez.am.container.reuse.enabled to false, you will see the priority 4 
container being used as expected.

As for distance from root, the approach used in VertexImpl.java is:

      int distanceFromRoot = startEvent.getSourceDistanceFromRoot() + 1;
      if(vertex.distanceFromRoot < distanceFromRoot) {
        vertex.distanceFromRoot = distanceFromRoot;
      }

The above is done in SourceVertexStartedTransition, which is invoked whenever 
a parent vertex has started. A vertex with several parents therefore ends up 
with the largest (parent distance + 1) among them.
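
To make the converging-branches case concrete, here is a minimal standalone sketch of that update rule. It is not the Tez code: the class and method names are made up for illustration, the starting distance of 0 is an assumption, and the priority formula (distance + 1) * 2 is taken from your description rather than checked against the source.

```java
// Hypothetical illustration only; none of these names exist in Tez.
final class DistanceFromRootSketch {

    // Same rule as the snippet above: keep the larger of the current
    // distance and (parent distance + 1).
    static int onSourceVertexStarted(int currentDistance, int sourceDistanceFromRoot) {
        int candidate = sourceDistanceFromRoot + 1;
        return Math.max(currentDistance, candidate);
    }

    // Applies the rule once per parent, in the order the parents start.
    // Assumes a vertex starts with distance 0.
    static int distanceAfterParentsStart(int[] parentDistances) {
        int distance = 0;
        for (int parentDistance : parentDistances) {
            distance = onSourceVertexStarted(distance, parentDistance);
        }
        return distance;
    }

    // Priority formula as described in the question (an assumption here).
    static int priorityFor(int distanceFromRoot) {
        return (distanceFromRoot + 1) * 2;
    }

    public static void main(String[] args) {
        // Parents with priority 4 and 6 correspond to distances 1 and 2.
        int distance = distanceAfterParentsStart(new int[] {1, 2});
        System.out.println("distanceFromRoot=" + distance
                + " priority=" + priorityFor(distance));
    }
}
```

Under those assumptions, a vertex X whose parents have priorities 4 and 6 (distances 1 and 2) gets distance 3 and priority 8, regardless of which parent starts first.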

Hope that helps clarify what is happening. 

— Hitesh


On Dec 5, 2014, at 6:11 AM, Fabio <[email protected]> wrote:

> Hi all,
> while reading the log from the join example (2 source vertexes and a sink 
> vertex) I noticed the following:
> 
> 2014-09-26 21:20:27,021 INFO [TaskSchedulerEventHandlerThread] 
> org.apache.tez.dag.app.rm.YarnTaskSchedulerService: Allocation request for 
> task: attempt_1411734050933_0003_1_00_000000_0 with request: 
> Capability[<memory:1024, vCores:1>]Priority[2] host: node02 rack: null
> ...
> 2014-09-26 21:20:28,237 INFO [DelayedContainerManager] 
> org.apache.tez.dag.app.rm.YarnTaskSchedulerService: Assigning container to 
> task, container=Container: [ContainerId: 
> container_1411734050933_0003_01_000002, NodeId: node02:34192, 
> NodeHttpAddress: node02:8042, Resource: <memory:1024, vCores:1>, Priority: 2, 
> Token: Token { kind: ContainerToken, service: 192.168.56.102:34192 }, ], 
> task=attempt_1411734050933_0003_1_00_000000_0, containerHost=node02, 
> localityMatchType=NodeLocal, matchedLocation=node02, honorLocalityFlags=true, 
> reusedContainer=false, delayedContainers=1, containerResourceMemory=1024, 
> containerResourceVCores=1
> 
> And something similar for the other "parent" vertex, nothing strange here. 
> But this is about the "joiner" vertex:
> 
> 2014-09-26 21:21:24,292 INFO [TaskSchedulerEventHandlerThread] 
> org.apache.tez.dag.app.rm.YarnTaskSchedulerService: Allocation request for 
> task: attempt_1411734050933_0003_1_02_000000_0 with request: 
> Capability[<memory:1024, vCores:1>]Priority[4] host: null rack: null
> ...
> 2014-09-26 21:21:24,318 INFO [DelayedContainerManager] 
> org.apache.tez.dag.app.rm.YarnTaskSchedulerService: Assigning container to 
> task, container=Container: [ContainerId: 
> container_1411734050933_0003_01_000003, NodeId: node02:34192, 
> NodeHttpAddress: node02:8042, Resource: <memory:1024, vCores:1>, Priority: 2, 
> Token: Token { kind: ContainerToken, service: 192.168.56.102:34192 }, ], 
> task=attempt_1411734050933_0003_1_02_000000_0, containerHost=node02, 
> localityMatchType=NodeLocal, matchedLocation=node02, honorLocalityFlags=true, 
> reusedContainer=true, delayedContainers=2, containerResourceMemory=1024, 
> containerResourceVCores=1
> 
> Here the priority of the obtained container is still 2, but I was expecting 
> to find the same priority of the request (4). So what is the priority of the 
> obtained container, since it seems to be 2 regardless of the request? Is it 
> used by Tez? How?
> 
> Another question I would like to ask is: I see the priority is calculated as 
> (vertexDistanceFromRoot + 1) * 2, where vertexDistanceFromRoot is (I think) 
> the distance from the vertex which got its input from a file, or at least not 
> from another vertex. But I haven't been able to understand how this value is 
> set, especially in case two (or more) branches converging in a common vertex 
> X have not the same "depth"... in other words: what happens if X has two 
> parents, one with priority 4 and one 6? Which will be its priority?
> 
> Thanks in advance
> 
> Fabio
