That is similar to MR’s approach.
When a task requests a container, it specifies that it needs to be
scheduled on a particular set of hosts/racks via a TaskLocationHint.
The TaskLocationHint is converted into a container ask to YARN, i.e. one
container on any of the hosts or racks specified in the location hint.
See MRInputHelpers and TaskLocationHint classes for more details.
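
As a rough sketch of how a hint might be attached to a vertex via the DAG API (the
"vertex" variable and the host/rack names below are made up; the class and method
names are from org.apache.tez.dag.api as I remember them, so double-check against
your Tez version):

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;

    import org.apache.tez.dag.api.TaskLocationHint;
    import org.apache.tez.dag.api.VertexLocationHint;

    // One TaskLocationHint per task in the vertex, in task-index order.
    // Here, task 0 prefers hostA or hostB and can fall back to /rack1.
    TaskLocationHint hint = TaskLocationHint.createTaskLocationHint(
        new HashSet<>(Arrays.asList("hostA", "hostB")),
        Collections.singleton("/rack1"));

    List<TaskLocationHint> hints = Collections.singletonList(hint);
    // "vertex" is a Vertex you have already created elsewhere.
    vertex.setLocationHint(VertexLocationHint.create(hints));

Each such hint turns into an ask for one container on any of the listed hosts or racks.
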
Once a container is allocated to Tez, Tez will try its best to achieve host-level
locality. After a certain back-off period, if rack-level fallback is enabled,
it will try to match an unassigned task to a rack, and eventually fall back
to an “any” match if that fallback option is enabled.
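
The back-off and fallback behavior is driven by AM configuration. A minimal sketch,
assuming the property names below (they are what I recall from TezConfiguration;
please verify the exact names and defaults for your version):

    import org.apache.tez.dag.api.TezConfiguration;

    TezConfiguration conf = new TezConfiguration();
    // How long to hold a container for a host-local match before relaxing locality.
    conf.setLong("tez.am.container.reuse.locality.delay-allocation-millis", 250L);
    // Allow falling back from host-local to rack-local matches.
    conf.setBoolean("tez.am.container.reuse.rack-fallback.enabled", true);
    // Allow falling back to an "any" (non-local) match.
    conf.setBoolean("tez.am.container.reuse.non-local-fallback.enabled", false);
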
— Hitesh
> On Oct 28, 2016, at 12:10 PM, Madhusudan Ramanna wrote:
>
> Hello Hitesh,
>
> Thanks for that explanation! Could you clarify how the locality of input
> splits is used?
>
> thanks,
> Madhu
>
>
> On Thursday, October 27, 2016 11:19 PM, Hitesh Shah wrote:
>
>
> Hello Madhusudan,
>
> I will start with how container allocations work and make my way back to
> explaining splits.
>
> At the lowest level, each vertex will have decided to run a certain number of tasks.
> At a high level, when a task is ready to run, it tells the global DAG
> scheduler about its requirements (i.e. what kind of container resources it
> needs; additional container specs such as env, local resources, etc.; and
> where it wants to be executed for locality).
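>
> To make that concrete, a rough sketch of where those requirements come from when
> you define a vertex (the vertex name, processor class and sizes below are made up;
> the factory methods are from the Tez/YARN APIs, but verify against your version):
>
>     import org.apache.hadoop.yarn.api.records.Resource;
>     import org.apache.tez.dag.api.ProcessorDescriptor;
>     import org.apache.tez.dag.api.Vertex;
>
>     // A vertex that runs 10 tasks, each asking for a 2 GB / 1 vcore container.
>     Vertex mapVertex = Vertex.create("MyMapVertex",
>         ProcessorDescriptor.create("com.example.MyProcessor"),  // hypothetical processor
>         10,
>         Resource.newInstance(2048, 1));
>     // Env, local resources and location hints can also be set on the vertex.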
>
> The global scheduler then asks the ResourceManager for as many containers
> as there are pending tasks. When YARN allocates a container to the Tez AM,
> the Tez AM decides which is the highest-priority task (vertices at the top
> of the tree run first) that matches the allocated container and runs that
> task on it. Re-used containers are preferred over new containers because of
> JVM launch costs. Also, YARN may not give Tez all the containers it
> requested, so Tez will make do with whatever it has. It may end up releasing
> containers that don’t match any of the tasks that still need to run.
>
> Now, let us take a “map” vertex which is reading data from HDFS. In MR, each
> task represented one split (or a group of splits if you use something like
> CombineFileInputFormat, as Hive does). In Tez, there are a couple of differences:
>
> 1) The InputFormat is invoked in the AM, i.e. splits are calculated in the AM
> (splits can be computed on the client, but most folks now compute them in the AM).
> 2) Splits are grouped based on the wave configurations (
> https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works
> ).
>
> Each grouped split will be mapped to one task. This will then define what
> kind of container is requested.
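>
> A minimal sketch of the grouping knobs involved (property names are what I recall
> from the split grouper; exact names and defaults may differ across versions):
>
>     import org.apache.tez.dag.api.TezConfiguration;
>
>     TezConfiguration conf = new TezConfiguration();
>     // Target number of "waves" of tasks relative to available cluster capacity.
>     conf.setFloat("tez.grouping.split-waves", 1.7f);
>     // Lower and upper bounds on the size of a grouped split, in bytes.
>     conf.setLong("tez.grouping.min-size", 16L * 1024 * 1024);
>     conf.setLong("tez.grouping.max-size", 1024L * 1024 * 1024);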
>
> Let us know if you have more questions.
>
> thanks
> — Hitesh
>
>
> > On Oct 27, 2016, at 5:06 PM, Madhusudan Ramanna wrote:
> >
> > Hello Folks,
> >
> > We have a native Tez application. My question is mainly about MR inputs
> > and Tez-allocated containers. How does Tez grab containers? Is it one per
> > input split? Could someone shed some light on this?
> >
> > thanks,
> > Madhu
>
>