That is similar to MR’s approach. When a task requests containers, it specifies that it needs to be scheduled on a particular set of hosts/racks via a TaskLocationHint. The TaskLocationHint is converted into a container ask to YARN, i.e. one container on any of the hosts or racks specified in the location hint.
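A minimal plain-Java sketch of that mapping (the class and field names here are illustrative stand-ins, not the actual Tez/YARN types, which live in org.apache.tez.dag.api and org.apache.hadoop.yarn.api):

```java
import java.util.Set;

public class LocationHintDemo {
    // Stand-in for TaskLocationHint: the hosts and racks a task prefers.
    record TaskHint(Set<String> hosts, Set<String> racks) {}

    // Stand-in for the YARN ask: one container on any of the listed locations.
    record ContainerAsk(int numContainers, Set<String> hosts, Set<String> racks) {}

    // One pending task -> one container request covering all hinted locations.
    static ContainerAsk toAsk(TaskHint hint) {
        return new ContainerAsk(1, hint.hosts(), hint.racks());
    }

    public static void main(String[] args) {
        TaskHint hint = new TaskHint(Set.of("node1", "node2"), Set.of("/rack1"));
        ContainerAsk ask = toAsk(hint);
        System.out.println(ask.numContainers() + " container on any of "
                + ask.hosts() + " or " + ask.racks());
    }
}
```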
See the MRInputHelpers and TaskLocationHint classes for more details. Once a container is allocated to Tez, it will try its best to achieve host-level locality. After a certain back-off period, if rack-level fallbacks are enabled, it will try to match an unassigned task at the rack level, and eventually fall back to an “any” match if that fallback option is enabled.

— Hitesh

> On Oct 28, 2016, at 12:10 PM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:
>
> Hello Hitesh,
>
> Thanks for that explanation! Could you clarify how the locality of input splits is used?
>
> thanks,
> Madhu
>
>
> On Thursday, October 27, 2016 11:19 PM, Hitesh Shah <hit...@apache.org> wrote:
>
> Hello Madhusudan,
>
> I will start with how container allocations work and make my way back to explaining splits.
>
> At the lowest level, each vertex will have decided to run a number of tasks. At a high level, when a task is ready to run, it tells the global DAG scheduler about its requirements: what kind of container resources it needs; additional container specs such as env, local resources, etc.; and where it wants to be executed for locality.
>
> The global scheduler then asks the ResourceManager for as many containers as there are pending tasks. When YARN allocates a container to the Tez AM, the AM picks the highest-priority task (vertices at the top of the tree run first) that matches the allocated container and runs that task on it. Re-used containers are given higher priority than new containers because of JVM launch costs. YARN may not give Tez all the containers it requested, so Tez makes do with whatever it has. It may end up releasing containers that do not match the requirements of the tasks still waiting to run.
>
> Now, let us take a “map” vertex which is reading data from HDFS. In MR, each task represented one split (or a group of splits if you used something like Hive’s CombineFileInputFormat).
> In Tez, there are a couple of differences:
>
> 1) The InputFormat is invoked in the AM, i.e. splits are calculated in the AM (this can be done on the client, but most folks now run it in the AM).
> 2) Splits are grouped based on the wave configurations (https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works).
>
> Each grouped split is mapped to one task. This then defines what kind of container is requested.
>
> Let us know if you have more questions.
>
> thanks
> — Hitesh
>
> > On Oct 27, 2016, at 5:06 PM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:
> >
> > Hello Folks,
> >
> > We have a native Tez application. My question is mainly about MR inputs and Tez-allocated containers. How does Tez grab containers? Is it one per input split? Could someone shed some light on this?
> >
> > thanks,
> > Madhu
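The host-level, then rack-level, then “any” fall-back described earlier in the thread can be sketched in plain Java. This is illustrative only: the real matching in the Tez AM scheduler also applies the time-based back-off between levels, which is omitted here.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class LocalityFallbackDemo {
    // Stand-in for an unassigned task with its location hint.
    record PendingTask(String name, Set<String> hosts, Set<String> racks) {}

    // Given an allocated container's host/rack, pick a pending task:
    // host match first, then rack (if enabled), then any (if enabled).
    static Optional<PendingTask> match(String containerHost, String containerRack,
                                       List<PendingTask> pending,
                                       boolean rackFallback, boolean anyFallback) {
        // 1) Host-level locality.
        for (PendingTask t : pending)
            if (t.hosts().contains(containerHost)) return Optional.of(t);
        // 2) Rack-level fall-back (after a back-off in the real scheduler).
        if (rackFallback)
            for (PendingTask t : pending)
                if (t.racks().contains(containerRack)) return Optional.of(t);
        // 3) "Any" match as a last resort.
        if (anyFallback && !pending.isEmpty())
            return Optional.of(pending.get(0));
        return Optional.empty();
    }

    public static void main(String[] args) {
        PendingTask a = new PendingTask("a", Set.of("node1"), Set.of("/rack1"));
        PendingTask b = new PendingTask("b", Set.of("node9"), Set.of("/rack2"));
        // No host match for node5, but b's rack matches.
        System.out.println(match("node5", "/rack2", List.of(a, b), true, true)
                .map(PendingTask::name).orElse("none"));
    }
}
```

If neither fall-back is enabled and no host matches, the container goes unmatched, which corresponds to the case where Tez may release it.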
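The wave-based split grouping mentioned in point 2 of the quoted reply can be sketched roughly as follows. This is a simplification under the assumption that the desired task count is derived from cluster capacity times a waves factor; the real grouping in Tez (see the wiki link above) also honors locality and configured min/max group sizes.

```java
public class WaveGroupingDemo {
    // Desired number of tasks ~= available container slots * waves factor.
    static int desiredTasks(int clusterSlots, double waves) {
        return Math.max(1, (int) Math.ceil(clusterSlots * waves));
    }

    // Pack numSplits raw splits into at most numGroups grouped splits,
    // as evenly as possible; each grouped split becomes one task.
    static int[] groupSizes(int numSplits, int numGroups) {
        int k = Math.min(numGroups, numSplits);
        int[] sizes = new int[k];
        for (int i = 0; i < numSplits; i++) sizes[i % k]++;
        return sizes;
    }

    public static void main(String[] args) {
        int tasks = desiredTasks(10, 1.5); // e.g. 10 slots, 1.5 waves
        int[] sizes = groupSizes(100, tasks);
        System.out.println(tasks + " tasks; first group holds " + sizes[0] + " splits");
    }
}
```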