Re: Tez containers and input splits

2016-10-28 Thread Hitesh Shah
That is similar to MR’s approach. 

When a task requests containers, it specifies that it needs to be 
scheduled on a particular set of hosts/racks via a TaskLocationHint. 
The TaskLocationHint is converted into a container ask to YARN, i.e. one 
container on any of the hosts or racks specified in the location hint. 

See MRInputHelpers and TaskLocationHint classes for more details. 
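 
For illustration, a minimal sketch of constructing one ( "node1" and "/rack1" 
below are placeholder names; method signatures from the Tez Java API, so 
double-check against your version ): 

  import java.util.Collections;
  import org.apache.tez.dag.api.TaskLocationHint;
  import org.apache.tez.dag.api.VertexLocationHint;

  public class LocationHintSketch {
    public static void main(String[] args) {
      // Ask for host "node1", with "/rack1" as the rack-level alternative.
      TaskLocationHint hint = TaskLocationHint.createTaskLocationHint(
          Collections.singleton("node1"),    // candidate hosts
          Collections.singleton("/rack1"));  // candidate racks

      // A VertexLocationHint carries one TaskLocationHint per task,
      // in task-index order.
      VertexLocationHint vertexHints =
          VertexLocationHint.create(Collections.singletonList(hint));
      System.out.println(vertexHints.getTaskLocationHints().size());
    }
  }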

Once a container is allocated to Tez, it will try its best to honor host-level 
locality. After a certain back-off period, if rack-level fallbacks are enabled, 
it will try to match an unassigned task at the rack level, and eventually fall 
back to an “any” match if that fallback option is enabled. 
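 
For reference, these knobs live in TezConfiguration, roughly as below 
( constant names from recent Tez releases; defaults may differ by version ): 

  import org.apache.tez.dag.api.TezConfiguration;

  public class LocalityFallbackSketch {
    public static void main(String[] args) {
      TezConfiguration conf = new TezConfiguration();
      // How long to hold out for a host-local match before backing off.
      conf.setLong(TezConfiguration
          .TEZ_AM_CONTAINER_REUSE_LOCALITY_DELAY_ALLOCATION_MILLIS, 250L);
      // Allow rack-local matches once the delay expires.
      conf.setBoolean(
          TezConfiguration.TEZ_AM_CONTAINER_REUSE_RACK_FALLBACK_ENABLED, true);
      // Allow "any" matches as the last resort (off by default).
      conf.setBoolean(
          TezConfiguration.TEZ_AM_CONTAINER_REUSE_NON_LOCAL_FALLBACK_ENABLED,
          false);
    }
  }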

— Hitesh 

> On Oct 28, 2016, at 12:10 PM, Madhusudan Ramanna  wrote:
> 
> Hello Hitesh,
> 
> Thanks for that explanation! Could you clarify how the locality of input 
> splits is used? 
> 
> thanks,
> Madhu
> 
> 
> On Thursday, October 27, 2016 11:19 PM, Hitesh Shah  wrote:
> 
> 
> Hello Madhusudan,
> 
> I will start with how container allocations work and make my way back to 
> explaining splits. 
> 
> At the lowest level, each vertex will have decided to run a number of tasks. 
> At a high level, when a task is ready to run, it tells the global DAG 
> scheduler about its requirements ( i.e. what kind of container resources it 
> needs, additional container specs such as env, local resources, etc., and 
> where it wants to be executed for locality ). 
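> 
> As a sketch of what that looks like at the API level ( 
> "com.example.MyProcessor" is a placeholder class name; Vertex/Resource 
> signatures from the Tez Java client ): 
> 
>   import org.apache.hadoop.yarn.api.records.Resource;
>   import org.apache.tez.dag.api.ProcessorDescriptor;
>   import org.apache.tez.dag.api.Vertex;
> 
>   public class VertexResourceSketch {
>     public static void main(String[] args) {
>       ProcessorDescriptor proc =
>           ProcessorDescriptor.create("com.example.MyProcessor");
>       // 10 tasks, each asking YARN for a 1 GB / 1 vcore container.
>       Vertex mapVertex = Vertex.create(
>           "map", proc, 10, Resource.newInstance(1024, 1));
>       System.out.println(mapVertex.getName());
>     }
>   }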
> 
> The global scheduler then requests the ResourceManager for as many containers 
> as there are pending tasks. When YARN allocates a container to the Tez AM, 
> the Tez AM picks the highest-priority task ( vertices at the top of the tree 
> run first ) that matches the allocated container and runs that task on it. 
> Re-used containers are preferred over new containers because of JVM launch 
> costs. YARN may not give Tez all the containers it requested, so Tez will 
> make do with whatever it has; it may also release allocated containers that 
> don’t match the requirements of the tasks still waiting to run.
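> 
> As an aside, container re-use itself is configurable; a rough sketch ( 
> constant names from TezConfiguration; verify against your release ): 
> 
>   import org.apache.tez.dag.api.TezConfiguration;
> 
>   public class ContainerReuseSketch {
>     public static void main(String[] args) {
>       TezConfiguration conf = new TezConfiguration();
>       // Run multiple tasks in one container to amortize JVM launch cost.
>       conf.setBoolean(TezConfiguration.TEZ_AM_CONTAINER_REUSE_ENABLED, true);
>       // How long an idle container is held before release back to YARN.
>       conf.setLong(TezConfiguration
>           .TEZ_AM_CONTAINER_IDLE_RELEASE_TIMEOUT_MIN_MILLIS, 5000L);
>     }
>   }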
> 
> Now, let us take a “map” vertex which is reading data from HDFS. In MR, each 
> task represented one split ( or a group if you use something like Hive’s 
> CombineFileInputFormat ). In Tez, there are a couple of differences: 
>   
> 1) The InputFormat is invoked in the AM, i.e. splits are calculated in the AM 
> ( this can be done on the client, but most folks now run it in the AM ).
> 2) Splits are grouped based on the wave configurations ( 
> https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works
>  ).
> 
> Each grouped split will be mapped to one task. This will then define what 
> kind of container is requested. 
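> 
> To make that concrete, a sketch of wiring up a grouped input ( the 
> TextInputFormat choice, the grouping values, and "/path/to/input" are 
> placeholders; builder methods from the tez-mapreduce module ): 
> 
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
>   import org.apache.tez.dag.api.DataSourceDescriptor;
>   import org.apache.tez.mapreduce.input.MRInput;
> 
>   public class GroupedSplitsSketch {
>     public static void main(String[] args) {
>       Configuration conf = new Configuration();
>       // Knobs read by the split grouper ( see the wiki page above ).
>       conf.setDouble("tez.grouping.split-waves", 1.7);
>       conf.setLong("tez.grouping.min-size", 16L * 1024 * 1024);
>       conf.setLong("tez.grouping.max-size", 1024L * 1024 * 1024);
> 
>       // groupSplits(true) makes the AM group splits into tasks.
>       DataSourceDescriptor ds = MRInput
>           .createConfigBuilder(conf, TextInputFormat.class, "/path/to/input")
>           .groupSplits(true)
>           .build();
>       // ds would then be attached via vertex.addDataSource("MRInput", ds).
>     }
>   }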
> 
> Let us know if you have more questions.
> 
> thanks
> — Hitesh
> 
> 
> > On Oct 27, 2016, at 5:06 PM, Madhusudan Ramanna  wrote:
> > 
> > Hello Folks,
> > 
> > We have a native Tez application.  My question is mainly about MR inputs 
> > and tez allocated containers.  How does tez grab containers ? Is it one per 
> > input split ?  Could someone shed some light on this ?
> > 
> > thanks,
> > Madhu
> 
> 