A clarification for (2): you can share an AM across multiple users by using some 
form of proxy users and passing in the delegation tokens required to talk to 
services such as HDFS. Also, when its doAs mode is set to false, HiveServer2 
runs all AMs as the user hive but can still run queries on behalf of many 
different users by doing its security check at the “perimeter”. 

— Hitesh

On Mar 9, 2015, at 10:30 AM, Bikas Saha <bi...@hortonworks.com> wrote:

> >>(1)- For every TEZ AM it is possible to launch just a single query/DAG at a 
> >>time. So within a given AM several DAGs can be executed only in sequential 
> >>order (a.k.a. a session), not in parallel. To execute DAGs in parallel we 
> >>always need several AMs.
>  
> Correct. Today a single AM will accept new DAGs when the AM is idle and run 
> them. An AM is idle when no DAG is running.
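
To make the idle rule above concrete, here is a minimal toy model in Python. It is not the Tez API: the class SessionAM and its methods are hypothetical names made up for illustration, and rejecting a second DAG with an exception is an assumption of the sketch (the thread only says that new DAGs are accepted when the AM is idle).

```python
class SessionAM:
    """Toy model of one AM in session mode: at most one DAG runs at a time."""

    def __init__(self, name):
        self.name = name
        self.running = None      # name of the DAG currently running, or None
        self.completed = []      # DAGs finished so far, in submission order

    def is_idle(self):
        # The AM is idle when no DAG is running.
        return self.running is None

    def submit(self, dag):
        # Assumption: a busy AM rejects the DAG instead of queueing it.
        if not self.is_idle():
            raise RuntimeError(f"{self.name} is busy with {self.running}")
        self.running = dag

    def finish(self):
        # Mark the running DAG as done, which makes the AM idle again.
        self.completed.append(self.running)
        self.running = None


am = SessionAM("am-1")
am.submit("dag-A")
try:
    am.submit("dag-B")           # rejected: dag-A is still running
    rejected = False
except RuntimeError:
    rejected = True
am.finish()
am.submit("dag-B")               # accepted now that the AM is idle again
```

In this model, running dag-A and dag-B at the same time would need a second SessionAM instance, which is the point made in (1).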
>  
> >>(2)- The AM is user-specific, and each user is expected to run queries 
> >>through its own AM (or on multiple AMs if there is a need for parallelism). 
> 
> Correct in a secure cluster. In a non-secure cluster an AM runs as the yarn 
> user, which is common to all AMs. In a secure cluster, any entity that has 
> been given a client token (for that app attempt) by the RM can communicate 
> with the AM. In a non-secure cluster, any entity that has obtained the AM's 
> connection information from the RM can communicate with the AM. The AM has an 
> additional set of ACLs that determine who can submit, view, and modify DAGs.
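
The two gates described above (reaching the AM at all, then passing its ACLs) can be sketched as two small checks. This is a simplified model, not Tez or YARN code: the function names, the dictionary layout of the ACLs, and the users are all hypothetical and chosen only for illustration.

```python
def can_reach_am(secure, has_client_token, has_connection_info):
    """First gate: who may talk to the AM at all."""
    if secure:
        # Secure cluster: a client token issued by the RM for this
        # app attempt is required.
        return has_client_token
    # Non-secure cluster: knowing the AM's connection information is enough.
    return has_connection_info


def is_allowed(user, action, acls):
    """Second gate: the AM's own ACLs for submit, view, and modify."""
    return user in acls.get(action, set())


# Hypothetical ACL contents, for illustration only.
acls = {"submit": {"hive"}, "view": {"hive", "alice"}, "modify": {"hive"}}

print(can_reach_am(secure=True, has_client_token=False, has_connection_info=True))
print(is_allowed("alice", "view", acls), is_allowed("alice", "modify", acls))
```

A caller has to pass both checks: in a secure cluster, knowing the AM's address without a client token is not enough, and a user who can reach the AM may still be limited to viewing DAGs.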
>  
> >>(3)- Several users can submit their DAGs as the same user (e.g.: through 
> >>hiveserver2), but in this case we will still have several AMs.
> 
> Correct. However, the number of AMs will be determined by the policy of the 
> mediating server. It may choose to launch a new AM for every new DAG, or to 
> queue DAGs up and round-robin them through a limited set of AMs, etc.
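
As an illustration of the second policy mentioned above, the sketch below shows a toy mediating server that rotates incoming DAGs over a fixed set of AMs. It is not how HiveServer2 is implemented: the class RoundRobinPool and the AM and query names are hypothetical, and the per-AM FIFO queue is an assumption that stands in for each AM running its DAGs one at a time.

```python
from collections import deque
from itertools import cycle


class RoundRobinPool:
    """Toy mediating server: spreads DAGs over a fixed set of AMs."""

    def __init__(self, am_names):
        # One FIFO queue per AM; each AM works through its queue sequentially.
        self.queues = {name: deque() for name in am_names}
        self._next_am = cycle(am_names)

    def submit(self, dag):
        # Pick the next AM in rotation and queue the DAG behind its earlier ones.
        am = next(self._next_am)
        self.queues[am].append(dag)
        return am


pool = RoundRobinPool(["am-1", "am-2"])
placement = [pool.submit(d) for d in ["q1", "q2", "q3"]]
print(placement)
```

With two AMs, at most two DAGs make progress at once, however many users submit through the server. The other policy mentioned, one new AM per DAG, trades that cap for the cost of launching an AM every time.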
>  
> Bikas
>  
> From: Fabio C. [mailto:anyte...@gmail.com] 
> Sent: Monday, March 09, 2015 4:31 AM
> To: user@tez.apache.org; u...@hive.apache.org
> Subject: Parallel queries/dags running in same AM?
>  
> Hi all,
> I've been using Tez on Hive, and I overheard a conversation that conflicts 
> with my current understanding. Can anyone confirm the following 
> statements?
> (1)- For every TEZ AM it is possible to launch just a single query/DAG at a 
> time. So within a given AM several DAGs can be executed only in sequential 
> order (a.k.a. a session), not in parallel. To execute DAGs in parallel we 
> always need several AMs.
> (2)- The AM is user-specific, and each user is expected to run queries 
> through its own AM (or on multiple AMs if there is a need for parallelism). 
> (3)- Several users can submit their DAGs as the same user (e.g.: through 
> hiveserver2), but in this case we will still have several AMs.
> 
> Thanks in advance
> 
> Fabio
