Thanks guys, always helpful.

On Mon, Mar 9, 2015 at 7:37 PM, Hitesh Shah <[email protected]> wrote:

> A clarification for (2): you can share an AM across multiple users by
> using a form of proxy users and passing in the required delegation tokens to
> talk to various services such as HDFS. Also, when the doAs mode is set to
> false, HiveServer2 runs all AMs as user hive but can effectively run queries
> for various different users by doing its security check at the “perimeter”.
>
> — Hitesh
>
> On Mar 9, 2015, at 10:30 AM, Bikas Saha <[email protected]> wrote:
>
> > >>(1)- For every TEZ AM it is possible to launch just a single query/DAG
> at a time. So within a given AM several DAGs can be executed only in
> sequential order (a.k.a. a session), not in parallel. To execute DAGs in
> parallel we always need several AMs.
> >
> > Correct. Today a single AM will accept new DAGs when the AM is idle and
> run them. An AM is idle when no DAG is running.
> >
> > >>(2)- The AM is user-specific, and each user is expected to run queries
> through its own AM (or on multiple AMs if there is a need for parallelism).
> >
> > Correct in a secure cluster. In a non-secure cluster an AM runs as the
> yarn user which is common to all AMs. In a secure cluster, any entity that
> has been given a client token (for that app attempt) by the RM, can
> communicate with the AM. In a non-secure cluster, any entity that has
> obtained the AM's connection information from the RM can communicate with
> the AM. The AM has an additional set of ACLs that determine who can
> submit, view, and modify DAGs.
> >
> > >>(3)- Several users can submit their DAGs as the same user (e.g.:
> through HiveServer2), but in this case we will still have several AMs.
> >
> > Correct. However, the number of AMs will be determined by the policy of
> the mediating server. It may choose to launch a new AM for every new DAG.
> Or queue up and round robin through a limited set of AMs, etc.
> >
> > Bikas
> >
> > From: Fabio C. [mailto:[email protected]]
> > Sent: Monday, March 09, 2015 4:31 AM
> > To: [email protected]; [email protected]
> > Subject: Parallel queries/dags running in same AM?
> >
> > Hi all,
> > I've been using Tez on Hive, and I overheard a conversation that does
> not match my current understanding. Can anyone confirm the following
> statements?
> > (1)- For every TEZ AM it is possible to launch just a single query/DAG
> at a time. So within a given AM several DAGs can be executed only in
> sequential order (a.k.a. a session), not in parallel. To execute DAGs in
> parallel we always need several AMs.
> > (2)- The AM is user-specific, and each user is expected to run queries
> through its own AM (or on multiple AMs if there is a need for parallelism).
> > (3)- Several users can submit their DAGs as the same user (e.g.: through
> HiveServer2), but in this case we will still have several AMs.
> >
> > Thanks in advance
> >
> > Fabio
>
>
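
For illustration only, here is a minimal sketch of the "queue up and round
robin through a limited set of AMs" policy that Bikas describes for a
mediating server. It is a toy model, not Tez or HiveServer2 code: the names
AppMaster, RoundRobinPool, submit and step are all hypothetical, and the only
behaviour taken from the thread is that an AM runs one DAG at a time and
accepts a new one only when it is idle.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import cycle
from typing import Optional


@dataclass
class AppMaster:
    """Hypothetical stand-in for one AM: at most one DAG runs at a time."""
    name: str
    queue: deque = field(default_factory=deque)  # DAGs waiting for this AM
    running: Optional[str] = None                # DAG currently executing

    def is_idle(self) -> bool:
        # Per the thread: an AM is idle when no DAG is running.
        return self.running is None

    def step(self) -> Optional[str]:
        """Finish the running DAG (if any) and start the next queued one."""
        finished = self.running
        self.running = self.queue.popleft() if self.queue else None
        return finished


class RoundRobinPool:
    """Hypothetical mediating-server policy: a fixed set of AMs, round robin."""

    def __init__(self, size: int) -> None:
        self.ams = [AppMaster(f"am-{i}") for i in range(size)]
        self._next = cycle(self.ams)

    def submit(self, dag: str) -> str:
        """Assign the DAG to the next AM in turn; queue it if that AM is busy."""
        am = next(self._next)
        if am.is_idle():
            am.running = dag
        else:
            am.queue.append(dag)
        return am.name
```

With a pool of two AMs, three submissions land on am-0, am-1 and am-0 again;
the third DAG waits in am-0's queue until the first one finishes, so DAGs on
the same AM run sequentially while the two AMs run in parallel. Launching a
new AM for every DAG, the other policy mentioned above, would correspond to
an unbounded pool.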
