unsubscribe

Thanks,
Madhav

-----Original Message-----
From: Yaniv Rodenski <ya...@shinto.io> 
Sent: Tuesday, October 23, 2018 4:51 PM
To: dev@amaterasu.apache.org
Cc: d...@amaterasu.incubator.apache.org
Subject: EXT: Re: [DISCUSS] Dependencies resolution and action level dependencies

Excellent,

I've added:
AMATERASU-53 - Support action level dependencies 
<https://issues.apache.org/jira/browse/AMATERASU-53>
AMATERASU-54 - Use Docker for Mesos containerization 
<https://issues.apache.org/jira/browse/AMATERASU-54>
I suggest we review after AMATERASU-54 how to approach the YARN implementation.

Cheers,
Yaniv

On Tue, Oct 23, 2018 at 9:51 PM Yariv Triffon <yar...@gmail.com> wrote:

> Hi Yaniv,
> I'm good to grab the task of moving Mesos to use Docker containers.
>
> Cheers,
> Yariv
>
> On Tue, Oct 23, 2018 at 5:13 PM Yaniv Rodenski <ya...@shinto.io> wrote:
>
> > Thanks, Kirupa,
> >
> > I'll create the JIRA tasks shortly and assign that one to you.
> >
> >
> >
> > On Tue, Oct 23, 2018 at 5:09 PM Kirupa Devarajan <kirupagara...@gmail.com> wrote:
> >
> > > Hi Yaniv,
> > >
> > > I am happy to pick up the following task
> > >
> > > 1. Add to the JobManager the functionality to read action level 
> > > dependencies
> > >
> > > Regards,
> > > Kirupa
> > >
> > > On Tue., 23 Oct. 2018, 11:04 am Yaniv Rodenski, <ya...@shinto.io> wrote:
> > >
> > > > Hi Nadav,
> > > >
> > > > It does make sense. In fact, we already have action level
> > > > resources, but they are limited to the configuration files for
> > > > the container.
> > > > I also think that we need to revisit the way we set those up.
> > > > Currently we use YARN/Mesos to copy dependencies to the containers.
> > > > With YARN 3.0, I think it makes sense to move to Docker as the way
> > > > to manage resources in the containers.
> > > > This should also have performance benefits and will make life
> > > > easier (I hope) when we start working on K8s.
> > > >
> > > > To do this, I think we need to add the following tasks:
> > > > 1. Add to the JobManager the functionality to read action level
> > > > dependencies
> > > > 2. Move from Mesos/YARN containers to Docker (probably at least
> > > > two tasks)
> > > >
> > > > I'll add them to JIRA asap, for version 0.2.1-incubating, if
> > > > everyone is OK with it.
> > > >
> > > > On Sat, Oct 20, 2018 at 6:43 PM Nadav Har Tzvi <nadavhart...@gmail.com> wrote:
> > > >
> > > > > Hey everyone,
> > > > >
> > > > > > Yaniv and I were just discussing how to resolve dependencies in
> > > > > > the new frameworks architecture and integrate the dependencies
> > > > > > with the concrete cluster resource manager (Mesos/YARN).
> > > > > > We rolled with the idea of each runner (or base runner)
> > > > > > performing the dependencies resolution on its own.
> > > > > > So for example, the Spark Scala runner would resolve the required
> > > > > > JARs and do whatever it needs to do with them (e.g. spark-submit
> > > > > > --jars --packages --repositories, etc).
> > > > > > The base Python provider will resolve dependencies and
> > > > > > dynamically generate a requirements.txt file that will be
> > > > > > deployed to the executor.
> > > > > > The handling of the requirements.txt file differs between
> > > > > > different concrete Python runners. For example, a regular Python
> > > > > > runner would simply run pip install, while the pyspark runner
> > > > > > would need to rearrange the dependencies in a way that would be
> > > > > > acceptable to spark-submit
> > > > > > (https://bytes.grubhub.com/managing-dependencies-and-artifacts-in-pyspark-7641aa89ddb7
> > > > > > sounds like a decent idea, comment if you have a better idea
> > > > > > please).
> > > > >
> > > > > So far I hope it makes sense.
> > > > >
> > > > > The next item I want to discuss is as follows:
> > > > > > In the new architecture, we do hierarchical runtime environment
> > > > > > resolution, starting at the top job level and drilling down to
> > > > > > the action level, outputting one unified environment
> > > > > > configuration file that is deployed to the executor.
> > > > > > I suggest doing the same with dependencies.
> > > > > > Currently, we only have job level dependencies. I suggest that we
> > > > > > provide action level dependencies and resolve them in exactly the
> > > > > > same manner as we resolve the environment.
> > > > > There should be quite a few benefits for this approach:
> > > > >
> > > > > >    1. It will give the option to have different versions of the
> > > > > >    same package in different actions. This is especially important
> > > > > >    if you have 2+ pipeline developers working independently, as it
> > > > > >    would reduce the integration costs by letting each action be
> > > > > >    more self-contained.
> > > > > >    2. It should lower the startup time per action. The more
> > > > > >    dependencies you have, the longer it takes to resolve and
> > > > > >    install them. Actions will no longer get any unnecessary
> > > > > >    dependencies.
> > > > >
> > > > >
> > > > > What do you think? Does it make sense?
> > > > >
> > > > > Cheers,
> > > > > Nadav
> > > > >
> > > >
> > > >
> > > > --
> > > > Yaniv Rodenski
> > > >
> > > > +61 477 778 405
> > > > ya...@shinto.io
> > > >
> > >
> >
> >
> > --
> > Yaniv Rodenski
> >
> > +61 477 778 405
> > ya...@shinto.io
> >
>


--
Yaniv Rodenski

+61 477 778 405
ya...@shinto.io
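
For reference, the job level to action level resolution Nadav describes in the quoted thread, followed by generating a requirements.txt for the Python provider, could be sketched roughly as below. This is only an illustration and not Amaterasu code: the function names, the dict-of-version-pins representation, and the rule that an action level pin overrides the job level pin for the same package are all assumptions made for the example.

```python
from typing import Dict


def resolve_dependencies(job_deps: Dict[str, str],
                         action_deps: Dict[str, str]) -> Dict[str, str]:
    """Merge job level and action level pins (hypothetical helper).

    Assumption: resolution mirrors the environment resolution described
    in the thread, so the more specific (action) level wins on conflict.
    """
    resolved = dict(job_deps)      # start from the job level pins
    resolved.update(action_deps)   # action level pins override them
    return resolved


def render_requirements(deps: Dict[str, str]) -> str:
    """Render resolved pins as requirements.txt content (hypothetical helper).

    An empty version string means "no pin" and emits the bare package name.
    """
    lines = [
        f"{name}=={version}" if version else name
        for name, version in sorted(deps.items())
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    job = {"numpy": "1.15.0", "requests": "2.19.1"}
    action = {"numpy": "1.14.5"}  # this action needs an older numpy
    print(render_requirements(resolve_dependencies(job, action)), end="")
```

A regular Python runner could then pass the generated file to pip install -r, while a pyspark runner would still need its own packaging step before spark-submit.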
