Hi Yaniv,

I'm happy to take on the task of moving Mesos to use Docker containers.

Cheers,
Yariv

On Tue, Oct 23, 2018 at 5:13 PM Yaniv Rodenski <ya...@shinto.io> wrote:

> Thanks, Kirupa,
>
> I'll create the JIRA tasks shortly and assign that one to you.
>
> On Tue, Oct 23, 2018 at 5:09 PM Kirupa Devarajan <kirupagara...@gmail.com>
> wrote:
>
> > Hi Yaniv,
> >
> > I am happy to pick up the following task:
> >
> > 1. Add to the JobManager the functionality to read action level
> >    dependencies
> >
> > Regards,
> > Kirupa
> >
> > On Tue., 23 Oct. 2018, 11:04 am Yaniv Rodenski, <ya...@shinto.io> wrote:
> >
> > > Hi Nadav,
> > >
> > > It does make sense. In fact, we already have action level resources;
> > > however, they are limited to the configuration files for the
> > > container.
> > > I also think that we need to revise the way we set those up. Currently
> > > we use YARN/Mesos to copy dependencies to the containers. With YARN 3.0
> > > I think it makes sense to move to Docker as the way to manage resources
> > > in the containers.
> > > This should also have performance benefits and will make life easier
> > > (I hope) when we start working on K8s.
> > >
> > > To do this, I think we need to add the following tasks:
> > > 1. Add to the JobManager the functionality to read action level
> > >    dependencies
> > > 2. Move from Mesos/YARN containers to Docker (probably at least two
> > >    tasks)
> > >
> > > I'll add them to JIRA asap, for version 0.2.1-incubating if everyone
> > > is OK with it.
> > >
> > > On Sat, Oct 20, 2018 at 6:43 PM Nadav Har Tzvi <nadavhart...@gmail.com>
> > > wrote:
> > >
> > > > Hey everyone,
> > > >
> > > > Yaniv and I were just discussing how to resolve dependencies in the
> > > > new frameworks architecture and integrate the dependencies with the
> > > > concrete cluster resource manager (Mesos/YARN).
> > > > We rolled with the idea of each runner (or base runner) performing
> > > > the dependency resolution on its own.
> > > > So for example, the Spark Scala runner would resolve the required
> > > > JARs and do whatever it needs to do with them (e.g. spark-submit
> > > > --jars --packages --repositories, etc).
> > > > The base Python provider will resolve dependencies and dynamically
> > > > generate a requirements.txt file that will be deployed to the
> > > > executor.
> > > > The handling of the requirements.txt file differs between different
> > > > concrete Python runners. For example, a regular Python runner would
> > > > simply run pip install, while the pyspark runner would need to
> > > > rearrange the dependencies in a way that would be acceptable to
> > > > spark-submit (
> > > > https://bytes.grubhub.com/managing-dependencies-and-artifacts-in-pyspark-7641aa89ddb7
> > > > sounds like a decent idea, comment if you have a better idea please).
> > > >
> > > > So far I hope it makes sense.
> > > >
> > > > The next item I want to discuss is as follows:
> > > > In the new architecture, we do hierarchical runtime environment
> > > > resolution, starting at the top job level and drilling down to the
> > > > action level, outputting one unified environment configuration file
> > > > that is deployed to the executor.
> > > > I suggest doing the same with dependencies.
> > > > Currently, we only have job level dependencies. I suggest that we
> > > > provide action level dependencies and resolve them in exactly the
> > > > same manner as we resolve the environment.
> > > > There should be quite a few benefits to this approach:
> > > >
> > > > 1. It will give the option to have different versions of the same
> > > >    package in different actions. This is especially important if you
> > > >    have 2+ pipeline developers working independently; it would
> > > >    reduce the integration costs by letting each action be more
> > > >    self-contained.
> > > > 2. It should lower the startup time per action. The more
> > > >    dependencies you have, the longer it takes to resolve and install
> > > >    them. Actions will no longer get any unnecessary dependencies.
> > > >
> > > > What do you think? Does it make sense?
> > > >
> > > > Cheers,
> > > > Nadav
> > >
> > > --
> > > Yaniv Rodenski
> > >
> > > +61 477 778 405
> > > ya...@shinto.io
>
> --
> Yaniv Rodenski
>
> +61 477 778 405
> ya...@shinto.io
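
For reference, the hierarchical resolution Nadav describes (job level first, then action level, producing one file for the executor) could look roughly like the sketch below. This is only an illustration under assumptions, not the project's actual code: the function names, the dict-of-pins input format, and the rule that the action level overrides the job level on conflict are all hypothetical.

```python
# Hypothetical sketch: none of these names come from the actual code base.

def resolve_dependencies(job_deps, action_deps):
    """Merge job level and action level pins.

    Assumption: the action level overrides the job level on conflict,
    mirroring the hierarchical environment resolution described above.
    """
    resolved = dict(job_deps)      # start from the job level
    resolved.update(action_deps)   # drill down: action level wins
    return resolved


def to_requirements_txt(resolved):
    """Render the merged pins as requirements.txt content (sorted for stable output)."""
    lines = [f"{name}=={version}" for name, version in sorted(resolved.items())]
    return "\n".join(lines) + "\n"


# Illustrative package pins only
job_deps = {"numpy": "1.15.2", "pandas": "0.23.4"}
action_deps = {"pandas": "0.22.0", "requests": "2.19.1"}

resolved = resolve_dependencies(job_deps, action_deps)
requirements = to_requirements_txt(resolved)
print(requirements)
```

A regular Python runner could then hand the generated file to pip install -r, while the pyspark runner would still need to repackage the same set for spark-submit.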