You can easily share the artifacts with a Docker shared volume: in the container, "export M2_HOME=/container/m2/",
followed by "docker build -v ~/.m2:/container/m2 ........ ". This will put the mvn jars onto the host rather than into the guest container, so that they persist.

On Thu, Jun 18, 2015 at 5:32 PM, Olaf Flebbe <[email protected]> wrote:
> Thanks Nate
>
> for this focused writeup!
>
> Yeah, maybe it is time to reboot our brains ...
>
> In addition to Nate's points, I would like to attack this in bigtop 1.1.0:
>
> …………..
> Building from source or downloading?
> ……………
>
> However, we have a substantial problem hidden deep in the CI „2.0“ approach using containers.
>
> You may know that we place artifacts (i.e. jars) we built with bigtop into the local maven cache ~/.m2 (look for mvn install in do-component-build). The idea is that later maven builds will pick these artifacts up and use them rather than downloading them from maven central.
>
> Placing artifacts into ~/.m2 will not have any effect if we use CI containers the way we do now: the maven cache ~/.m2 is lost when the container ends.
>
> [This triggered the misfeature in JIRA BIGTOP-1893, BTW: gradle rpm/apt behaved differently from a container build with artifacts from maven central.]
>
> Option 1) Remove mvn install from all do-component-builds
>
> Results:
>
> + We compile projects the way the upstream developer does.
> - Local fixes and configurations will not be propagated.
>
> Questions:
> If we do not try to reuse our build artifacts within compile, we have to ask ourselves "why do we compile projects at all?“
>
> We can build a great test of whether someone else has touched / manipulated the maven central cache if we compare artifacts, but is this really the point of compiling ourselves?
>
>
> Option 2) Use mvn install and reuse artifacts even in containers.
>
> Consequences:
>
> - Containers are not stateless any more.
>
> - We have to add dependencies to CI jobs so they run in order.
>
> - Single components may break the whole compile process.
>
> - Compile does not scale any more.
>
> My opinion:
> The way we do "mvn install“ now, simply tainting the maven cache, does not seem like a controlled way to propagate artifacts to me.
>
> Option 3) Use 1), but reuse artifacts in packages by placing symlinks and dependencies between them.
>
> - Packages will break with subtle problems if we do symlink artifacts from different releases.
>
> ----
> Neither Option 1, Option 2 nor Option 3 seems a clever way to fix the problem. I would like to hear comments regarding this issue:
>
> In my humble opinion we should follow Option 2, with all the grave consequences. But maybe rework mvn install by placing the artifacts with a bigtop-specific name / groupId into the maven cache, and upload them to maven central.
>
> Olaf
>
> On 18.06.2015 at 08:26, [email protected] wrote:
> >
> > Building on conversations pre/during/post ApacheCon and looking at the post-1.0 bigtop focus and efforts, I want to lay out a few things and get people's comments. There seems to be some consensus that the project can look towards serving end application/data developers more going forward, while continuing the tradition of the project's build/pkg/test/deploy roots.
> >
> > I have spent the past couple of months, and heavily the past 3 or so weeks, talking to many different potential end users at meetups, conferences, etc., and also having some great conversations with commercial open source vendors that are interested in what a "future bigtop" can be and what it could provide to users.
> >
> > I believe we need to put some focused effort into a few foundational things to put the project in a position to move faster and attract a wider range of users as well as new contributors.
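As a side note on Olaf's ~/.m2 point above, the cache mechanics can be sketched with a small, self-contained shell example. The groupId/artifactId/version below are illustrative stand-ins (not Bigtop's actual coordinates), and a scratch directory stands in for ~/.m2/repository so this is safe to run anywhere:

```shell
# Simulate the tree that "mvn install" leaves under ~/.m2/repository.
# Layout follows Maven's groupId/artifactId/version convention.
repo="$(mktemp -d)/repository"
mkdir -p "$repo/org/apache/hadoop/hadoop-common/2.7.1"
touch "$repo/org/apache/hadoop/hadoop-common/2.7.1/hadoop-common-2.7.1.jar"

# A later build resolving org.apache.hadoop:hadoop-common:2.7.1 checks
# this local path first and only downloads from maven central on a miss.
# Ending the CI container wipes this tree, which is the problem described.
find "$repo" -name '*.jar'
```

When the tree lives inside a throwaway container, everything under it vanishes with the container, which is exactly why `mvn install` has no lasting effect in the current CI setup.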
> >
> > -----------
> > CI "2.0"
> > -----------
> >
> > The start of this is already underway, based on the work Roman started last year and the continuing effort on the new setup and enhancement of the bigtop AWS infrastructure, which Evans has been pushing along into the 1.0 release. The speed of getting new packages built and up to date needs to increase so releases can happen at a regular clip... even looking towards user-friendly "ad-hoc" bigtop builds where users could quickly choose the 2, 3, 4, etc. components they want and have a stack built around that.
> >
> > Related to this, I am hoping the group can come to some agreement on semver-style versioning for the project post 1.0. I think this could set a path forward for releases that happen faster, while not holding up the whole train if a single "smaller" component has a couple of issues that can't/won't be resolved by the main stakeholders or interested parties in said component. An example might be a new pig or sqoop having issues; the 1.2 release would still go out the door, with 1.2.1 coming days/weeks later once the new pig or sqoop was fixed up.
> >
> > ---------------------------------------------
> > Proper package repository hosting
> > ---------------------------------------------
> >
> > I put together a little test setup based on the 0.8 assets; we can probably build off of that with 1.0, working towards the CI automatically posting nightly (or just-in-time) builds off latest so people can play around.
> > Debs/rpms should be the focal point of output for the project's assets; everything else is additive and builds off of that (i.e. the user who says "I am not a puppet shop so I don't care about the modules... but I do my own automation, and if you point me to some sane repositories I can do the rest myself with a couple of decent getting-started steps").
> >
> > -----------------------------------------------------------------
> > Greatly improving the UX and getting-started content
> > -----------------------------------------------------------------
> >
> > This is the big one: a new website, focused docs and getting-started examples for end users, and other specific content for contributors. I will start putting some cycles into the new website JIRA probably next week, and will try to scoot through it and start posting some working examples for feedback once something basic is in place. For those interested in helping out on doc work and getting-started content, let me know; I am looking at subjects like:
> >
> > - Developer getting started:
> >   - using the packages
> >   - using puppet modules and deployment options
> >   - deploying reference example stacks
> >   - setting up your own big data CI
> >   - etc.
> >
> > - Contributing to Bigtop:
> >   - how to submit your first patch/pull-request
> >   - adding a new component (step by step, canned learning-component example, etc.)
> >   - adding tests to an existing component (steps, canned hello-world example test, etc.)
> >   - writing your own test data generator
> >   - etc.
> >
> > Those are some thoughts and a couple of initial focal areas that are driving my bigtop participation.
> >
> >
> > -----Original Message-----
> > From: Andrew Purtell [mailto:[email protected]]
> > Sent: Tuesday, June 16, 2015 12:02 PM
> > To: [email protected]
> > Cc: [email protected]
> > Subject: Re: Rebooting the conversation on the Future of bigtop: Abstracting the backplane? Containers?
> >
> >> Thanks Andy - I agree with most of your opinions around continuing to build standard packages.. but can you clarify what was offensive? Must be a misinterpretation somewhere.
> >
> > Sure.
> >
> > A bit offensive.
> >
> > "gridgain or spark can do what 90% of the hadoop ecosystem already does, supporting streams, batch, sql all in one" -> This statement deprecates the utility of the labors of the rest of the Hadoop ecosystem in favor of Gridgain and Spark. As a gross generalization it's unlikely to be a helpful statement in any case.
> >
> > It's fine if we all have our favorites, of course. I think we're set up well to empirically determine winners and losers; we don't need to make partisan statements. Those components that get some user interest, in the form of contributions that keep them building and happy in Bigtop, will stay in. Those that do not get the necessary attention will have to be culled out over time when and if they fail to compile or pass integration tests.
> >
> >
> > On Mon, Jun 15, 2015 at 11:42 AM, jay vyas <[email protected]> wrote:
> >
> >> Thanks Andy - I agree with most of your opinions around continuing to build standard packages.. but can you clarify what was offensive? Must be a misinterpretation somewhere.
> >>
> >> 1) To be clear, I am 100% behind supporting the standard hadoop build rpms that we have now. That's the core product and will be for the foreseeable future, absolutely!
> >>
> >> 2) The idea (and it's just an idea I want to throw out, to keep us on our toes) is that some folks may be interested in hacking around, in a separate branch, on some bleeding-edge bigdata deployments which attempt to incorporate resource managers and containers as first-class citizens.
> >>
> >> Again, this is all just ideas - not in any way meant to derail the packaging efforts, but rather to gauge folks' interest level in the bleeding edge: docker, mesos, simplified processing stacks, and so on.
> >>
> >>
> >> On Mon, Jun 15, 2015 at 12:39 PM, Andrew Purtell <[email protected]> wrote:
> >>
> >>>> gridgain or spark can do what 90% of the hadoop ecosystem already does, supporting streams, batch, sql all in one)
> >>>
> >>> If something like this becomes the official position of the Bigtop project some day, then it will turn people off. I can see where you are coming from, I think. Correct me if I'm wrong: we have limited bandwidth, we should move away from Roman et al.'s vision of Bigtop as an inclusive distribution of big data packages, and instead become highly opinionated and tightly focused. If that's accurate, I can sum up my concern as follows: to the degree we become more opinionated, the less we may have to look at in terms of inclusion - both software and user communities. For example, I find the above quoted statement a bit offensive as a participant on not-Spark and not-Gridgain projects. I roll my eyes sometimes at the Docker over-hype. Is there still a place for me here?
> >>>
> >>>
> >>> On Mon, Jun 15, 2015 at 9:22 AM, jay vyas <[email protected]> wrote:
> >>>
> >>>> Hi folks. Every few months, I try to reboot the conversation about the next generation of bigtop.
> >>>>
> >>>> There are 3 things which I think we should consider: a backplane (rather than deploying to machines), the meaning of the term "ecosystem" in a post-spark in-memory apocalypse, and containerization.
> >>>>
> >>>> 1) BACKPLANE: The new trend is to have a backplane that provides networking abstractions for you (mesos, kubernetes, yarn, and so on).
> >>>> Is it time for us to pick a resource manager?
> >>>>
> >>>> 2) ECOSYSTEM?: Nowadays folks don't necessarily need the whole hadoop ecosystem, and there is a huge shift to in-memory, monolithic stacks happening (i.e. gridgain or spark can do what 90% of the hadoop ecosystem already does, supporting streams, batch, sql all in one).
> >>>>
> >>>> 3) CONTAINERS: We are doing a great job with docker in our build infra. Is it time to start experimenting with running docker tarballs?
> >>>>
> >>>> Combining 1+2+3, I could see a useful bigdata upstream distro which (1) just installed an HCFS implementation (gluster, HDFS, ...) alongside, say, (2) mesos as a backplane for the tooling for [[ hbase + spark + ignite ]] --- and then (3) did the integration testing of available mesos-framework plugins for ignite and spark underneath. If other folks are interested, maybe we could create the "1x" or "in-memory" branch to start hacking on it sometime? Maybe even bring the Flink guys in as well, as they are interested in bigtop packaging.
> >>>>
> >>>> --
> >>>> jay vyas
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>>
> >>> - Andy
> >>>
> >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
> >>
> >>
> >> --
> >> jay vyas
> >
> >
> > --
> > Best regards,
> >
> > - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

--
jay vyas
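For reference, the ~/.m2 volume-sharing workaround suggested at the top of the thread can be sketched as a dry run (nothing here actually invokes Docker). The image name "bigtop/build-image" and the build command are illustrative placeholders; note that in stock Docker it is `docker run`, not `docker build`, that accepts `-v` bind mounts, and mounting over /root/.m2 targets Maven's default local-repository location so no M2_HOME juggling is needed when the build runs as root:

```shell
# Compose (but do not execute) a docker invocation that persists the
# Maven cache across container builds by bind-mounting the host's
# ~/.m2 over the container's /root/.m2.
# "bigtop/build-image" and "./gradlew rpm" are illustrative placeholders.
m2_host="$HOME/.m2"
cmd="docker run --rm -v ${m2_host}:/root/.m2 bigtop/build-image ./gradlew rpm"

# Dry run: print the command so the sketch works without Docker installed.
echo "$cmd"
```

With this kind of mount, artifacts that `mvn install` writes inside the container land on the host and survive the container's exit, which is the persistence property the thread is after.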
