You can easily share the artifacts with a Docker shared volume: in the container, "export M2_HOME=/container/m2/",
followed by "docker build -v ~/.m2:/container/m2 ........ ". This will put the mvn jars onto the host rather than into the guest container, so that they persist.

On Thu, Jun 18, 2015 at 5:32 PM, Olaf Flebbe <[email protected]> wrote:
> Thanks Nate
>
> for this focused writeup!
>
> Yeah, maybe it is time to reboot our brains ...
>
> In addition to Nate's points, I would like to attack this in bigtop 1.1.0:
>
> …………..
> Building from source or downloading?
> ……………
>
> However, we have a substantial problem hidden deep in the CI „2.0“ approach using containers.
>
> You may know that we place artifacts (i.e. jars) we built with bigtop into the local maven cache ~/.m2 (look for mvn install in do-component-build). The idea is that later maven builds will pick these artifacts up and use them rather than downloading them from maven central.
>
> Placing artifacts into ~/.m2 will not have any effect if we use CI containers the way we do now: the maven cache ~/.m2 is lost when the container ends.
>
> [This triggered the misfeature in JIRA BIGTOP-1893, BTW: gradle rpm/apt behaved differently from a container build with artifacts from maven central.]
>
> Option 1) Remove mvn install from all do-component-builds
>
> Results:
>
> + We compile projects the way the upstream developer does.
> - Local fixes and configurations will not be propagated.
>
> Questions:
> If we do not try to reuse our build artifacts within compile, we have to ask ourselves "why do we compile projects at all?“
>
> We can build a great test of whether someone else has touched / manipulated the maven central cache if we compare artifacts, but is this really the point of compiling ourselves?
>
>
> Option 2) Use mvn install and reuse artifacts even in containers.
>
> Consequences:
>
> - Containers are not stateless any more.
>
> - We have to add dependencies to CI jobs so they run in order.
>
> - Single components may break the whole compile process.
>
> - Compile does not scale any more.
>
> My opinion:
> The way we do "mvn install“ now, simply tainting the maven cache, does not seem like a controlled way to propagate artifacts to me.
>
> Option 3) Use 1), but reuse artifacts in packages by placing symlinks and dependencies between them.
>
> - Packages will break with subtle problems if we do symlink artifacts from different releases.
>
> ----
> Neither Option 1, Option 2 nor Option 3 seems a clever way to fix the problem. I would like to hear comments regarding this issue:
>
> In my humble opinion we should follow Option 2, with all the grave consequences. But maybe rework mvn install by placing the artifacts with a bigtop-specific name / groupId into the maven cache, and upload them to maven central.
>
> Olaf
>
> On 18.06.2015 at 08:26, [email protected] wrote:
> >
> > Building on conversations pre/during/post ApacheCon and looking at the post-1.0 bigtop focus and efforts, I want to lay out a few things and get people's comments. There seems to be some consensus that the project can look towards serving end application/data developers more going forward, while continuing the tradition of the project's build/pkg/test/deploy roots.
> >
> > I have spent the past couple of months, and heavily the past 3 or so weeks, talking to many different potential end users at meetups, conferences, etc., and also having some great conversations with commercial open source vendors that are interested in what a "future bigtop" can be and what it could provide to users.
> >
> > I believe we need to put some focused effort into a few foundational things to put the project in a position to move faster and attract a wider range of users as well as new contributors.
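As a side note on Olaf's ~/.m2 point above, the cache mechanics can be sketched with a small, self-contained shell example. The groupId/artifactId/version below are illustrative stand-ins (not Bigtop's actual coordinates), and a scratch directory stands in for ~/.m2/repository so this is safe to run anywhere:

```shell
# Simulate the tree that "mvn install" leaves under ~/.m2/repository.
# Layout follows Maven's groupId/artifactId/version convention.
repo="$(mktemp -d)/repository"
mkdir -p "$repo/org/apache/hadoop/hadoop-common/2.7.1"
touch "$repo/org/apache/hadoop/hadoop-common/2.7.1/hadoop-common-2.7.1.jar"

# A later build resolving org.apache.hadoop:hadoop-common:2.7.1 checks
# this local path first and only downloads from maven central on a miss.
# Ending the CI container wipes this tree, which is the problem described.
find "$repo" -name '*.jar'
```

When the tree lives inside a throwaway container, everything under it vanishes with the container, which is exactly why `mvn install` has no lasting effect in the current CI setup.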
> >
> > -----------
> > CI "2.0"
> > -----------
> >
> > The start of this is already underway, based on the work Roman started last year and the continuing effort on the new setup and enhancement of the bigtop AWS infrastructure, which Evans has been pushing along into the 1.0 release. The speed of getting new packages built and up to date needs to increase so releases can happen at a regular clip... even looking towards user-friendly "ad-hoc" bigtop builds where users could quickly choose the 2, 3, 4, etc. components they want and have a stack built around that.
> >
> > Related to this, I am hoping the group can come to some agreement on semver-style versioning for the project post 1.0. I think this could set a path forward for releases that happen faster, while not holding up the whole train if a single "smaller" component has a couple of issues that can't/won't be resolved by the main stakeholders or interested parties in said component. An example might be a new pig or sqoop having issues; the 1.2 release would still go out the door, with 1.2.1 coming days/weeks later once the new pig or sqoop was fixed up.
> >
> > ---------------------------------------------
> > Proper package repository hosting
> > ---------------------------------------------
> >
> > I put together a little test setup based on the 0.8 assets; we can probably build off of that with 1.0, working towards the CI automatically posting nightly (or just-in-time) builds off latest so people can play around.
> > Debs/rpms should be the focal point of output for the project's assets; everything else is additive and builds off of that (i.e. the user who says "I am not a puppet shop so I don't care about the modules... but I do my own automation, and if you point me to some sane repositories I can do the rest myself with a couple of decent getting-started steps").
> >
> > -----------------------------------------------------------------
> > Greatly improving the UX and getting-started content
> > -----------------------------------------------------------------
> >
> > This is the big one: a new website, focused docs and getting-started examples for end users, and other specific content for contributors. I will start putting some cycles into the new website JIRA probably next week, and will try to scoot through it and start posting some working examples for feedback once something basic is in place. For those interested in helping out on doc work and getting-started content, let me know; I am looking at subjects like:
> >
> > - Developer getting started:
> >   - using the packages
> >   - using puppet modules and deployment options
> >   - deploying reference example stacks
> >   - setting up your own big data CI
> >   - etc.
> >
> > - Contributing to Bigtop:
> >   - how to submit your first patch/pull-request
> >   - adding a new component (step by step, canned learning-component example, etc.)
> >   - adding tests to an existing component (steps, canned hello-world example test, etc.)
> >   - writing your own test data generator
> >   - etc.
> >
> > Those are some thoughts and a couple of initial focal areas that are driving my bigtop participation.
> >
> >
> > -----Original Message-----
> > From: Andrew Purtell [mailto:[email protected]]
> > Sent: Tuesday, June 16, 2015 12:02 PM
> > To: [email protected]
> > Cc: [email protected]
> > Subject: Re: Rebooting the conversation on the Future of bigtop: Abstracting the backplane? Containers?
> >
> >> Thanks Andy - I agree with most of your opinions around continuing to build standard packages.. but can you clarify what was offensive? Must be a misinterpretation somewhere.
> >
> > Sure.
> >
> > A bit offensive.
> >
> > "gridgain or spark can do what 90% of the hadoop ecosystem already does, supporting streams, batch, sql all in one" -> This statement deprecates the utility of the labors of the rest of the Hadoop ecosystem in favor of Gridgain and Spark. As a gross generalization it's unlikely to be a helpful statement in any case.
> >
> > It's fine if we all have our favorites, of course. I think we're set up well to empirically determine winners and losers; we don't need to make partisan statements. Those components that get some user interest, in the form of contributions that keep them building and happy in Bigtop, will stay in. Those that do not get the necessary attention will have to be culled out over time when and if they fail to compile or pass integration tests.
> >
> >
> > On Mon, Jun 15, 2015 at 11:42 AM, jay vyas <[email protected]> wrote:
> >
> >> Thanks Andy - I agree with most of your opinions around continuing to build standard packages.. but can you clarify what was offensive? Must be a misinterpretation somewhere.
> >>
> >> 1) To be clear, I am 100% behind supporting the standard hadoop build rpms that we have now. That's the core product and will be for the foreseeable future, absolutely!
> >>
> >> 2) The idea (and it's just an idea I want to throw out, to keep us on our toes) is that some folks may be interested in hacking around, in a separate branch, on some bleeding-edge bigdata deployments which attempt to incorporate resource managers and containers as first-class citizens.
> >>
> >> Again, this is all just ideas - not in any way meant to derail the packaging efforts, but rather to gauge folks' interest level in the bleeding edge: docker, mesos, simplified processing stacks, and so on.
> >>
> >>
> >> On Mon, Jun 15, 2015 at 12:39 PM, Andrew Purtell <[email protected]> wrote:
> >>
> >>>> gridgain or spark can do what 90% of the hadoop ecosystem already does, supporting streams, batch, sql all in one)
> >>>
> >>> If something like this becomes the official position of the Bigtop project some day, then it will turn people off. I can see where you are coming from, I think. Correct me if I'm wrong: we have limited bandwidth, we should move away from Roman et al.'s vision of Bigtop as an inclusive distribution of big data packages, and instead become highly opinionated and tightly focused. If that's accurate, I can sum up my concern as follows: to the degree we become more opinionated, the less we may have to look at in terms of inclusion - both software and user communities. For example, I find the above quoted statement a bit offensive as a participant on not-Spark and not-Gridgain projects. I roll my eyes sometimes at the Docker over-hype. Is there still a place for me here?
> >>>
> >>>
> >>> On Mon, Jun 15, 2015 at 9:22 AM, jay vyas <[email protected]> wrote:
> >>>
> >>>> Hi folks. Every few months, I try to reboot the conversation about the next generation of bigtop.
> >>>>
> >>>> There are 3 things which I think we should consider: a backplane (rather than deploying to machines), the meaning of the term "ecosystem" in a post-spark in-memory apocalypse, and containerization.
> >>>>
> >>>> 1) BACKPLANE: The new trend is to have a backplane that provides networking abstractions for you (mesos, kubernetes, yarn, and so on).
> >>>> Is it time for us to pick a resource manager?
> >>>>
> >>>> 2) ECOSYSTEM?: Nowadays folks don't necessarily need the whole hadoop ecosystem, and there is a huge shift to in-memory, monolithic stacks happening (i.e. gridgain or spark can do what 90% of the hadoop ecosystem already does, supporting streams, batch, sql all in one).
> >>>>
> >>>> 3) CONTAINERS: We are doing a great job with docker in our build infra. Is it time to start experimenting with running docker tarballs?
> >>>>
> >>>> Combining 1+2+3, I could see a useful bigdata upstream distro which (1) just installed an HCFS implementation (gluster, HDFS, ...) alongside, say, (2) mesos as a backplane for the tooling for [[ hbase + spark + ignite ]] --- and then (3) did the integration testing of available mesos-framework plugins for ignite and spark underneath. If other folks are interested, maybe we could create the "1x" or "in-memory" branch to start hacking on it sometime? Maybe even bring the Flink guys in as well, as they are interested in bigtop packaging.
> >>>>
> >>>> --
> >>>> jay vyas
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>>
> >>> - Andy
> >>>
> >>> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
> >>
> >>
> >> --
> >> jay vyas
> >
> >
> > --
> > Best regards,
> >
> > - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

--
jay vyas
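For reference, the ~/.m2 volume-sharing workaround suggested at the top of the thread can be sketched as a dry run (nothing here actually invokes Docker). The image name "bigtop/build-image" and the build command are illustrative placeholders; note that in stock Docker it is `docker run`, not `docker build`, that accepts `-v` bind mounts, and mounting over /root/.m2 targets Maven's default local-repository location so no M2_HOME juggling is needed when the build runs as root:

```shell
# Compose (but do not execute) a docker invocation that persists the
# Maven cache across container builds by bind-mounting the host's
# ~/.m2 over the container's /root/.m2.
# "bigtop/build-image" and "./gradlew rpm" are illustrative placeholders.
m2_host="$HOME/.m2"
cmd="docker run --rm -v ${m2_host}:/root/.m2 bigtop/build-image ./gradlew rpm"

# Dry run: print the command so the sketch works without Docker installed.
echo "$cmd"
```

With this kind of mount, artifacts that `mvn install` writes inside the container land on the host and survive the container's exit, which is the persistence property the thread is after.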
