On 02/10/2015 10:05 PM, Roman Shaposhnik wrote:
On Tue, Feb 10, 2015 at 6:00 PM, RJ Nowling <[email protected]> wrote:
Can we articulate the value of packages over tarballs? In my view, packages
are useful for managing dependencies and in-place updates.
In my view, packages are the only way to get into traditional IT deployment
infrastructures. These are the same infrastructures that don't want to touch
Ambari at all, since they are all standardized on Puppet/Chef and traditional
Linux packaging.
There are still quite a few of them out there, despite all the push from
Silicon Valley to move everybody to things like Docker, etc.
+1.
I like Docker and it is a very nice project, but it is not going to be
an end in itself.
Companies will continue to run a variety of hosts, ranging from bare metal
to different cloud providers (SaaS, PaaS, ...), Docker included.
Aside from that, using packages provides so many benefits over tarballs:
* Packages have metadata, so I know which file belongs where and how, and
what version it is.
* All the dependencies are specified in the package, which makes it easier
to reuse, even across Dockerfiles. This includes system dependencies as
well (e.g., who depends on psmisc? Why? Can it be removed now that we
have updated Apache Hadoop?).
* It enables us to respect the Single Responsibility Principle and to
satisfy everyone, folks using bare metal as well as users of cloud technologies.
* Some patches may still need to be applied for compatibility or build
reasons. Using packages makes that easier.
* It provides deep integration with the system, so "it just works":
users are created, init scripts and alternatives are set up, and everything
has the right permissions.
* It makes it dead easy to build multiple variants of the same image,
since everything is pulled and set up correctly. If I were to unpack
tarballs instead, I would have to take care of all that manually, and the
result would take a lot more space than the package equivalent unless I
spent a lot of time deleting the internal parts of each component. Example:
I want only the Hadoop client and FUSE for a variant (see the sketch after
the note below).
Note that this could be done with tarballs as well, but it would require a
lot of duplicated command lines and trial and error, and it wouldn't be as
maintainable.
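To make the variant example concrete, here is a minimal sketch of what such
a Dockerfile could look like. It assumes a yum-based base image, a Bigtop
package repository already published somewhere, and package names like
hadoop-client and hadoop-hdfs-fuse; treat all of those as illustrative
assumptions rather than the exact recipe we would ship:

    FROM centos:6
    # Assumption: bigtop.repo points at a repository serving the
    # Bigtop-built packages for this OS.
    ADD bigtop.repo /etc/yum.repos.d/bigtop.repo
    # One line pulls the client bits plus FUSE and all of their
    # dependencies; users, alternatives and permissions are handled
    # by the packages themselves.
    RUN yum install -y hadoop-client hadoop-hdfs-fuse && yum clean all

The tarball equivalent would have to download, unpack, prune and wire up
each component by hand in every variant's Dockerfile.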
In conclusion, even if Apache Bigtop were to focus on Docker, building
packages would be much better than dropping them and moving toward a
'tarball' approach. Packages would not only be more maintainable and
satisfy more use cases, but would also provide an abstraction layer, so
the Dockerfiles could focus on the image itself instead of setting up
the various combinations of Apache Hadoop components.
From a 10,000 ft view, and in broad strokes, Docker is not much different
from Vagrant or BoxGrinder. For those tools, having the recipes use the
packages simplified a lot of things, and I don't see why it would be
different with Docker.
Related question: what are Bigtop's goals? Just integration testing?
A full-blown distro targeted at end users? Packaging for others to build
distros on top of?
All of the above? ;-) Seriously, I think we need to provide a way for consumers
of big data technology to deploy it in the most efficient way. This means
that we are likely to need to embrace different ways of packaging our stuff.
Thanks,
Roman.
+1 again
Another way to put it: the goal is to make the Apache Hadoop ecosystem usable.
That includes making it consumable as well as verifying that it all
works together.
Packages have been the main way to consume such artifacts, but we have
always been open to other ways (see Vagrant and BoxGrinder). At some point
we even had a kickstart image for building bootable USB keys with an
out-of-the-box working Apache Hadoop environment :)
If packages become obsolete tomorrow, I don't see why we could not drop
them. But I think we are still far from that.
Thanks,
Bruno