Re: BIGTOP-1x branch.. Do we need multitenancy systems?

2015-02-11 Thread Bruno Mahé

On 02/10/2015 10:05 PM, Roman Shaposhnik wrote:

On Tue, Feb 10, 2015 at 6:00 PM, RJ Nowling rnowl...@gmail.com wrote:

Can we articulate the value of packages over tarballs?  In my view, packages
are useful for managing dependencies and in-place updates.

In my view packages are the only way to get into the traditional IT deployment
infrastructures. These are the same infrastructures that don't want to touch
Ambari at all, since they are all standardized on Puppet/Chef and traditional
Linux packaging.

There are quite a few of them out there still, despite all the push from
Silicon Valley to get everybody onto things like Docker, etc.


+1.

I like Docker, and it is a very nice project, but it is not going to be
an end in itself.
Companies will continue to run on a variety of hosts, from bare metal to
different cloud providers (SaaS, PaaS...), Docker included.


Aside from that, using packages provides many benefits over tarballs:
* Packages carry metadata, so I know which file belongs where, how, and
in which version
* All the dependencies are specified in the package, which makes it easier
to reuse, even across Dockerfiles. This includes system dependencies as
well (e.g., who depends on psmisc? Why? Can it be removed now that we have
updated Apache Hadoop?)
* It enables us to respect the Single Responsibility Principle and to
satisfy everyone: folks using bare metal as well as users of cloud
technologies
* Some patches may still need to be applied for compatibility or build
reasons. Using packages makes that easier
* It provides deep integration with the system, so it just works.
Users are created, init scripts and alternatives are set up, and everything
has the right permissions...
* It makes it dead easy to build multiple variants of the same image,
since everything is pulled in and set up correctly. If I were to unpack
tarballs instead, I would have to take care of all that manually, and
the result would take a lot more space than the package equivalent unless I
spent a lot of time deleting internal parts of each component. Example:
I want only the Hadoop client and FUSE for a variant.


Note that all of this could also be done with tarballs, but it would
require a lot of duplicated command lines and trial and error, and it
wouldn't be as maintainable.
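
To make the psmisc question above concrete, here is a minimal Python sketch of the kind of reverse-dependency lookup that package metadata makes possible. The `REQUIRES` map and its edges are made up for illustration and are not Bigtop's actual dependency data; on a real RPM system the same question is answered by `rpm -q --whatrequires psmisc`.

```python
# Hypothetical package metadata: package name -> packages it requires.
# These names and edges are illustrative only, not real Bigtop metadata.
REQUIRES = {
    "hadoop": {"bigtop-utils", "psmisc"},
    "hadoop-hdfs": {"hadoop"},
    "hadoop-client": {"hadoop", "hadoop-hdfs"},
    "hbase": {"hadoop-client", "zookeeper"},
    "zookeeper": {"bigtop-utils"},
}


def direct_dependents(package, requires=REQUIRES):
    """Return the sorted names of packages that directly require `package`."""
    return sorted(name for name, deps in requires.items() if package in deps)


def can_remove(package, requires=REQUIRES):
    """A package can be dropped once nothing declares a dependency on it."""
    return not direct_dependents(package, requires)


if __name__ == "__main__":
    print("psmisc is required by:", direct_dependents("psmisc"))
    print("safe to remove psmisc:", can_remove("psmisc"))
```

With tarballs there is no such declared metadata to query, so the same answer has to come from reading scripts and documentation by hand.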


In conclusion, even if Apache Bigtop were to focus on Docker, building
packages would be much better than dropping them and moving toward a
'tarball' approach. Packages would not only be more maintainable and
satisfy more use cases, but would also provide an abstraction layer so
that the Dockerfiles could focus on the image itself instead of setting up
the various combinations of Apache Hadoop components.
From a 10,000 ft view, and in broad strokes, Docker is not much
different from Vagrant or BoxGrinder. For those tools, having the recipe
use the packages simplified a lot of things, and I don't see why
it would be different with Docker.
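
As a rough sketch of that abstraction layer, the Python below renders one Dockerfile per image variant from nothing more than a list of package names. The variant names, package names, base image, and repo file name are all assumptions made for illustration; this is not an existing Bigtop tool.

```python
# Hypothetical image variants: variant name -> packages to install.
# Package names are assumed for illustration.
VARIANTS = {
    "client-fuse": ["hadoop-client", "hadoop-hdfs-fuse"],
    "datanode": ["hadoop-hdfs-datanode"],
}


def render_dockerfile(packages, base_image="centos:6", repo_file="bigtop.repo"):
    """Render a Dockerfile that installs `packages` from a package repo.

    `base_image` and `repo_file` are placeholder defaults; the repo file is
    assumed to sit next to the generated Dockerfile.
    """
    if not packages:
        raise ValueError("a variant needs at least one package")
    lines = [
        f"FROM {base_image}",
        f"COPY {repo_file} /etc/yum.repos.d/{repo_file}",
        "RUN yum install -y " + " ".join(sorted(packages)) + " && yum clean all",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for name, packages in VARIANTS.items():
        print(f"# --- variant: {name} ---")
        print(render_dockerfile(packages))
```

Users, init scripts, alternatives, and permissions never show up in the recipe, because the packages already take care of them; a tarball-based recipe would have to spell each of those steps out per variant.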




Related question: what are BigTop's goals? Just integration testing?
Full blown distro targeted at end users? Packaging for others to build distros 
on top of?

All of the above? ;-) Seriously, I think we need to provide a way for consumers
of bigdata technology to be able to deploy it in the most efficient way. This
means that we are likely to need to embrace different ways of packaging our
stuff.

Thanks,
Roman.

+1 again

Another way to put it is to make the Apache Hadoop ecosystem usable.
That includes making it consumable as well as verifying that it all
works together.
Packages have been the main way to consume such artifacts, but we have
always been open to other ways (see Vagrant and BoxGrinder). At some point
we even had a kickstart image to build bootable USB keys with an
out-of-the-box working Apache Hadoop environment :)


If tomorrow packages become obsolete, I don't see why we could not drop 
them. But I think we are still far from that.



Thanks,
Bruno


Re: BIGTOP-1x branch.. Do we need multitenancy systems?

2015-02-11 Thread Jay Vyas
Makes sense to me; this should help me picture where Bigtop is headed over
the next several months.

So I guess the answer is yes: we still believe in multitenant packaging and
systems.

Thanks for all the feedback!



Re: BIGTOP-1x branch.. Do we need multitenancy systems?

2015-02-10 Thread Konstantin Boudnik
I think betting it all on an overnight shift to microservices might be a bit
too risky. Also, I don't see how packages are the opposite of containers.

On Tue, Feb 10, 2015 at 11:44AM, jay vyas wrote:
 Hi bigtop.  Just a thought... (Thanks @rnowling for providing a seed for
 this discussion this morning)
 
 In BigTop 1x (the experimental branch) we have a chance to do things
 radically differently.
 
 - How long are we planning on actually running bigtop on multitenant
 systems? Months or years?
 - If we move towards individual processes/microservices/containers, who will
 be consuming our RPM/DEB packages?
 
 ...If nobody... then bigtop 1x could purely focus on nothing other than
 - integration testing,
 - cutting edge deployment, and
 - puppet recipes
 
 without worrying at all about packaging details.  There are tools for DNS
 for containers and so on.
 
 
 -- 
 jay vyas



Re: BIGTOP-1x branch.. Do we need multitenancy systems?

2015-02-10 Thread Konstantin Boudnik
On Tue, Feb 10, 2015 at 10:05PM, Roman Shaposhnik wrote:
 On Tue, Feb 10, 2015 at 6:00 PM, RJ Nowling rnowl...@gmail.com wrote:
  Can we articulate the value of packages over tarballs?  In my view, packages
  are useful for managing dependencies and in-place updates.
 
 In my view packages are the only way to get into the traditional IT deployment
 infrastructures. These are the same infrastructures that don't want to touch
 Ambari at all, since they are all standardized on Puppet/Chef and traditional
 Linux packaging.
 
 There are quite a few of them out there still, despite all the push from
 Silicon Valley to get everybody onto things like Docker, etc.

Yeah... it is easy to forget how thin the Silicon Valley hype really is,
and how clueless all the hipster types are about real-life deployment and
the enterprise software life cycle. What I am saying is that Docker might be
completely gone in another 3 or 5 years, yet structured deployment
management will still be around and will barely notice the noise ;)

Cos

  Related question: what are BigTop's goals? Just integration testing?
  Full blown distro targeted at end users? Packaging for others to build 
  distros on top of?
 
 All of the above? ;-) Seriously, I think we need to provide a way for
 consumers of bigdata technology to be able to deploy it in the most
 efficient way. This means
 that we are likely to need to embrace different ways of packaging our stuff.
 
 Thanks,
 Roman.


Re: BIGTOP-1x branch.. Do we need multitenancy systems?

2015-02-10 Thread Andrew Purtell
There is also the matter of ensuring, to the best of the project's ability,
that the binary bits fit together. We do this in the package builds by
specifying dependency versions on the Maven command lines used to build.
Whether the end result is a bunch of jars in a directory, a bunch of
tarballs, or a bunch of deb or rpm archives, most of the effort we have is
not maintaining the packaging - it is figuring out a BOM (bill of materials)
and a set of transitive dependencies that can be compiled together and
successfully deployed.
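
A minimal Python sketch of that idea follows: one BOM, turned into `-D` version overrides on a Maven command line. The property names and version numbers are assumptions chosen for illustration (each component's POM defines its own property names), and this is not Bigtop's actual build code.

```python
import shlex

# Hypothetical BOM: Maven property name -> version to build against.
# Property names and versions are illustrative assumptions.
BOM = {
    "hadoop.version": "2.6.0",
    "zookeeper.version": "3.4.6",
    "hbase.version": "0.98.10",
}


def maven_command(bom, goals=("clean", "install"), extra_flags=("-DskipTests",)):
    """Build an argv list that pins every BOM version via -D overrides."""
    overrides = [f"-D{prop}={version}" for prop, version in sorted(bom.items())]
    return ["mvn", *extra_flags, *overrides, *goals]


if __name__ == "__main__":
    # shlex.join (Python 3.8+) renders the argv as a copy-pasteable command.
    print(shlex.join(maven_command(BOM)))
```

Because every component build would receive the same overrides, the stack is compiled against one consistent set of versions, whatever the final artifact format turns out to be.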

Let me turn this question around and ask: how difficult is it to do the
packaging part _after_ getting all the builds of a BOM to work? I've found
it to be 30% of the work: substantial, but not double the effort.

That said, it could make sense to not worry about building OS-specific
packages. The output of a Bigtop build would then be something like a
directory full of tarballs that can be unpacked in different locations and
combinations but can be assured to work together. Like Cloudera's Parcels,
I suppose.

I also think the current renewed interest in containers is strong, and not
a fad, but this is a pendulum that swings around.


On Tue, Feb 10, 2015 at 8:44 AM, jay vyas jayunit100.apa...@gmail.com
wrote:

 Hi bigtop.  Just a thought... (Thanks @rnowling for providing a seed for
 this discussion this morning)


 In BigTop 1x (the experimental branch) we have a chance to do things
 radically differently.

 - How long are we planning on actually running bigtop on multitenant
 systems? Months or years?
 - If we move towards individual processes/microservices/containers, who
 will be consuming our RPM/DEB packages?

 ...If nobody... then bigtop 1x could purely focus on nothing other than
 - integration testing,
 - cutting edge deployment, and
 - puppet recipes

 without worrying at all about packaging details.  There are tools for DNS
 for containers and so on.


 --
 jay vyas




-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)