On 10/03/2016 6:22 PM, Charles Plessy wrote:
> Le Thu, Mar 10, 2016 at 10:18:30AM +0100, Bastian Blank a écrit :
>> On Wed, Mar 09, 2016 at 11:51:58PM +0900, Charles Plessy wrote:
>>
>>> Maybe this problem can be solved by the use of metapackages ?  With the
>>> exclusion of cloud-init, specialised kernels etc., can we converge on a
>>> metapackage that would represent the most frequent expectations of users of
>>> non-minimal cloud images in Debian and elsewhere ?
>> That's what a task is, a meta-package.
> Since [email protected] was CCed, I thought that "task" was employed to mean "a
> metapackage built from the tasksel source package", not just any metapackage
> in general.
>
> So let me rephrase:
>
> Is the proposal to go through tasksel ?  If yes, what are the expected
> advantages over the use of an ad-hoc metapackage ?
>
> Cheers,
>


All of what people generally want installed in their EC2 instances can
be achieved with a suitable boot-time UserData script that cloud-init
runs. As long as the base image has enough packages to fetch and
install additional ones at boot, meaning it can be configured to reach
a Debian repository by some means (SOCKS, an explicitly defined proxy,
a private repository, or a direct connection) with simple
configuration, then that's easy. So, for example, in an EC2 metadata
environment, setting the UserData to:

  #!/bin/sh
  apt-get update && apt-get install -y less unattended-upgrades


...will do what many users will want quite quickly.
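As a sketch of how such a script might be supplied at launch: the EC2 API carries UserData base64-encoded, and the AWS CLI does the encoding for you when you pass a file:// reference. The AMI ID below is a placeholder, not a real image, and the package name assumed is unattended-upgrades as it appears in the Debian archive.

```shell
# Save the boot-time script to a file.
cat > userdata.sh <<'EOF'
#!/bin/sh
apt-get update && apt-get install -y less unattended-upgrades
EOF

# The AWS CLI base64-encodes the file itself when launching, e.g.:
#   aws ec2 run-instances --image-id ami-XXXXXXXX \
#       --instance-type t2.micro --user-data file://userdata.sh
# (ami-XXXXXXXX is a placeholder, not a real image ID.)

# If you call the EC2 API directly, encode the script yourself:
base64 userdata.sh > userdata.b64
```

The instance then runs the script once, on first boot, via cloud-init's user-data handling.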

I don't think that requires us to create additional base images that
cater for various combinations. Sometimes an instance needs to download
payloads from storage external to it (S3 or otherwise) using a script.
As long as the instance has the necessary tools (curl, wget, awscli in
the case of EC2, plus the underlying SSL libraries that curl and wget
use for HTTPS), it has ample means to deliver secured, authenticated
payloads. If users want a base image for their own purposes, they can
either take a base image, customise it and "create image" from their
running server, or use bootstrap-vz to generate their own images with
their selected base packages.

One item of note that we have included in the AWS EC2 base images is
the apt-transport-https package. This is purely so that, in
environments where the (customer organisational) security policy
forbids outgoing HTTP from an instance but permits (possibly limited)
HTTPS, a simple reconfiguration of the base image's sources.list file
makes this possible without a chicken-and-egg problem.
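To illustrate that reconfiguration (the mirror hostname below is a placeholder, not the one actually shipped in the images), switching the archive lines from HTTP to HTTPS is a one-line edit once apt-transport-https is already present:

```shell
# A sample sources.list entry as it might ship (hypothetical mirror):
echo 'deb http://deb.example.org/debian jessie main' > sources.list

# Rewrite http:// to https:// on the deb lines. With
# apt-transport-https pre-installed, apt can use the HTTPS sources
# immediately; no plain-HTTP fetch is needed first.
sed -i 's|^deb http://|deb https://|' sources.list
```

Without the package pre-installed, you would need HTTP access to install it before apt could speak HTTPS at all, which is exactly the chicken-and-egg problem being avoided.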

Should users pre-bake their images? For large organisations,
corporates, governments, etc. who may do hundreds of launches per day,
possibly in an auto-scaling group: yes, so that the image and all
launch-time dependencies are pre-installed and not subject to services
outside the cloud provider that may be unavailable during a launch.
I've seen people bootstrap live pulls from external revision-control
platforms, and then seen those platforms be down when they've tried to
launch. If you want to rely on the stability and availability of an
image, you should master your own image and maintain it as an artifact
for your environment.

If you're contemplating an ad-hoc metapackage instead, then perhaps
that is something more widely applicable to both cloud and non-cloud
use, such as what tasksel already has as tasks? But if a possible
"cloud" task were to contain utilities for various cloud vendors, then
who decides which vendors and utilities are included and which are not,
and when does a small cloud provider get a task for their cloud
environment and APIs? Or do we make "cloud-aws" and "cloud-azure"
tasks? If we do that, then why not include those utilities in the base
image(s) of those providers to start with? That starts to become
bloatware if I have a package where 90% of the content is for cloud
environments I don't use, and I'm paying for the storage of these
utilities, times many instances. Even a "cloud-aws" task may include
libraries and APIs for stuff I'll never use: the
Go/Python/R/Perl/Java/Node/PHP SDKs, when all I want is less? So
updating the above simple UserData script to use tasksel is also
trivial:

  #!/bin/sh
  apt-get update && tasksel install <task>


My preference is to keep the number of images per release to a minimum,
with small deviations that ensure the "base" image is universally
useful in each provider's environment. We were, for a while, generating
4 images in EC2 per release (multiplied by 12 regions). Now we generate
one 64-bit x86 image and replicate it (PV virtualisation and i386 are
sunsetting on AWS; amd64 HVM is everywhere). If you twist my arm, then
perhaps an additional base image for a new CPU architecture may one day
be needed.


  James

-- 
/Mobile:/ +61 422 166 708, /Email:/ james_AT_rcpt.to
