Changing the topic to follow the subject.

[tl;dr] it's time to rearchitect container images to stop including config-time-only bits (puppet et al), which are not needed at runtime and pose security issues, like CVEs, that have to be maintained daily.

Background:
1) For the Distributed Compute Node (DCN) edge case, there are potentially tens of thousands of single-compute-node remote edge sites connected over WAN to a single control plane, with high latency (around 100 ms) and limited bandwidth. Reducing the base layer size becomes a decent goal there. See the security background below.

2) For the generic security (Day 2, maintenance) case: when puppet/ruby/systemd/name-it gets a CVE fixed, the base layer has to be updated, all layers on top have to be rebuilt, all of those layers have to be re-fetched by cloud hosts, and all containers have to be restarted... And all of that because of some fixes that have nothing to do with OpenStack. The same applies to the remote edge sites; remember the "tens of thousands", high latency and limited bandwidth?..

3) TripleO CI updates packages (including puppet*) in the containers, not in their common base layer. So each CI job has to update puppet* and its dependencies, ruby/systemd, as well. Reducing the number of packages to update for each container makes sense for CI too.
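To make the scale of point 2) concrete, here is a rough back-of-envelope sketch. The site count and layer size below are illustrative assumptions (the thread only says "tens of thousands" of sites; the 500 MB figure is hypothetical), not measured values:

```python
# Illustrative assumptions only: ~10,000 DCN edge sites and a 500 MB
# base layer that must be re-fetched over WAN after a puppet/ruby/systemd
# CVE fix that has nothing to do with OpenStack itself.
sites = 10_000
base_layer_mb = 500  # assumed base layer size, in MB

# Total fleet-wide WAN transfer caused by a single unrelated CVE update.
refetch_tb = sites * base_layer_mb / 1_000_000
print(f"{refetch_tb:.1f} TB re-fetched fleet-wide per base-layer CVE fix")
```

At that scale every megabyte shaved off the base layer saves roughly 10 GB of aggregate WAN transfer per update, which is why trimming config-time-only packages pays off.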

Implementation related:

WIP patches [0],[1] are up for early review; they use a config "pod" approach and do not require maintaining two separate sets of images (config vs. runtime). Future work: a) cronie requires systemd, so we'd want to move that off the base layer as well. b) rework docker-puppet.py to use podman pods instead of --volumes-from a sidecar container (that can't be backported to Queens then, though support for the Edge/DCN case there would still be nice to have, perhaps downstream only).

Some questions raised on IRC:

Q: does having a service be able to configure itself really need to involve a separate pod? A: Highly likely yes; removing non-runtime things is a good idea, and pods are an established PaaS paradigm already. That will require some changes in the architecture though (see the topic with WIP patches).

Q: that's (fetching a config container) actually more data to download than otherwise. A: It's not, if you think of Day 2, when the base layer and all top layers have to be re-fetched whenever some CVE unrelated to OpenStack gets fixed there for ruby/puppet/systemd. Avoiding the need to restart service containers because of those minor updates being pushed is also a nice thing.
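A sketch of the trade-off that answer describes, per edge host per year. All sizes and the CVE cadence are hypothetical assumptions for illustration; the point is only the shape of the comparison:

```python
# Hypothetical numbers: yearly WAN transfer per edge host for
# puppet/ruby/systemd CVE fixes that do not touch OpenStack itself.
cves_per_year = 6     # assumed cadence of such CVE updates
base_delta_mb = 200   # assumed re-fetched base-layer data per update
config_delta_mb = 50  # assumed config-sidecar image delta per update

# Status quo: the shared base layer changes, so its delta is re-fetched
# and every service container on the host is restarted.
status_quo_mb = cves_per_year * base_delta_mb

# Config-pod split: only the config sidecar image changes; runtime
# service containers are neither re-pulled nor restarted.
config_pod_mb = cves_per_year * config_delta_mb

print(f"status quo: {status_quo_mb} MB/yr, config pod: {config_pod_mb} MB/yr")
```

Under these assumptions the one-time cost of fetching the extra config image is amortized quickly, and the runtime containers stay untouched between OpenStack-relevant updates.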

Q: the best solution here would be using packages on the host, generating the config files on the host, and then having an all-in-one container for all the services, which lets them run in an isolated manner. A: I think for Edge cases that's a no-go, as we might want to consider tiny, low-footprint OS distros like the formerly-known Container Linux or Atomic. Also, an all-in-one container looks like an anti-pattern carried over from the world of VMs.

[0] https://review.openstack.org/#/q/topic:base-container-reduction
[1] https://review.rdoproject.org/r/#/q/topic:base-container-reduction

Here is a related bug [0] and implementation [1] for that. PTAL folks!

[0] https://bugs.launchpad.net/tripleo/+bug/1804822
[1] https://review.openstack.org/#/q/topic:base-container-reduction

Let's also think of removing puppet-tripleo from the base container.
It really brings the world in (and yum updates in CI!) for each job and each container! So if we did so, we should then either install puppet-tripleo and co. on the host and bind-mount it for the docker-puppet deployment task steps (a bad idea IMO), OR use the magical --volumes-from <a-sidecar-container> option to mount volumes from some "puppet-config" sidecar container inside each of the containers launched by the docker-puppet tooling.

On Wed, Oct 31, 2018 at 11:16 AM Harald Jensås <hjensas at redhat.com> wrote:
We add this to all images:

https://github.com/openstack/tripleo-common/blob/d35af75b0d8c4683a677660646e535cf972c98ef/container-images/tripleo_kolla_template_overrides.j2#L35

/bin/sh -c yum -y install iproute iscsi-initiator-utils lvm2 python
socat sudo which openstack-tripleo-common-container-base rsync cronie
crudini openstack-selinux ansible python-shade puppet-tripleo python2-
kubernetes && yum clean all && rm -rf /var/cache/yum
(transaction size: 276 MB)

Is the additional 276 MB reasonable here?
openstack-selinux <- this package runs relabeling; does that kind of
touching of the filesystem impact the image size due to docker layers?

Also: python2-kubernetes is a fairly large package (18,007,990 bytes, ~18 MB). Do we use that in every image? I don't see any tripleo-related repos importing from it when searching on Hound. The original commit message [1] adding it states it is for future convenience.

On my undercloud we have 101 images; if we are downloading an extra 18 MB
per image, that's almost 1.8 GB for a package we don't use. (I hope it's
not like this? With docker layers, do we only download that 276 MB
transaction once? Or?)


[1] https://review.openstack.org/527927



--
Best regards,
Bogdan Dobrelya,
Irc #bogdando



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
