TL;DR - I can’t even summarize this, but it’s worth reading!

First, I’ll start this off with an apology - I intended to draft an OKD 4
proposal many months ago, but I kept pushing it back to fix “just one more
bug”, and as a result there’s been a real gap in regular summarization
across the project.  While I have talked to many community members
one-on-one, and many of us interact with each other on GitHub and on Slack
and at conferences, I was remiss in highlighting and concentrating the
roadmap, design, and iteration proposals for a large chunk of the last 6
months and I’ll do my best to rectify that starting now.

OKD 3.11 has been out since the fall, and is still getting fixes. It should
be no surprise to folks on this list that the acquisition of CoreOS last
spring triggered a rethink / re-imagining of what OpenShift could / should
be.  There was broad agreement that we’ve all been doing Kubernetes The
Hard Way™ (even the cloud providers) since the early days of Kube. Some of
these hard things we accepted because Kubernetes was moving so fast.

But Kubernetes is maturing.  The code base is moving from a monorepo to a
much larger set of individual services and extensions.  The ecosystem on
top of Kubernetes is what is now innovating at a rapid pace. Contributors
from both CoreOS and OpenShift asked what a v2 of Tectonic and what a v4 of
OpenShift would look like if:


   - we built a platform anchored around Kubernetes
   - that allowed us to rapidly include and support the innovation in the
     broader ecosystem
   - all the way down to the operating system
   - that informed the evolution of operators (the natural way to extend
     Kubernetes)


That took longer than anticipated.  Many of those pieces were big bets that
we weren’t positive could be well integrated, and if you’ve been following
along in the nearly one hundred repos that make up OKD, you know that some
of those pieces reached maturity only in the last month or so. Some of the
aspects of Tectonic which weren’t open source weren’t immediately replaced,
and, as we evolved the initial CoreOS operating system vision, it wasn’t
clear whether the OS would be Fedora, or RHEL, or something in between.
Much of the change happened in the open, but not all of the high-level
planning and debate did.

I’m sorry for that. I will make a concerted effort to summarize what is
going on and what to expect more regularly, and also do more to move those
discussions into the broader forums rather than stay in specific scopes or
specific channels.

---

So - where are we now (June 2019), and where do we go from here?

The first question is philosophy - what sort of shared goals should we
define for OKD?

I personally feel strongly that the CoreOS mission - secure the internet
with up-to-date open source software - continues to be more relevant with
each passing year. As a contributor to Kubernetes, I know how difficult
keeping an up-to-date version of it can be. As a personal computing user,
I’m horrified at the insecurity of our hardware, our software, and the
services and clouds we use.  And it’s not going to get better unless we
make a concerted effort to fix it.

The OKD mission has always been to create a developer and operations
friendly Kubernetes distribution that isn’t afraid to be opinionated in
order to make running software easier.  That opinionation started with
development tools on top of Kubernetes and security underneath. But I think
we should now take the next step - strengthen our opinions on how we ship
and update (continuously!) and how we run the platform itself.  Not just to
deliver new features, but to deliver fixes and plug security holes.

So I would propose our goal for OKD 4 to be:

***

The perfect Kubernetes distribution for those who want to continuously be
on the latest Kubernetes and ecosystem components. It should combine an
up-to-date OS, the Kubernetes control plane, and a large number of
ecosystem operators to provide an easy-to-extend distribution of Kubernetes
that is always on the latest released version of ecosystem tools.

***

Does that resonate with others?  What other goals do people believe in and
are willing to support with their time and effort?

---

The second step (if we agree with the philosophy) is to articulate the
choices that would describe Kubernetes The Probably Better Than Before Way:


*** First, Kubernetes is the best platform for running distributed apps,
and since we believe that, we should use it to run Kubernetes itself.

You should never have to:


   - Restart your control plane by SSHing to a machine
   - Remember which components are running as a system service vs as a pod
   - Orchestrate pods + node services
   - Take downtime during a control plane reconfiguration


This ensures that the platform benefits from things we add (resiliency,
debuggability, observability), and makes the platform (which HAS to run
successfully 100% of the time) the best possible canary for when people
introduce dumb bugs into Kubernetes that break that promise.  If a CI job
fails because a control plane upgrade doesn’t maintain perfect
availability, the contributor who made the change should be able to see
that right there, before they merge their code.

Running Kube native things natively means operators, and there were a lot
of tricky problems to solve - how do you run an operator that runs the
control plane that happens to run the operator?  How do you recover when a
node crashes? How do you manage etcd as a container? How do you get each
component to have its own operator, and how do those present a coherent
story? This part of the story took a long time to evolve and is still
evolving, as anyone who has followed the operator SDK, operator lifecycle
manager project, controller-runtime project, or the upstream addons project
knows.
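
As a rough sketch of where that ends up in practice (the resource and
operator names here reflect the current work and could still change), each
platform component is managed by an operator that publishes its health as
an ordinary API object, inspectable with the same tooling you use for your
workloads:

   # list the operators that run the platform, with their reported health
   oc get clusteroperators

   # inspect a single operator's conditions, e.g. the one managing the apiserver
   oc describe clusteroperator kube-apiserver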


*** Second, the node and all the software on it exists solely to run
containers for Kubernetes, and nodes should be an innate property of a
cluster, not an external thing to manage.

You should never have to:


   - Build a golden VM image
   - See the error message “package conflict while updating kubelet”
   - Fix a broken VM on the cloud
   - Try to perform a manual rollback of packages on a host


To that end:


   1. The OS should contain all of the binaries necessary to function as a node
   2. All config on the node should come from the cluster
   3. The cluster decides when the node upgrades
   4. The cluster should be able to deliver updates to the nodes just like
      any other component


To accomplish this, we decided to bring many of the key components of both
Container Linux and Atomic into a harmonious whole.  ostree is really good
at treating general purpose OS packages exactly like immutable content
sets. Ignition is the *best* first boot solution that is consistent across
clouds and metal, and has to be part of the OS to do that job well.  The OS
has to be ready to be a node, programmed to join the cluster, securely and
safely, and once it’s joined it needs to be 100% focused on running
containers. The end result is Fedora CoreOS and RHEL CoreOS - but each one
is slightly different, driven by slightly different use cases, and on a
different schedule.  A key goal is that the cluster controls the software
on the node, so the expectation should be that OKD4 will own the software
on that node from kernel to kubelet.
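
From the administrator’s seat, a hedged sketch of what “the cluster owns
the node” looks like with the current machine-config tooling (resource
names may shift as this matures):

   # the pools of nodes whose OS content and configuration the cluster manages
   oc get machineconfigpools

   # the rendered configuration (files, units, kernel arguments) being applied
   # to those pools
   oc get machineconfigs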

We also knew that we needed to make node lifecycle trivial.  Not easy,
trivial. We added dynamic compute (just like dynamic storage and dynamic
load balancing) by adopting and stabilizing aspects of the Kubernetes
cluster api (specifically the alpha Machine API) to let you quickly and
easily add new nodes, delete them, update them, and in general treat them
like pods.  Treating nodes like pods is AMAZING (replaceable, fungible,
editable, and no big deal) and is life-changing from an operational
perspective. That said, the upstream project is going through a ton of
change and iteration right now, so for the purposes of having something
stable we chose to expose a limited subset of the machine API as an
OpenShift resource and will continue to help steer the upstream in a
direction that keeps that amazing user experience.  Bringing infrastructure
under the control of the cluster has always been at the heart of
Kubernetes, and extending that to nodes is the biggest operational
improvement next to using operators for managing components.
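
To make “treating nodes like pods” concrete, here is a sketch of the kind
of interaction that limited Machine API subset enables (the machineset name
below is purely illustrative; the namespace is the one current clusters
use):

   # the groups of machines the installer created, typically one per failure domain
   oc get machinesets -n openshift-machine-api

   # add a worker by scaling a machineset, exactly like scaling a deployment
   oc scale machineset example-worker-us-east-1a \
       -n openshift-machine-api --replicas=3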


*** Third, we want to have only the configuration that matters (no
pointless choices), make it trivial to change post-install, and make the
config of the cluster Just-Another-API-Object.

You should never have to:


   - SSH to the machine to change the config of the cluster
   - Look in filesystem directories or set filesystem permissions on config
   - Restore anything except your etcd backup
   - Ask what fields are supported on a config change


You should be able to declaratively configure a Kubernetes cluster exactly
like you declaratively configure a Kubernetes application.  If you change
this global configuration (the spec) you should get an update when it’s
been applied (status). The hard choices here were how much config to expose
- I’ll be honest, we may have gotten some of it wrong, and some things that
work in 3.11 won’t work in OKD4.  We’ll definitely need to assess where we
went too far and add back some of that config.

But the new pattern is a lot easier to reason about - config changes become:

   oc apply -f my-infrastructure-config-folder

Or the more interactive:

   oc edit authentication

That’s it!  Instead of 2400 Ansible parameters, or 600 Kubernetes component
flags, we have ~50-60 core parameters on API objects, and a bunch of new
API objects (like machinesets and ingress controllers) that can be adopted
for your own use cases, and ALL of them can be managed with the same tools
you use for your applications.
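
As a sketch of what one of those API objects can look like (the domain
value below is invented, and the exact fields may differ from what finally
ships), the cluster-wide ingress configuration is just a small YAML
document:

   apiVersion: config.openshift.io/v1
   kind: Ingress
   metadata:
     name: cluster
   spec:
     # the wildcard DNS domain used for routes the cluster exposes
     # (the value here is illustrative)
     domain: apps.example.okd.dev

Saved to a file, that goes into the cluster with the same oc apply shown
above.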

Since we planned to use CRDs for this, there was a LOT of work to make
CRDs… well, frankly, be usable.  That included a ton of performance work at
scale, validation on CRDs, getting server-side ‘get’ working, and
supporting OpenAPI on CRDs:

   oc explain authentication

A huge shout-out to everyone who has been involved in CRDs in Kubernetes -
there was a lot of drudge work to get them up to the standards you’d expect
so that they can be used for config, and everyone will benefit from this.


*** Fourth, you have to believe that updates will Just Work in order to
trust turning on automatic updates

You should never have to:


   - Click the upgrade button
   - Decide whether it is safe to upgrade
   - Remember to upgrade to get the latest security patches
   - Take downtime during an upgrade


This is a hard problem, and it involves a lot of practice.  It means better
technology when deploying, but also better testing, better repeatability,
and better standardization of *how* updates are rolled out. It means better
discipline in upstream projects to avoid breaking changes, and the ability
to catch when regressions happen and roll back.  Much of the effort
involved in making updates “just work” comes from a community of users who
are willing to test, offer feedback, and provide fixes. While the three
previous items are a big help, without a community process that encourages
early adoption and incentivizes participation, this is the part most likely
to fail.
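
As a hedged sketch of the day-to-day experience this is aiming for (command
names reflect the current 4.x tooling and could change), the desired
version of the cluster is itself just another object the cluster reconciles
toward:

   # see what version you are on and whether an update is available or rolling out
   oc get clusterversion

   # ask the cluster to move itself to the newest update it knows about
   oc adm upgrade --to-latest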

I think this is the area we have the most risk in achieving as a community,
so experimentation and steady evolution are important.

---

There are other parts of this story, but these four form the core -
operators and self management, machines and CoreOS, a simpler and
internally consistent set of on-cluster configuration, and trustworthy
automatic updates.  I believe that we should inherit and do justice to the
CoreOS vision to have a continuously up to date version of Kubernetes that
brings in the wider ecosystem where you are never afraid to upgrade - the
“cloud service” model of Kubernetes anywhere (any platform, cloud or metal)
that hides the boring details so you can Just Run Software.

---

The work to get from where we are to an OKD4 alpha probably breaks down
into three major parts:

Because the operating system integration is so critical, we need to make
sure that the major components (ostree, ignition, and the kubelet) are tied
together in a CoreOS distribution that can be quickly refreshed with OKD -
the Fedora CoreOS project is close to being ready for integration in our
CI, so that’s a natural place to start.  That represents the biggest
technical obstacle that I’m aware of to get our first versions of OKD4 out
(the CI systems are currently testing on top of RHEL CoreOS but we have
PoCs of how to take an arbitrary ostree based distro and slot it in).

We would also need to define a release process that benefits and reinforces
the continuous model that operators enable.  There have been a lot of
investments made to CI in OKD and I think that can help us deliver on that
fast-update promise. I think we’ve also seen that a distribution is more
than just its kernel, so concrete versions and version numbers are less
meaningful than in the past.  I suspect a date-based release model where
fixes regularly flow in, things are automatically tested, and you can
easily upgrade might better reflect the reality of the environment we live
in than the hard Kubernetes boundaries of the past, as well as help reduce
the time from problem identification to resolution.

Finally, I’m sensitive to feedback I’ve heard from many people who want the
freedom to mix and match, so I think we can err on the side of “making
forks easy” if you want to experiment with your own distribution.  That
matches a similar desire in the Fedora CoreOS working group to target
different use cases, including those that have nothing to do with
Kubernetes - that freedom to have alternatives is part of what I like the
most about open source, and I do think that everyone should have the
opportunity to make those choices differently via tooling.

I will note that until OKD4 alpha is ready, those who are interested can
use https://try.openshift.com to try an OCP4 cluster.  If there are things
that you *don’t* like about OCP4 that OKD4 should learn from, that feedback
is important.

---

This was a long email - sorry again - but I hope it can help start the
discussion about what comes next.  I suggest we gather feedback and
thoughts for a week or so here, and then try to draft something more
concrete.  In the meantime, getting a readout on where Fedora CoreOS
currently stands, thinking about what pieces of the current integration streams
may not make sense for OKD4, and what additional points of view we should
consider before moving forward would all be very concrete steps we could
take.

I’ll be hosting an OpenShift Commons Briefing on June 26th at 9:00 am
Pacific
<https://commons.openshift.org/events.html#event%7Cokd4-road-map-release-update-with-clayton-coleman-red-hat%7C960>
to further explore these topics with the wider community. I hope you’ll
join the conversation and look forward to hearing from others across
the community.  Meeting details here: http://bit.ly/OKD4ReleaseUpdate

Thank you for your continued support and for being a part of the OKD
community,

Clayton Coleman

Kubernetes and OKD contributor