Hi all,
Here I'm putting down my analysis of what was right and what was wrong
in the development model for 1.4 and 1.5, hoping to improve it for 1.6.
[long e-mail, even for one from me].
What worked very well was to have a stable branch in parallel with the
development branch. The problem was that the development branch lasted
forever. I got trapped in 1.4 having to maintain a reasonable quality
during -dev because of people using it in production, and I swore it
wouldn't happen again with 1.5. And it did, to a lesser extent, but it
did, for exactly the same reasons.
What happened is a chicken-and-egg problem. We develop features that
people need. When these features appear, people start to test them and
once the features are stable enough, these people start to use them in
production, and once the features are in production, we need to care a
bit more about not breaking things.
I know that most people are against running development code in
production. But when you look at this from a business perspective, it
actually makes a lot of sense: you have the shiny new feature you needed
immediately available, and you get support for free because you know
that any bug you report will be immediately looked at by the developers.
That's what happened with SSL, it took about one year to completely
stabilize, and it reached a very high quality thanks to all the users
who wanted to use it and reported bugs early. So that's an essential
part of the ecosystem that we must keep going well.
That issue of development quality was partially addressed in 1.5 where
it was explicitly stated that we wouldn't try to keep compatibility with
previous development versions and that each new version could break
things. It seems to have been quite well accepted, especially because
people who run their production on top of development versions generally
have a staging environment to test a new version, and are not the type
of person to be negatively surprised by some breakage.
So what went wrong if everything went well ?
What happened was that I made the same mistake twice. First time it was
in 1.4, and I didn't see the warning and repeated that mistake in 1.5.
The big mistake is that I announced both a set of features and an ETA
for them. 1.4 was supposed to come with keep-alive and be completed in
6 months. And for the first time we received some funding from both
Exceliance and Loadbalancer.org to implement new nice features such as
persistence and RDP (thanks BTW!). There are some core changes that
cannot be done in parallel especially when you start from a design that
has some limits related to its history. So that delayed the keep-alive
development, which finally happened on the client-facing side only (the
most important one at the time). 1.4 was released after 11 months instead
of the expected 6 (but with many cool new features).
I reiterated the mistake for 1.5 by announcing that it would focus on
getting server-side keep-alive done in 6 months. In parallel, while
trying to get a feature done and fixed, we got even more feature
requests and contributions or funding for new lovely features (stick-
table synchronisation, tracking, SSL, compression, improved checks,
etc.). The new features stacked on top of each other, and there was
never a good moment to say "let's stop now", because there is demand,
and that's normal and proof of a healthy project. The 6 initially
expected months slowly became 4 years, with me saying every month "we'll
release once we have end-to-end keep-alive"...
There is a good reason for that mess, one that I learned a long time ago [1]
from Linus himself and that I failed to see coming here in haproxy: you
can't slow down development. In my defense, the project became very popular
between 1.4 and 1.5, attracting more users, feedback and requests.
Granted, now 1.5 is amazing and we all love it. But we cannot continue
with a development process where people have to tell their boss, for 4
years running, that they're using development code in production and that
they have no idea how long this situation will last because the project
maintainer refuses to give a date. End users want to have something to
reassure their boss. They won't ever say "let's wait for haproxy to have
feature XYZ", instead they say "I've tested haproxy which has this nice
feature XYZ that will become stable soon, so let's evaluate it now and
deploy once it's ready". *THAT* is the key.
And it's not even remotely imaginable that we'll start to reject large
contributions from companies who make this project a success.
The solution should be easy because it already works for Linux : we need
to apply a merge window and stop promising new features in advance. In
fact we have the choice between promising a set of features or an ETA, and
given how things went in the past, it's clearly better to let new features
come as a surprise while the ETA is what people can count on.
So how long should a merge window last, and how long should we wait
between two releases ? If we wait too long between two releases, users
will get bored. If we don't leave enough time, we'll have to maintain
more versions in parallel and it will be harder to merge new features.
We're not the Linux kernel project, so we should not expect all code to
be merged during the first two weeks and have 3 months from that point
to fix all the mess. We're developing slowly and debugging slowly. So I
think we should experiment with something like the following for version
1.6 :
- development must stop no later than end of March 2015
- release expected around May or June 2015.
That's roughly one year for a complete version, with about 8-9 months of
development, and it leaves time for users who only use releases to test
during summer and start to deploy in September. I suspect that the merge
of risky features will already slow down during the last month. Whatever
does not work at all by then should simply be reverted and postponed to
the next major version. That also means that the -dev versions should get
closer so that we break less between each version and users can upgrade
with less validation work each time.
Maybe if that cycle goes well we'll be able to speed up for next version,
though I doubt it will be reasonable. Anyway we can experiment with the
development cycle precisely because we have stable branches in parallel.
We'll probably have some topic branches in the future. We used to have
that a few times, the last one being Thierry's ACL rework which was a
success as it lived in parallel to two development versions and resulted
in no less than 53 patches being merged at once when everything got
ready.
We'll also try to improve the project's organization. We currently have
some very talented people doing an amazing job, each with skills in
specific areas. I'd like some of these people to officially take on a
subsystem maintainer's hat. That's already what we're doing with SSL.
Everyone knows I'm a noob at SSL, and whenever we get a patch, I wait
for Emeric's approval. There are already people I trust enough for
merging their patches without even reading them, so let's simplify the
process: have them involved earlier when possible and have their
acks/nacks accepted by everyone. We will also contemplate
establishing a stable team so that I'm not always delaying stable
releases anymore just because I have other things in mind, and setting
up a bug tracking solution. Nothing is settled yet; there have been some
discussions behind the scenes, there's no emergency, and since a number
of participants work in their spare time for free, I will never demand
anything they cannot provide on a voluntary basis. Things will probably
get clearer after the summer holidays.
Concerning the new features, no promises, but we know that we need to
progress in the following areas :
- multi-process : better synchronization of stats and health checks,
and find a way to support peers in this mode. I still very much think
that, with the arrival of latency monsters such as SSL and compression,
we could benefit from a thread-based architecture so that we could
migrate tasks to another CPU when they're going to take a lot of time.
The issue I'm seeing with threads is that
currently the code is highly dependent on being alone to modify any
data. Eg: a server state is consistent between entering and leaving
a health check function. We don't want to start adding huge mutexes
everywhere.
- hot reconfiguration : some users are abusing the reload mechanism to
extreme levels, but that does not invalidate their requirements. And many
other users occasionally need to reload for various reasons, such as
adding a new server or backend for a specific customer. While in the
past it was not possible to change a server address on the fly, we
could now do it easily, so we could think about provisioning a few
extra servers that could be configured at run time to avoid a number
of reloads. Concerning the difficulty of binding the reloaded processes,
Simon had done some work in this area 3 years ago with the master-
worker model. Unfortunately we never managed to stabilize it, because
the internal architecture was hard to adapt and it was taking a lot
of time. It could be one of the options to reconsider though, along
with FD passing across processes. Similarly, persistence of server
state across reloads is often requested and should be explored.
- DNS : some users in EC2 or equivalent environments would benefit a lot
from dynamic DNS updates for their servers' addresses. We know how
to do it and what to do; it was even described on the list, it's just
that nobody has had the time to do it yet. It would save a lot of reloads
for a number of users.
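The core of it is really just periodic re-resolution. A rough sketch
(hypothetical, not haproxy code; the function name and the idea of
calling it from a check timer are mine) :

```c
/* Illustrative sketch, not haproxy code: re-resolving a server's hostname
 * so that address changes (common in EC2-like environments) can be picked
 * up at run time without a reload.  A periodic task would call this and
 * update the server's address in place when the result changes. */
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>

/* Resolve 'host' and write its first IPv4 address into 'out' as a
 * dotted quad.  Returns 0 on success, -1 on failure. */
static int resolve_ipv4(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;
    struct sockaddr_in *sin = (struct sockaddr_in *)res->ai_addr;
    const char *p = inet_ntop(AF_INET, &sin->sin_addr, out, outlen);
    freeaddrinfo(res);
    return p ? 0 : -1;
}

int main(void)
{
    char addr[INET_ADDRSTRLEN];

    /* e.g. a health-check timer calling this every few seconds and
     * comparing the result with the currently configured address */
    assert(resolve_ipv4("localhost", addr, sizeof(addr)) == 0);
    printf("localhost resolves to %s\n", addr);
    return 0;
}
```

The hard parts are elsewhere: doing the lookups without blocking the
event loop, and deciding what to do with existing connections when the
address changes.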
- RAM-based small objects cache : one of the ML participants started
working on this some time ago, and the work needed to get it done has
shrunk significantly since then. The idea is not to have a great cache
but a pessimistic, safe one which can reduce the amount of local
communications between haproxy and the origin servers or caches. I'd
rather have a maintenance-free cache with a 50% hit ratio than one
requiring a lot of care with a 90% hit ratio. The ones capable of
safely achieving 90% are dedicated to this task and doing it pretty
well (eg: Varnish).
- improved connection reuse : especially for fetching objects from a
local cache using a URL hash, it's interesting to keep connections
open between haproxy and the cache. That comes with a whole set of new
issues (eg: connection monitoring, etc) but that's manageable.
- HTTP/2 : this is the most troublesome part that we absolutely need
to work on, because it may redefine the whole product's architecture
(and I'm currently working on identifying the shortest path to having
something acceptable). It's critically important because when HTTP/2
starts to be deployed, there will be the products which support it,
and the other ones... A naive approach could consist in having a
protocol converter to receive HTTP/2 and convert the frames to
HTTP/1, but that is highly counterproductive since it will actually
significantly slow down communications. The reason is that browsers
will try hard not to open multiple connections, so all objects would
be serialized, making things much worse than with HTTP/1. So we
really need to instantiate multiple streams from a single connection,
and currently haproxy is not at all architected this way (initially
it was a pipe with one socket at each end). That said, many entities
have already been removed from the struct session, and this could be
an opportunity to go even further and finally get rid of it.
I know that a lot of ideas and new requirements will come over time. But we
need to consider the features as a whole to find the architecture best
suited to getting all that done. For example, from time to time we're
seeing people who want to have local file-system access. That votes in
favor of having a master process outside of a chroot which could possibly
handle the config and which could communicate with the other one(s). But
at the same time, we possibly don't want to use processes exclusively if
we want to share stick-tables, stats, or cache. All that needs to be
thought about. We have 8 months to see what can be done.
If anyone has any comment / question / suggestion, as usual feel free to
keep the discussion going on.
For now enough talking, I'm going back to real work :-)
Willy
---
[1] http://yarchive.net/comp/linux/development_speed.html