Hi German,

Responses in-line:

On Fri, Sep 5, 2014 at 2:31 PM, Eichberger, German <german.eichber...@hp.com
> wrote:

>  Hi Stephen,
> I think this is a good discussion to have and will make it more clear why
> we chose a specific design. I also believe by having this discussion we
> will make the design stronger.  I am still a little bit confused what the
> driver/controller/amphora agent roles are. In my driver-less design we
> don’t have to worry about the driver which most likely in haproxy’s case
> will be split to some degree between controller and amphora device.

Yep, I agree that a good technical debate like this can help both to gather
many people's points of view and to determine the technical merit of one
design over another. I appreciate your vigorous participation in this
process. :)

So, the purpose of the controller / driver / amphora and the
responsibilities they have are somewhat laid out in the Octavia v0.5
component design document, but it's also possible that there weren't enough
specifics in that document to answer the concerns brought up in this
thread. So, to that end in my mind, I see things like the following:

The controller:
* Is responsible for concerns of the Octavia system as a whole, including
the intelligence around interfacing with the networking, virtualization,
and other layers necessary to set up the amphorae on the network and
getting them configured.
* Will rarely, if ever, talk directly to the end-systems or -services (like
Neutron, Nova, etc.). Instead it goes through a "clean" driver interface
for each of these.
* The controller has direct access to the database where state is stored.
* Must load at least one driver, may load several drivers and choose
between them based on configuration logic (ex. flavors, config file, etc.)

The driver:
* Handles all communication to or from the amphorae
* Is loaded by the controller (ie. its interface with the controller is a
base class, associated methods, etc. It's objects and code, not a RESTful
API.)
* Speaks amphora-specific protocols on the back-end. In the case of the
reference "haproxy" amphora, this will most likely be in the form of a
RESTful API with an agent on the amp, as well as (probably) HMAC-signed UDP
health, status and stats messages from the amp to the driver.
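
As a rough illustration of the "HMAC-signed UDP" idea, here is a minimal
sketch of signing and verifying a status message. The message layout (JSON
body followed by a SHA-256 HMAC digest), the function names, and the field
names are all assumptions made up for this example; nothing here is taken
from the component design document.

```python
import hashlib
import hmac
import json

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_message(payload: dict, key: bytes) -> bytes:
    """Serialize the payload as JSON and append an HMAC-SHA256 digest."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    digest = hmac.new(key, body, hashlib.sha256).digest()
    return body + digest


def verify_message(datagram: bytes, key: bytes):
    """Return the decoded payload, or None if the signature does not match."""
    if len(datagram) <= DIGEST_SIZE:
        return None
    body, digest = datagram[:-DIGEST_SIZE], datagram[-DIGEST_SIZE:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # compare_digest avoids leaking timing information about the digest
    if not hmac.compare_digest(digest, expected):
        return None
    return json.loads(body.decode("utf-8"))
```

On the amp side, the bytes returned by sign_message() would be what gets
handed to a UDP socket; the driver would run verify_message() on each
datagram it receives and drop anything that returns None.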

The amphora:
* Does the actual load balancing
* Is managed by the controller through the driver.
* Should be as "dumb" as possible.
* Comes in different types, based on the software in the amphora image.
(Though all amps of a given type should be managed by the same driver.)
Types might include "haproxy," "nginx," "haproxy + nginx," "3rd party
vendor X," etc.
* Should never have direct access to the Octavia database, and should
therefore attempt to be as stateless as possible, as far as configuration is
concerned.
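
To make the division of labor above a little more concrete, here is a
minimal sketch of a controller that loads drivers as plain objects behind a
base class. Every class and method name here (AmphoraDriver, push_config,
get_health, Controller, etc.) is hypothetical and only meant to illustrate
"objects and code, not a RESTful API"; the fake driver records calls instead
of talking to a real amphora.

```python
import abc


class AmphoraDriver(abc.ABC):
    """Hypothetical interface the controller codes against."""

    @abc.abstractmethod
    def push_config(self, amphora_id: str, listener: dict) -> None:
        """Send a listener configuration to one amphora."""

    @abc.abstractmethod
    def get_health(self, amphora_id: str) -> str:
        """Return the last known health status of one amphora."""


class FakeHaproxyDriver(AmphoraDriver):
    """Stand-in driver: records pushes instead of using the network."""

    def __init__(self):
        self.pushed = []

    def push_config(self, amphora_id, listener):
        self.pushed.append((amphora_id, listener))

    def get_health(self, amphora_id):
        return "ACTIVE"


class Controller:
    """Holds one driver per amphora type and never talks to amps directly."""

    def __init__(self, drivers: dict):
        if not drivers:
            raise ValueError("at least one driver must be loaded")
        self.drivers = drivers

    def update_listener(self, amphora_type, amphora_id, listener):
        # Pick the driver by amphora type, then delegate all communication.
        self.drivers[amphora_type].push_config(amphora_id, listener)
```

Usage would look something like
Controller({"haproxy": FakeHaproxyDriver()}).update_listener("haproxy",
"amp-1", {"port": 80}), with the mapping from amphora type to driver coming
from flavors or a config file.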

To be honest, our current product does not have a "driver" layer per se,
since we only interface with one type of back-end. However, we still render
our haproxy configs in the controller. :)

> So let’s try to sum up what we want a controller to do:
> -          Provision new amphora devices
> -          Monitor/Manage health
> -          Gather stats
> -          Manage/Perform configuration changes
> The driver as described would be:
> -          Render configuration changes in a specific format, e.g. haproxy
> Amphora Device:
> -          Communicate with the driver/controller to make things happen
> So as Doug pointed out I can make a very thin driver which basically
> passes everything through to the Amphora Device or at the other end of the
> spectrum I can make a very thick driver which manages all aspects from the
> amphora life cycle to whatever (aka kitchen sink). I know we are going for
> utmost flexibility but I believe:

So, I'm not sure it's fair to characterize the driver I'm suggesting as
"very thick." If you get right down to it, I'm pretty sure the only major
thing we disagree on here is where the haproxy configuration is rendered:
 Just before it's sent over the wire to the amphora, or just after its
JSON-equivalent is received over the wire from the controller.
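
Put another way, the same rendering function exists in both designs; the
only question is which side of the wire calls it. A small sketch, assuming a
made-up listener dict (name, port, members) and a deliberately tiny haproxy
config that is nowhere near what a real template would contain:

```python
import json


def render_haproxy_cfg(listener: dict) -> str:
    """Render a minimal haproxy.cfg fragment from a listener description."""
    lines = [
        f"frontend {listener['name']}",
        f"    bind *:{listener['port']}",
        f"    default_backend {listener['name']}_pool",
        f"backend {listener['name']}_pool",
    ]
    for i, member in enumerate(listener["members"]):
        lines.append(f"    server srv{i} {member}")
    return "\n".join(lines) + "\n"


def driver_side_payload(listener: dict) -> bytes:
    """Option A: render in the driver; the wire carries haproxy.cfg text."""
    return render_haproxy_cfg(listener).encode("utf-8")


def amphora_side_payload(listener: dict) -> bytes:
    """Option B: the wire carries JSON; rendering happens on the amphora."""
    return json.dumps(listener).encode("utf-8")


def amphora_render(payload: bytes) -> str:
    """What the amphora agent would have to do under option B."""
    return render_haproxy_cfg(json.loads(payload.decode("utf-8")))
```

Both paths end with the same file on the amphora. Under option A a change to
render_haproxy_cfg() ships with the driver; under option B it ships with the
amphora image.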

>  -          With building an haproxy centric controller we don’t really
> know which things should be controller/which thing should be driver. So my
> shortcut is not to build a driver at all :)

So, I've become more convinced that having a driver layer there is going to
be important if we want to support 3rd party vendors creating their own
amphorae at all (which I think we do). It's also going to be important if
we want to be able to support other versions of open-source amphorae (or
experimental versions prior to pushing out to a wider user-base, etc.)

Also, I think: Making ourselves use a driver here also helps keep
interfaces clean. This helps us avoid spaghetti code and makes things more
maintainable in the long run.

>  -          More flexibility increases complexity and makes it
> confusing for people to develop components. Should this concern go into the
> controller, the driver, or the amphora VM? Two of them? Three of them?
> Limiting choices makes it simpler to achieve that.

"Centralize intelligence / decentralize workload."  There will often be
multiple ways we can solve certain problems, but if we try to follow this
mantra, and use clean interfaces between components, it starts to become
more clear which code strategies we should be following. Yes, it's
sometimes hard to know the right way to do things-- which is why we end up
having these wonderful debates. ;) But I don't think the answer is "this is
hard, let's just lump everything together."

Also, rule of thumb (perhaps not stated in our constitution... yet):  Try
to architect things so the most frequently deployed elements see the fewest
changes. (This is actually related to the "centralize intelligence /
decentralize workload" mantra in a round-about way: Central intelligence
elements will be both fewer in number and more frequently changed than
"dumb" workload components.) This makes managing change for large
deployments easier. (Again, it's both easier and less risky to update 100
controllers versus 10,000+ amphorae.)

> HP's worry is that the potential to run multiple versions of drivers, on
> multiple versions of controllers, on multiple versions of amphora devices
> creates a headache for testing. For example, does the version 4.1 haproxy
> driver work with the version 4.2 controller on a 4.0 amphora device? Which
> compatibility matrix do we need to build/test? Limiting one driver to one
> controller can help with making that manageable.

Ok, so, I think this is possibly where part of our misunderstanding comes
from. I realize above that I said a single driver could talk to multiple
versions of back-end amphorae via a couple methods, but let's ignore that
for a minute and assume that we only test / assume drivers will be speaking
with the latest version of the amphorae to which they correspond.

I should probably clarify something that I've been assuming but may not be
obvious:  I'm assuming that the "version" of the amphorae (drawn mostly
from the version of the glue scripts, agent, and other code we write which
lives on the amphora) is numbered separately and moves at a different rate
than the version of the driver.  Think of this like the version of the
firmware and version of the driver used with your printer. Sometimes a
major bugfix entails updating both the firmware and driver. However, it's
also common for a bugfix / feature enhancement to involve only updating the
printer driver version and not the printer firmware.

What I'm getting at here is that if we're doing the configuration rendering
in the driver and not on the amphora, there will be some bugfixes / feature
enhancements which only entail updating the driver because *there are
literally no changes that need to be made to the amphora for the bugfix /
feature enhancement.*

Does this actually happen? Yes! To give a concrete example drawn from our
product history:  On our existing load balancer product, which is powered
by stunnel + haproxy, a new OpenSSL vulnerability was discovered, the fix
for which was to add a line to the stunnel configuration disabling a
certain kind of SSL negotiation. Since we were rendering configurations
centrally on our controllers, all we needed to do was update the
configuration template on our controller and push out new configs for
anyone using SSL termination. Took literally 10 minutes to implement once
we understood the problem, and we didn't have to touch or otherwise update
the software or scripts running on our appliances at all.
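
For a feel of what that kind of fix looks like when rendering is central,
here is a small sketch. The template contents, the load balancer fields, and
especially PLACEHOLDER_OPTION are invented for illustration; this is not our
actual template, and the real option we added is not reproduced here.

```python
STUNNEL_TEMPLATE_V1 = (
    "[{name}]\n"
    "accept = {port}\n"
    "connect = 127.0.0.1:{backend_port}\n"
    "cert = {cert_path}\n"
)

# The fix is one extra template line. PLACEHOLDER_OPTION is a stand-in,
# not a real stunnel/OpenSSL option name.
STUNNEL_TEMPLATE_V2 = STUNNEL_TEMPLATE_V1 + "options = PLACEHOLDER_OPTION\n"


def render(template: str, lb: dict) -> str:
    """Fill the template from a load balancer record."""
    return template.format(**lb)


def rerender_ssl_configs(template: str, load_balancers: list) -> dict:
    """Re-render only the load balancers that use SSL termination."""
    return {
        lb["name"]: render(template, lb)
        for lb in load_balancers
        if lb.get("ssl_termination")
    }
```

The dict returned by rerender_ssl_configs(STUNNEL_TEMPLATE_V2, ...) is the
set of configs to push; nothing running on the appliances has to change.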

It's even easier for L7 feature enhancements: You don't even have to push
anything out to the amphora, just update the controller / driver to expose
the new feature and users can then start using it at will.

Are all feature enhancements / bugfixes this easy? No! How do you tell the
difference between which changes are major and minor? Anything which
touches the code running on the amphora is "major" (ie. like a firmware
update). Anything which only touches the controller / driver is "minor"
(ie. like a driver update).
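
That rule is simple enough to write down as code. A minimal sketch, where
the component names are hypothetical labels chosen for this example:

```python
# Components whose code lives on the amphora (names are illustrative only).
AMPHORA_SIDE = {"amphora_agent", "amphora_image", "glue_scripts"}


def classify_change(touched_components) -> str:
    """Return "major" if any touched component runs on the amphora."""
    if AMPHORA_SIDE & set(touched_components):
        return "major"  # like a firmware update
    return "minor"      # like a driver update
```

So classify_change(["driver", "config_template"]) comes out "minor", while
anything that includes "amphora_agent" comes out "major".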

It seems strange to me that we'd force even minor changes to configurations
to be "major" updates for the sake of sending
JSON-which-will-immediately-be-turned-into-haproxy.cfg over the wire
instead of just the haproxy.cfg. :/

So with that in mind:  Please understand that your model and mine do not
have to differ in the slightest when it comes to how to manage 'major'
updates, whether that be running a different driver / controller for the
new amphora version (Ick!), or doing on-demand lazy upgrades of amphorae as
the driver discovers old, incompatible-versioned amphorae it needs to update
(probably the smoothest way to handle this, possibly as a default action of
the option 2 I mentioned above), or whether we force all amphorae to be
updated as soon as possible after a controller update (most risky and probably not
the best way to handle this). We've yet to define exactly how this workflow
should be handled, but it's actually somewhat secondary to the problem of
where to render the configs.  (Maybe we should have a conversation about
this in another thread?)

And in any case, I'm not seeing a need to ensure the driver works with
anything but the latest amphora image version to which it corresponds
(again, keeping in mind that amphora image and driver should be allowed to
change at different rates and are therefore versioned separately). :/ This
is especially the case if we define the default action to be taken upon a
failure to push out a new config to be to check the version of the amphora
and upgrade as necessary (ie. lazy upgrading)...
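
A rough sketch of that lazy-upgrade default action, using a fake in-memory
amphora so it can run without any real infrastructure. The classes, the
tuple-style version numbers, and the idea that a push fails with a PushError
when the amphora is too old are all assumptions made for this example; the
actual workflow is still undefined.

```python
class PushError(Exception):
    """Raised when an amphora rejects a config push."""


class FakeAmphora:
    """In-memory stand-in for an amphora with a versioned agent."""

    def __init__(self, version):
        self.version = version
        self.config = None

    def apply(self, config, required_version):
        if self.version < required_version:
            raise PushError("amphora version too old")
        self.config = config

    def upgrade(self, version):
        self.version = version


def push_with_lazy_upgrade(amp, config, required_version) -> bool:
    """Push a config; on failure, upgrade an outdated amphora and retry.

    Returns True if an upgrade was performed, False otherwise.
    """
    try:
        amp.apply(config, required_version)
        return False
    except PushError:
        if amp.version >= required_version:
            raise  # the failure was not caused by an old version
        amp.upgrade(required_version)
        amp.apply(config, required_version)
        return True
```

With this shape, an amphora only gets upgraded the first time the driver
actually needs something from it that the old version can't do.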

Also, not that we can't revisit this of course:  But the v0.5 component
design entailing a "VM Driver" already went through gerrit review and was
approved (by yourself even!) This discussion was originally about where to
render the haproxy configs, but it really seems like y'all are against the
idea of having an amphora driver interface at all. :/


Stephen Balukoff
Blue Box Group, LLC
(800)613-4305 x807
OpenStack-dev mailing list
