Hi Stephen,

I think this is a good discussion to have: it will make it clearer why we
chose a specific design, and I believe having it will make the design
stronger. I am still a little confused about what the roles of the driver,
the controller, and the amphora agent are. In my driver-less design we don't
have to worry about the driver, which in haproxy's case would most likely be
split to some degree between the controller and the amphora device.

So let’s try to sum up what we want a controller to do:

- Provision new amphora devices
- Monitor/Manage health
- Gather stats
- Manage/Perform configuration changes

The driver as described would be:

- Render configuration changes in a specific format, e.g. haproxy

Amphora Device:

- Communicate with the driver/controller to make things happen

So, as Doug pointed out, I can make a very thin driver which basically passes
everything through to the amphora device, or at the other end of the spectrum
I can make a very thick driver which manages everything from the amphora life
cycle on up (aka the kitchen sink). I know we are going for utmost
flexibility, but I believe:

- While building an haproxy-centric controller we don't really know yet which
things should live in the controller and which in the driver. So my shortcut
is not to build a driver at all ☺

- More flexibility increases complexity and makes it confusing for people to
develop components. Should a given concern go into the controller, the
driver, or the amphora VM? Two of them? Three of them? Limiting the choices
keeps things simple.

HP's worry is that creating the potential to run multiple drivers (and
multiple versions of drivers) on multiple versions of controllers against
multiple versions of amphora devices creates a headache for testing. For
example, does the version 4.1 haproxy driver work with the version 4.2
controller on a 4.0 amphora device? Which compatibility matrix do we need to
build and test? Limiting each controller to one driver helps keep that
manageable.

Thanks,
German

From: Stephen Balukoff [mailto:sbaluk...@bluebox.net]
Sent: Friday, September 05, 2014 10:44 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Octavia] Question about where to render haproxy 
configurations

Hi German,

Thanks for your reply! My responses are in-line below, and of course you should 
feel free to counter my counter-points. :)

For anyone else paying attention and interested in expressing a voice here, 
we'll probably be voting on this subject at next week's Octavia meeting.

On Thu, Sep 4, 2014 at 9:13 PM, Eichberger, German
<german.eichber...@hp.com> wrote:
Hi,

Stephen visited us today (the joy of spending some days in Seattle☺) and we
discussed that further (and sorry for using VM – not sure which term won):

Looks like "Amphora" won, so I'll start using that terminology below.


1. We will only support one driver per controller, e.g. if you upgrade a
driver you deploy a new controller with the new driver and either make it
take over the existing VMs (minor change) or spin up new ones (major change),
but keep the "old" controller in place until it no longer serves any VMs.
Why? I agree with the idea of one back-end type per driver, but why shouldn't 
we support more than one driver per controller?

I agree that you probably only want to support one version of each driver per 
controller, but it seems to me it shouldn't be that difficult to write a driver 
that knows how to speak different versions of back-end amphorae. Off the top of 
my head I can think of two ways of doing this:

1. For each new feature or bugfix added, keep track of the minimal version of 
the amphora required to use that feature/bugfix. Then, when building your 
configuration, as various features are activated in the configuration, keep a 
running track of the minimal amphora version required to meet that 
configuration. If the configuration version is higher than the version of the 
amphora you're going to update, you can pre-emptively return an error detailing 
an unsupported configuration due to the back-end amphora being too old. (What 
you do with this error-- fail, recycle the amphora, whatever-- is up to the 
operator's policy at this point, though I would probably recommend just 
recycling the amphora.) If a given user's configuration never makes use of 
advanced features later on, there's no rush to upgrade their amphoras, and new 
controllers can push configs that work with the old amphoras indefinitely.
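
To make that first approach a bit more concrete, here's a rough Python sketch
of the idea (all the names and version numbers below are made up for
illustration, not actual Octavia code): the config builder keeps a running
minimum of the amphora version each requested feature needs and refuses to
push a config the target amphora is too old for.

    # Hypothetical sketch of approach 1: track the minimum amphora version
    # required by the features used in a configuration.

    FEATURE_MIN_VERSION = {          # assumed feature -> min amphora version
        "basic_http": (1, 0),
        "tls_termination": (1, 2),
        "l7_rules": (1, 4),
    }

    class UnsupportedConfiguration(Exception):
        """The target amphora is too old for the rendered configuration."""

    def render_config(listener, amphora_version):
        required = (1, 0)
        sections = []
        for feature in listener["features"]:      # e.g. ["tls_termination"]
            required = max(required, FEATURE_MIN_VERSION[feature])
            sections.append("# ... config stanza for %s ..." % feature)
        if amphora_version < required:
            # Pre-emptively return an error (or recycle the amphora) instead
            # of pushing a config the old amphora can't handle.
            raise UnsupportedConfiguration(
                "config needs amphora >= %s, target is %s"
                % (required, amphora_version))
        return "\n".join(sections)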

2. If the above sounds too complicated, you can forego that and simply build 
the config, try to push it to the amphora, and see if you get an error 
returned.  If you do, depending on the nature of the error you may decide to 
recycle the amphora or take other actions. As there should never be a case 
where you deploy a controller that generates configs with features that no 
amphora image can satisfy, re-deploying the amphora with the latest image 
should correct this problem.
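
A minimal sketch of that second approach, again with hypothetical driver
calls:

    # Hypothetical sketch of approach 2: just push the config and react to
    # whatever error the amphora returns.

    def update_amphora(driver, amphora, config):
        try:
            driver.push_config(amphora, config)      # assumed driver method
        except driver.ConfigRejected:                # assumed error type
            # The newest image should support anything the controller can
            # generate, so redeploying the amphora should clear the error.
            driver.recycle(amphora)
            driver.push_config(amphora, config)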

There are probably other ways to manage this that I'm not thinking of as well-- 
these are just the two that occurred to me immediately.

Also, your statement above implies some process around controller upgrades 
which hasn't been actually decided yet. It may be that we recommend a different 
upgrade path for controllers.


2. If we render configuration files on the VM we only support one upgrade
model (replacing the VM), which might simplify development, as opposed to the
driver model where we need to write code to push configuration changes out to
all VMs for minor changes plus code to fail over VMs for major changes.
So, you're saying it's a *good* thing that you're forced into upgrading all 
your amphoras for even minor changes because having only one upgrade path 
should make the code simpler.

For large deployments, I heartily disagree.


3. I am afraid that half-baked drivers will break the controller, and I feel
it's easier to shoot VMs with half-baked renderers than to shoot controllers.

I defer to Doug's statement on this, and will add the following:

Breaking a controller temporarily does not cause a visible service interruption 
for end-users. Amphorae keep processing load-balancer requests. All it means is 
that tenants can't make changes to existing load balanced services until the 
controllers are repaired.

But blowing away an amphora does create a visible service interruption for 
end-users. This is especially bad if you don't notice this until after you've 
gone through and updated your fleet of 10,000+ amphorae because your upgrade 
process requires you to do so.

Given the choice of scrambling to repair a few hundred broken controllers while 
almost all end-users are oblivious to the problem, or scrambling to repair 10's 
of thousands of amphorae while service stops for almost all end-users, I'll 
take the former.  (The former is a relatively minor note on a service status 
page. The latter is an article about your cloud outage on major tech blogs and 
a damage-control press-release from your PR department.)


4. The main advantage of using an Octavia format to talk to the VMs is that
we can mix and match VMs with different properties (e.g. nginx, haproxy) on
the same controller, because the implementation detail (which file to render)
is hidden.
So, I would consider shipping a complete haproxy config to the amphora to be
"Octavia format" in one sense. But I would also point out that behind the
driver, it's perfectly OK to speak in very back-end-specific terms. That's
sort of the point of a driver: speak a more generic protocol on the front end
(base
classes + methods that should be fulfilled by each driver, etc.), and speak a 
very implementation-specific protocol on the back-end.  I would not, for 
example, expect the driver which speaks to an amphora built to run haproxy to 
be speaking the exact same protocol as a driver which speaks to an amphora 
built to run nginx.

Also, you can still mix and match here--  just create a third amphora driver 
which can speak in terms of both haproxy and nginx configs to an amphora 
back-end which is capable of running either haproxy or nginx. The point is that 
the driver should match the back-end it's supposed to communicate with, and 
what protocol the driver chooses to speak to the back-end (including the UDP 
stuff you bring up later) is entirely up to the driver.
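
As a sketch of what that split could look like (class and method names here
are invented for illustration, not a proposed Octavia interface): the
controller programs against a generic driver interface, and each driver
speaks whatever it likes to its own back-end.

    # Illustration only: generic front-end interface, back-end-specific
    # protocol behind it. Everything here (including amphora.upload/run)
    # is hypothetical.
    import abc

    class AmphoraDriver(abc.ABC):
        """Generic interface the controller programs against."""

        @abc.abstractmethod
        def apply_config(self, amphora, load_balancer):
            """Push the load balancer definition to one amphora."""

    class HaproxyAmphoraDriver(AmphoraDriver):
        def apply_config(self, amphora, load_balancer):
            # Back-end specific: render a complete haproxy.cfg and ship it.
            cfg = "frontend %s\n    bind *:%d\n" % (
                load_balancer["name"], load_balancer["port"])
            amphora.upload("/etc/haproxy/haproxy.cfg", cfg)
            amphora.run("service haproxy reload")

    class NginxAmphoraDriver(AmphoraDriver):
        def apply_config(self, amphora, load_balancer):
            # A different back-end speaks a different protocol behind the
            # same front-end interface.
            conf = "server {\n    listen %d;\n}\n" % load_balancer["port"]
            amphora.upload("/etc/nginx/conf.d/lb.conf", conf)
            amphora.run("nginx -s reload")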


5. The major difference in the API between Stephen and me would be that I
would send JSON files which get rendered on the VM into an haproxy file,
whereas he would send an haproxy file. We still need to develop an interface
on the VM to report stats and health in Octavia format. It is conceivable
with Stephen's design that drivers would exist which translate stats and
health from a proprietary format into the Octavia one. I am not sure how we
would get the proprietary VMs to emit the UDP health packets… In any case a
lot of logic could end up in a driver – and fanning that processing out to
the VMs might allow for fewer controllers.
So I'm not exactly sure what you mean by the "Octavia format" here. I'm going 
to assume that you mean something to the effect of "represented using objects 
and data models which correspond with Octavia's internals". This doesn't 
necessarily mean that APIs need to use terms which exactly line up with these, 
though I think that might be what you're implying here. In any case, whether 
other drivers choose to use UDP health packets for getting status information 
from their versions of amphorae is up to them-- that's just been the suggestion 
for the one which we're developing first which will use haproxy for the load 
balancing software. Again, no other driver should be restricted in what it's 
allowed to do based on what we do with the haproxy driver-- they don't have to 
follow the same communication model at all if they don't want to.
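
(For what it's worth, the UDP health packets for the haproxy-based amphorae
could be something as simple as the sketch below running on the amphora; the
message fields and controller address are made up for illustration and
nothing here is a settled format.)

    # Illustrative heartbeat emitter; field names and destination are
    # hypothetical.
    import json
    import socket
    import time

    CONTROLLER_ADDR = ("192.0.2.10", 5555)    # assumed health endpoint

    def emit_heartbeats(amphora_id, interval=3):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        while True:
            msg = {"amphora_id": amphora_id,
                   "status": "OK",
                   "timestamp": time.time()}
            sock.sendto(json.dumps(msg).encode("utf-8"), CONTROLLER_ADDR)
            time.sleep(interval)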

Otherwise, I completely agree that the API the haproxy driver will speak to the 
amphorae will necessarily have other features in it beyond simply shipping 
configuration files back and forth. That's been something I've been meaning to 
work on documenting, so I'll get started on that today.

Overall, if I don't take advantage of the minor-update model, the main
difference between Stephen and me in the haproxy case is whether to ship JSON
instead of an haproxy config. I understand that the minor-update capability
is make-or-break for Stephen, though in my experience configuration changes
without other updates are rare (and my experience might not be
representative).
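
(For illustration of the difference being discussed -- the field names and
config text below are invented -- the two payload shapes for the same
listener would look roughly like this:)

    # Purely illustrative comparison of the two payloads for one listener.

    # German's model: ship JSON, let the amphora render haproxy.cfg itself.
    json_payload = {
        "listener": {"protocol": "HTTP", "port": 80},
        "pool": {"members": [{"address": "10.0.0.5", "port": 8080}]},
    }

    # Stephen's model: the driver renders the haproxy config and ships it.
    haproxy_payload = """\
    frontend listener_http
        bind *:80
        default_backend pool_1

    backend pool_1
        server member_1 10.0.0.5:8080 check
    """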

So, I've been trying to contemplate why it is that you're not expecting to see 
a need for doing minor updates in a relatively simple fashion. I think this 
might stem from our existing product offerings:  Blue Box's in-house built load 
balancer already does TLS, SNI, Layer-7 switching and other advanced features 
whereas I'm not sure the product you're used to running does all that (at least 
in a way that's exposed to the user).

As we were developing the layer-7 features especially, we found it very common 
to make minor tweaks to the configuration file format to allow for additional 
types of layer-7 rules as our customers asked for this functionality, for 
example. These are cases where we're not making any substantial changes to the 
back-end image (ie. it's still running the same version of haproxy and our glue 
scripts), and therefore didn't need to update any of the software (including 
our glue scripts) on the back-ends themselves.

It's true that a certain amount of intelligence about the configuration file 
format is necessary on the back-end--  they need to know how to parse out the 
pool members in order to gather statistics on them, for example-- but these 
capabilities are generally unaffected by other minor configuration file format 
changes.
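
(As a rough sketch of the kind of capability I mean -- gathering per-member
statistics on the back-end -- something like the following would keep working
across most minor config-format tweaks; the stats socket path is an
assumption.)

    # Per-member stats via haproxy's stats socket; socket path assumed.
    import csv
    import socket

    HAPROXY_STATS_SOCKET = "/var/run/haproxy.sock"

    def member_stats():
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(HAPROXY_STATS_SOCKET)
        sock.sendall(b"show stat\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
        sock.close()
        # The first line is a CSV header starting with "# pxname,svname,...".
        lines = b"".join(chunks).decode("utf-8").lstrip("# ").splitlines()
        rows = list(csv.DictReader(lines))
        # Skip the FRONTEND/BACKEND summary rows; keep real pool members.
        return [r for r in rows if r["svname"] not in ("FRONTEND", "BACKEND")]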



In any case there certainly is some advantage, for appliances which do not
necessarily speak the Octavia protocol, in allowing custom drivers. However,
given our plan to use an Octavia UDP package to emit health messages from the
VMs to the controller, and since the controllers provision VMs in Nova, it
might be a better integration point for appliances to have custom
controllers. I am just not convinced that a custom driver is sufficient for
all cases –

Ok, in this case, I think you're conflating "Octavia protocol" with "the
protocol the haproxy driver speaks to the haproxy-based amphorae," which I
think is the wrong way to think about this.

Also, how things are dealt with in the controller itself (ie. outside of the 
drivers it loads) should necessarily be the same, no matter which driver / 
back-end is used. I'm not in favor of a design which requires a different 
controller for each kind of back-end. In that design, you'd need to invent 
another "controller-driver" layer, or at least reduce the role of the 
controller to essentially be a driver (presumably having to duplicate all the 
common "controller" elements between these "effectively drivers" thingies)... 
which seems silly to me. The step to implementation-specific objects and 
protocols should happen in the controller's drivers.

I do have one other question for you or your team, German:

Is there something about rendering the haproxy configuration in the driver 
which is a show-stopper (or major inconvenience) for HP's ability to use this 
product? I'm trying to understand what exactly it is about shipping rendered 
configuration files from the driver to the amphora that's so distasteful to 
y'all.

Stephen

--
Stephen Balukoff
Blue Box Group, LLC
(800)613-4305 x807