Hi Jeff,

Sorry for the late response; your message is one of the few I found
unread in my mailbox after moving a lot of ML junk out of it.
On Fri, Aug 19, 2011 at 09:05:53AM -0400, Jeff Buchbinder wrote:
> The API stats (pool.content, etc) calls that I had implemented are
> essentially the same, except that they format the data in a way that is
> much more easily consumed by other services. (JSON-formatted data beats
> out CSV in an ease-of-use context for anything other than feeding to a
> spreadsheet or awk/sed/cut/grepping data.)
I'm well aware of this too. That's why I wanted us to use JSON in our
API at Exceliance.
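Today, consumers have to slice the CSV by hand, something like this
(the socket path is an example, and the field numbers are taken from the
"show stat" header line):

    echo "show stat" | socat stdio /var/run/haproxy.stat | cut -d, -f1,2,18,19

which yields pxname,svname,status,weight. A JSON API could instead return
something self-describing, along the lines of (purely illustrative, not a
proposed format):

    { "pool": { "srv1": { "status": "UP", "weight": 100 } } }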
> (...)
> > This means that you need your file-based config to always be in
> > sync with the API changes. There is no reliable way of doing so
> > in any component if the changes are applied at two distinct
> > places at the same time!
> It depends what you're using haproxy for. If you're populating the
> configuration from the API (which is my eventual goal, if possible) for
> an elastic/dynamic server pool scenario where servers will be brought
> into the pool dynamically, configuration file persistence matters much
> less.
But you still need to populate your config before starting the daemon,
otherwise a restart may be fatal just because the first few seconds
before you update its conf will break the site.
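For instance, even a trivial setup needs its servers in the file before
the daemon starts (names and addresses below are made up):

    listen pool
        bind :80
        balance roundrobin
        server srv1 10.0.0.1:80 check
        server srv2 10.0.0.2:80 check

If that section is empty at boot because the API has not repopulated it
yet, every request gets a 503 until it does.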
> (...)
> > There is only one way to solve these classes of issues, by respecting
> > those two rules:
> >   - the changes must be performed to one single place, which is the
> >     reference (here the config file)
> >   - the changes must then be applied using the normal process from
> >     this reference
> I would think it would also be possible to replay a list of
> modifications to the original configuration, which would not require
> rewriting the original config. Not a perfect solution, but another
> possibility. (The downside would potentially be that a change to the
> original configuration would change the way that the replayed actions
> would behave.)
Yes, that's the problem. Replaying is only valid in an independent context.
That's the problem we have with the defaults sections: they're quite handy,
but they change a lot of semantics when it comes to configuring the sections
that depend on them. If your main config file gets a change, it's very
possible that replaying your changes will not do the right thing again.
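A contrived illustration (section and server names made up): say the
recorded change you want to replay is simply re-adding a checked server:

    defaults
        mode http
        option httpchk GET /alive

    backend pool
        # replayed change: add srv1 back with checks enabled
        server srv1 10.0.0.1:80 check

If someone later edits the defaults section (say, removes the httpchk
line, so "check" falls back to a plain TCP connect check), replaying the
exact same "server ... check" line gives you a different checking
behaviour than the one you validated.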
> > What this means is that anything related to changing more than an
> > operational status must be performed on the config file first, then
> > propagated to the running processes using the same method that is
> > used upon start up (config parsing and loading).
> That assumes that you're not dealing with a transient configuration (as
> I had mentioned earlier). It's an admirable goal to allow configuration
> persistence for things like the pool.add and pool.remove methods (since
> those are, at the moment, the only two that touch the configuration in a
> way that would seriously break a stored config file).
As I indicated above, the idea of a transient config file scares me a lot.
Either you have no server in it and serve 503 errors to everyone when you
start, until the config is updated; or you have a bunch of old servers and,
in environments such as EC2, you send traffic to someone else's servers
because they were assigned your previous IP.
> Also, outside of pool.add and pool.remove, I'm not really doing anything
> conceptually outside of what the stats control socket has already been
> doing. Weight and maintenance mode are not persisted to the
> configuration file. The only difference is the way that I'm allowing
> access to it (disregarding pool.add and pool.remove, of course).
Even the weight has different semantics in the config file and on the stats
socket: the stats socket controls the effective weight without affecting
the configured weight. That's why you can set the weight to 100% on the
stats socket and get back the configured weight.
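For instance (backend/server names and socket path are just examples, and
the output is from memory):

    $ echo "get weight pool/srv1" | socat stdio /var/run/haproxy.stat
    50 (initial 100)
    $ echo "set weight pool/srv1 100%" | socat stdio /var/run/haproxy.stat

The "100%" is relative to the configured (initial) weight, so this brings
the effective weight back to 100 without ever touching the config file.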
> > Right now haproxy is not able to reload a config once it's started.
> > And since we chroot it, it will not be able to access the FS
> > afterwards. However we can reload a new process with the new config
> > (that's what most of us are currently doing).
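[ For the archives, the usual dance looks like this, paths being whatever
  your setup uses:

      haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid \
              -sf $(cat /var/run/haproxy.pid)

  The new process parses the new config and binds, then asks the old pids
  listed after -sf to finish their connections and exit gracefully. ]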
> That's also what I'm doing in our production setup. The importance of an
> accessible API, though, is that it allows third party services (for
> example, a software deployer or cloud management service) to control
> certain aspects of the proxy without having to resort to kludges like
> using ssh to remotely push commands into a socket with socat. (Which, by
> the way, works just fine when run locally with a wrapper script, but
> makes it more difficult to integrate into a deployment process.)
Oh I know that well too ;-)
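We've all written some variant of this (host, socket path and server names
being whatever your setup uses):

    ssh lb1 'echo "disable server pool/srv1" | socat stdio /var/run/haproxy.stat'

It works, but it's glue rather than an interface, which is exactly your
point.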
At the company, we decided to address precisely this issue with the API we
developed: it only touches the config file and never plays with the socket,
because right now we have not implemented any operational status changes.