On Thu, Jun 22, 2017 at 11:46:12AM -0700, Joseph Lynch wrote:
> Hm, I'm still struggling to understand why this is a problem as an
> option for operators, sorry if I'm being dense!
No, it's not because you're dense, it's simply because you're using a
very small subset of haproxy's features and in *your* environment, the
name is just a random piece of information. In other places it's a
primary key to a number of things. In fact the server's name should be
seen as a technical internal identifier that happens to be user visible.
It's very much like the numerical ID and is even interchangeable with
it. BTW as you noticed, it's stored in srv->id.

There are places in the code, as well as external tools, which heavily
rely on this strong 1:1 uniqueness. I'm not judging whether or not it
was a good or bad initial choice and whether or not it's still a good
choice, it's just a fact that it exists: for SNMP monitoring, for log
analysis, for the health checks when "http-check send-state" is used, in
the agent, in the unique-id header, in http-send-name-header, maybe even
for the peers protocol (I don't remember), and for internal lookups.
Changing this internal ID on the fly doesn't seem like a good idea to me
just because we suddenly change this mapping *after* it was resolved and
applied. The whole process relies on some consistency.

If you dump the stats and you see for server srv1 "UP via bck2/srv1",
this bck2/srv1 does exist and is guaranteed to exist for the whole life
of the process. That allows the stats dump to pause on full buffers and
continue in the middle because the state doesn't change below it, and it
allows external tools to recursively check the bck2/srv1 stats in the
same dump, which is guaranteed to exist, etc. Or even worse: you click
on the stats page to disable 4 servers, their names are sent, but due to
the way the request arrives in partial segments, they're processed as
data arrives, and in the middle, your service controller renames some
servers, so certain operations are applied to different servers, or
multiple times to the same one and none to another one.
It's hard for me to enumerate good examples of inconsistencies in a
system where the consistency is guaranteed by design and has always been
granted, because everything was built on top of such an assumption, so
while some of the breakage is easy to predict, a lot more has to be
discovered.

> > The server state is a perfect example of huge breakage that would be
> > caused by such a change. In short, you start with the names from the
> > configuration file, which are looked up in the state file, then the
> > names are changed, and once dumped, they are different. So upon next
> > reload, nothing matches at all.
>
> As long as the configuration file is updated before updating the name
> this should be fine, shouldn't it?

That's the theory, and *you* are probably going to do it. Matching logs
against configs is another game. Also, when a log is emitted after a
long session, what name do you dump, the one the session started on or
the one the session ended on?

> I was under the impression that
> best practice for using dynamic HAProxy server features was always to
> write out the modified configuration file when sending socket
> commands.

It depends on the changes. For all changes that are made in order to
avoid a reload, that's indeed the case. For regular operations, most
often it's not (eg: changing a server's weight during backups etc).

> That way reloads are consistent, and if someone really needs
> to know what exactly is happening right now in the running process
> that's what show stat or show servers state is for? I'm also
> struggling to see why the state file is an issue, my understanding is
> that you save the server state file before a reload, so why would you
> send a name update after you save the server state?

It could be the other way around in fact: people not dumping the server
state as often as the name updates, for whatever reasons.
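For illustration, the save-then-reload sequence we're discussing
normally looks like this (the paths and socket address are just
examples):

    # haproxy.cfg
    global
        stats socket /var/run/haproxy.sock level admin
        server-state-file /var/lib/haproxy/server-state

    defaults
        load-server-state-from-file global

    # just before each reload, dump the running state:
    #   echo "show servers state" | socat stdio /var/run/haproxy.sock \
    #       > /var/lib/haproxy/server-state

The matching between the state file and the new config is done by
backend and server name, which is why renaming a server between the dump
and the reload leaves entries that can no longer be matched.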
And it could be seen differently as well: why would someone replace a
server's internal name after everything was mapped onto it? :-)

> > For me it's a more general problem of inconsistency between the
> > configuration and what is running. For example if you have this:
>
> I very much agree, I just thought that we were already going down this
> path with dynamically updating addrs and using server templates (which
> introduce config-runtime drift as well).

Server templates are a very good example: the server names are automatic
numbers. If you happen to rename a server belonging to a template,
there's no way you'll be able to find it again in the state file since
you cannot change its name in the configuration.

> As long as whatever is doing
> the socket manipulation (in our case synapse) does file configuration
> changes first, shouldn't this be reasonably ok?

I'm pretty fine with changing a lot of things in the config file and in
the internal state at run time, as long as these elements are not keys
to basically everything built around them. I don't want the processes to
become inconsistent just because the root keys of certain strong
relations between objects and processes (logs/stats/...) are broken at
run time. I'd rather help you bring in the ability to directly
manipulate the server you need based on the data you get, without having
to perform a painful initial lookup to discover a name matching an
entry.

> > backend foo
> >     use-server srv1 if rule1
> >     use-server srv2 if rule2
> >     server srv1 1.1.1.1:80
> >     server srv2 2.2.2.2:80
> >
> > Do you know what happens once you'll switch the names over the CLI?
> > (hint: I know).
>
> My understanding is that we'd continue sending to the original
> servers. I do see this would be bad, but isn't the requirement on the
> user to not do this, specifically if you have use-server rules then
> you have to reload the config to pick up changes in backends?

It depends.
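To make the server-template point above concrete, here is a minimal
sketch (the resolvers section name and the DNS record are made up):

    resolvers dns
        nameserver ns1 10.0.0.53:53

    backend templ
        server-template srv 1-5 _http._tcp.service.example resolvers dns check

This expands to servers automatically named srv1..srv5. Those names are
generated, not configured, so there is no configuration line against
which an externally renamed server could be matched back at reload time.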
There have already been requests to make use-server as dynamic as
use_backend. Just like there's currently an "nbsrv" converter which
takes a backend name on input and turns it into a server count, it was
suggested to have an equivalent srv_is_up to do the same with a server's
state. So again, there are differences between lines resolved at parsing
time and those handled at run time.

> Furthermore, rule1 or rule2 could refer to hard coded IP addresses
> which would also be wrong if we updated the addr of the server with
> "set server addr" as it is.

You rarely send to a server based on its own destination address, as for
this you don't really need a proxy, let alone a load balancer ;-)

> Also thinking about this more, if
> use-server rules dynamically searched for the server instead of
> storing it, that might be a nifty way to do something like primary
> replica failover:
>
> backend foo
>     use-server leader if write_operation
>     server leader 1.1.1.1:1234
>     server follower1 2.2.2.2:1234
>     server follower2 3.3.3.3:1234

Sure it could. The thing is that we can't have half of the keywords work
one way and the other half another way. And in order to change a mapping
at run time, you'd simply have to insert a map in the middle and change
this map, for example.

(...)

> > Here after a renaming of the tracked servers, it becomes a mess, you'll
> > constantly have to know what's taken at startup time and what's taken at
> > runtime.
>
> I agree, but if an administrator is using features like track or
> use-server I think they should pick up a server change with a reload
> rather than with the dynamic socket. Similar to if they are using ip
> or port based ACLs they should probably reload instead of setting the
> address dynamically?

No, in fact there's a very common use case for massive "track"
directives: it's when you're hosting hundreds or thousands of domains on
a few servers.
So you can have your dynamic farms mapped onto those servers, assigned
by customer if you want (eg: one customer can only use srv1..10 while
others may use srv10..100), and you simply have one backend dedicated to
health checks which provides the true visibility for all the farms.

> > On the opposite, there's zero backwards compatibility concern to have
> > with adding a description field, because we don't have it for servers
> > for now, only for backends and frontends.
>
> I see, ok I will see if this can work.

Joe, I'm not forcing you to contemplate options that are unsuited to
your use case, I'd rather really grasp it to ensure something perfectly
fitting can be established. For example, if we realize that what is
needed is a sort of dynamic ID for use with service discovery tools,
maybe that's the solution. Maybe you need an ID specific to a service
controller that this service controller is allowed to change, which is
returned in the state file, dumped in the stats etc. I don't know, but I
think it would be muuuuuuch safer to design around something like this
than to try to deviate an existing entry to a more or less related use
case just because at first glance it seems to fit.

> > Just a question, wouldn't your requirement in fact be to look up a
> > server by its address? I mean, wouldn't it be easier for your use
> > case? If so, I'm pretty sure that's something we could easily apply.
> > Either we apply this to the first server that matches the address, or
> > we apply it to all those which match, depending on the difficulty.
> >
> > For example, wouldn't it be easier for you to be able to do:
> >
> >     set server addr bk/@1.1.1.1:80 2.2.2.2:80
> >
> > or:
> >
> >     disable server bk/@1.1.1.1:80
> >
> > ?
>
> Interesting syntax, I think if we added that to all the set server
> commands it could work (we'd have to port from using enable/disable
> server to set server state, but that could work) for all but the
> strangest edge case in synapse.
Doing so is almost trivial as we "just" have to do it in a single
function that converts a server name string to a pointer, and it's used
everywhere by the CLI. I think the function is something like
backend_get_server_name() or something like this.

> For context Synapse generates HAProxy
> server names from host:port + a user provided "name" (which is
> typically the fqdn, but some users use it to double register the same
> physical server, which this syntax would no longer allow). It does
> seem to me though like we're doing a lot more work than we need (but
> that's probably because I don't understand why updating the name is
> bad yet heh).

Looking up the IP address is around 10 lines to add to the function
above, so that's not a huge amount of work. However I feel like you'll
need this user-provided name, and that's possibly the thing we need to
add as a server attribute that you can use from the external service. In
fact it even starts to make sense to me. The current "name" is the
config name: the one used to resolve names at parsing time and to
correlate logs and stats with config entries. You need a different name
to reference a server corresponding to a resource or entity or something
among your farm that your external tool can manipulate at will. I
understand how forcing your tool to adapt to haproxy's internal
identifier causes pain, just like having haproxy's internal identifier
change all the time would cause pain. One more reason for having this
specific identifier that can be used externally.

> > I'm pretty sure you have a very valid use case that we probably need to
> > discuss more to find the largest intersection with what can reasonably be
> > done here without breaking existing setups. We know that the management
> > socket needs to be significantly improved for some maintenance operations,
> > so it's even more important to know what your ideal usage scenario would
> > be to see how it could be mapped here.
> The usage scenario is microservices, namely we have a bunch of service
> instances (containers) which are constantly churning around physical
> machines as applications autoscale, deploy, fail, get preempted, get
> bin-packed onto other machines etc ... This is pretty common in any
> Mesos or Kubernetes like service deployment. Right now we manage the
> churn with synapse, which gets dynamic push updates from a service
> registry (zookeeper, dns, etc), and automatically reconfigures (within
> seconds) and reloads (within minutes) HAProxy across a fleet of a few
> thousand machines. To give you an idea of the kind of churn, synapse
> handles about 10-20 thousand socket updates (mostly servers going in
> and out of maintenance) and about 200 reloads per day per machine. We
> mostly reload to pick up new servers, and although synapse tries to
> group server changes into 1 minute groups (so we aggregate all the
> changes within 1 minute) we still get about 200 restarts per day and
> for something simple like scaling up a backend by one server we have
> to wait 1 minute for HAProxies everywhere to pick that up. Ideally we
> can add servers just as fast as we can down/up them, which I thought
> might be possible with the new dynamic "set server addr" directive,
> but if we can't change the name then synapse can't map its notion of
> a server to HAProxy's notion of a server and I don't know how to make
> it work. Synapse doesn't support fancy per server features like
> set-server and server tracking since they're really not needed in the
> microservice world. You just need a backend pool that dynamically
> updates as fast as possible as containers migrate around machines.

OK, so if we add a string as the external identifier (let's at least
give a name to this thing) and the ability to look up by transport
address, you have everything:
  - the mesos/kubernetes identifier directly accessible
  - the IP:port directly accessible

Did I miss anything?
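To visualize it, a synapse-side update could then look like this on the
CLI. This is hypothetical syntax: the @address lookup is only the
proposal from earlier in this thread, and "ext-id" is a made-up keyword
for the external identifier, neither exists today:

    # address-based lookup, as proposed above: follow a container
    # that migrated to another host
    set server addr bk/@10.0.0.1:8080 10.0.0.2:8080

    # hypothetical external identifier, freely settable by the
    # service controller and reported in stats/state dumps
    set server ext-id bk/@10.0.0.2:8080 my-container-42

The internal config name never changes, so logs, stats and the state
file stay consistent, while the external tool keys everything on the
address or on its own identifier.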
> I'm still not quite understanding why allowing changing the name is
> going to cause breakage for users that are choosing to change the name
> dynamically,

Because not everyone does everything right, and because there are people
like the ones on this list trying hard to debug impossible situations,
trusting only what is provided (ie: config+logs) till they spot the line
which made this impossible situation possible, etc.

> but I do really appreciate all the context you're giving me :-)

You're welcome. Don't get me wrong, I'm not saying NIH, do it like I
want. I'm just saying that everything related to internal IDs must not
change (same for frontend/backend names and IDs, peers, etc). But I am
interested in seeing how we can nicely cover your use case, as I think
it really is not that difficult once properly defined.

Cheers,
Willy

