Hi Luke!

On Wed, Jul 08, 2020 at 11:57:15AM +0200, Luke Seelenbinder wrote:
> I've been following along the torturous road, and I'm happy to see all the
> issues resolved and the excellent results.
You can imagine how I am as well :-)

> Personally, I'm excited about the performance gains. I'll deploy this soon
> on our network.

OK!

> To dig up an old discussion--I took a look at better support for SRV records
> (using the priority field as backup/non-backup, etc.) a few weeks ago, but
> determined it didn't make sense in our use case. The issue is 0 weighted
> servers are considerably less useful to us since they aren't ever used, even
> in the condition where every other server is down.

I seem to remember a discussion about making this configurable, but I can't
find any commit matching anything like that, so maybe the discussion ended
with "change the behavior again, the previous one was wrong" -- I don't
remember well.

> That raises the next question: is the idea of server groups (with the
> ability for a request to try group 1, then group 2, etc. on retries) in
> the development plans at some point? Would that be something I could
> tinker with as a longer-term project?

That could indeed be an interesting approach, because we already almost do
that between active and backup servers, except that there is always a single
group at a time.
In fact there are 4 possible states for a server group:

  - populated with all active servers which are UP or unchecked, provided
    that there is at least one such server;

  - populated with all backup servers which are UP or unchecked, provided
    that there is at least one such server, that no active server exists in
    UP or unchecked state, and that "option allbackups" is set;

  - populated with the first UP or unchecked backup server, provided that
    there is at least one such server, that no active server exists in UP or
    unchecked state, and that "option allbackups" is not set;

  - no server: all are down.

With your approach it would be almost identical, except that we would always
have two load-balancing groups, a primary one and a secondary one, the first
one made only of the active servers and the second one made only of the
backup servers. We would then pick from the first group, and if it's empty,
from the next one. It shouldn't even consume too much memory, since the
structures used to attach the servers to a group are carried by the servers
themselves. Only static hash-based algorithms would cause a memory increase
on the backend, but they're rarely used with many servers due to the high
risk of rebalancing, so I guess that could be a pretty reasonable change.
We'd just document that the keyword "backup" means "server of the secondary
group", and probably figure out new actions or decisions to force the use of
one group over the other.

Please note that I'd rather avoid adding too many groups to a farm, because
we don't want to start scanning many of them. If keeping 2 as we have today
is already sufficient for your use case, I'd rather stick to that.

We still need to put a bit more thought into this, because I vaguely
remember an old discussion where someone wanted to use a different LB
algorithm for the backup servers. In terms of implementation that would not
be a big deal: we could have one LB algo per group.
But in terms of configuration (for the user) and configuration storage (in
the code), it would be a real pain. Possibly it would still be worth the
price if it starts to allow assembling a backend by "merging" several groups
(that's a crazy old idea that has been floating around for 10+ years and
which could possibly make sense in the future to address certain use cases).

If you're interested in pursuing these ideas, please, oh please, never
forget about the queues (those that are used when you set a maxconn
parameter), because their behavior is tightly coupled with the LB
algorithms, and the difficulty is to make sure a server which frees a
connection slot can immediately pick the oldest pending request either from
its own queue (server already assigned) or from the backend's (don't care
which server handles the request). This may become more difficult when
dealing with several groups, hence possibly several queues.

My secret agenda would ideally be to one day support shared server groups
with their own queues between multiple backends, so that we don't even need
to divide the servers' maxconn anymore. But it still lacks some reflection.

I'm dumping all that in case it can help you get a better idea of the
various mid-term possibilities and what the steps could be (and also what
not to do if we don't want to shoot ourselves in the foot).

Cheers,
Willy