On Nov 29, 2009, at 10:45 AM, Nigel Kersten wrote:

> On Sun, Nov 29, 2009 at 6:43 AM, Ohad Levy wrote:
>
> On Sun, Nov 29, 2009 at 6:31 AM, Christian Hofstaedtler wrote:
> load sharing
> and/or
> fault tolerance
>
> And both of these things are currently very complicated to do as
> long as the client has only one hostname to talk to.
>
> Why not change this?
>
> +1 - I think it's acceptable for each client to connect to one server
> and keep on using that server for its whole "puppet run".
>
> I think it's perfectly reasonable, and would make failover a lot
> simpler.
>
> Do we have all the information internally to tell when an error
> indicates that a server is unavailable?

Hah, I doubt it.

> Would we give up and consider it a failure every time we can't find
> a puppet:/// file resource? So we'd be changing behavior when
> someone typos a puppet URI? Should the behavior be different if we
> time out retrieving rather than not being able to find it at all?

Urgh, no way. The connection itself needs to fail, not just have some
random exception - probably, anything other than a timeout wouldn't
constitute a failure.

> However, does this really help load balancing?
>
> Say one server in a pair is overloaded and timing out on file
> resources... do clients simply start their run all over again with
> the other server? That seems kind of inefficient.... given that they
> may have all progressed quite far into their run.

With the 'retry' functionality in ruby, the caller never knows of a
problem unless none of the servers work. You pick a new server,
reconnect, and keep going. I can't think of anything that would
reasonably restart the whole run itself.

> I'd really like to be able to combine both. Shared state for load
> balanced pairs, multiple servers in the client config for failover
> and restarting the current run.
>
>
> a simple solution might be to implement a DNS SRV record (e.g. like
> LDAP) which allows the client to decide which puppetmaster it
> would like to connect to.
> this in time could be enhanced to get the server load etc (so it
> could try to use another server or to wait for a while).
>
> This is essentially what we're doing now. We have simple monitoring
> in place so all our clients can check the load of the puppet server
> their DNS view points to, and fall back to an alternate server if
> the load is too high.
>
>
> I would be happy not to add any additional dependencies (even though
> memcache is acceptable) - especially a database, e.g. if I have 5
> locations where I need HA + load sharing, I don't want to end up
> maintaining 5 sets of clusters.
>
> ++
>
> I can't see a way around shared state for efficient load balancing,
> but think that being able to provide a list of puppet servers to the
> clients would greatly help with failover.

I agree.

Any volunteers? :)

Especially since rowlf was supposed to be out this quarter but it looks
like we'll be releasing 0.25.2 in the same timeframe instead.

-- 
The great tragedy of Science - the slaying of a beautiful hypothesis by
an ugly fact. --Thomas H. Huxley
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
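
For anyone picking this up, here is a minimal sketch of the 'retry' approach mentioned above: try each server in a list, and move to the next one only when the connection itself fails or times out. It is not Puppet code. The names with_failover, NoServerAvailable and CONNECTION_ERRORS are hypothetical, and the choice of which exceptions count as "server unavailable" is an assumption for illustration.

```ruby
require 'timeout'

# Hypothetical error raised once every server in the list has failed.
class NoServerAvailable < StandardError; end

# Assumption for this sketch: only connection-level failures and
# timeouts mean "server unavailable".
CONNECTION_ERRORS = [
  Errno::ECONNREFUSED,
  Errno::EHOSTUNREACH,
  Timeout::Error
].freeze

# Yields servers one at a time until the block succeeds. The caller only
# sees an error if no server works; any other exception (for example a
# mistyped puppet:/// URI that is simply not found) propagates unchanged.
def with_failover(servers)
  remaining = servers.dup
  server = remaining.shift
  raise NoServerAvailable, 'no servers configured' if server.nil?

  begin
    yield server
  rescue *CONNECTION_ERRORS => e
    server = remaining.shift
    if server.nil?
      raise NoServerAvailable, "all servers failed (last error: #{e.message})"
    end
    retry # re-run the begin block against the next server
  end
end
```

A caller would wrap a single request, e.g. with_failover(server_list) { |host| fetch_something_from(host) }, where fetch_something_from is a placeholder. Because only the wrapped request is retried, the run itself does not restart.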
