Yup. Typically once something fails I consider it questionable / unstable until it proves itself to me again. The routing / circuit analogy is a perfect example.
Many HA "things" let the user configure preemption: once the primary node fails and the secondary takes over, does the primary "automatically" become the primary again when it is believed to be healthy, or must the admin manually make it the primary again? Personally, I disable preemption on all my HA routers, firewalls, etc. Once something fails I want to review / analyze the failure and validate that it's stable before I trust it again and start running traffic through it!

G

-----Original Message-----
From: freeradius-users-bounces+ggatten=waddell....@lists.freeradius.org [mailto:freeradius-users-bounces+ggatten=waddell....@lists.freeradius.org] On Behalf Of Alexander Clouter
Sent: Thursday, August 04, 2011 9:20 AM
To: [email protected]
Subject: Re: num_answers_to_alive

Stefan Winter <[email protected]> wrote:
>
> The documentation says that 3..10 are *useful* ranges, but doesn't
> mention that everything else is forbidden. In particular, I would like
> to use 1, not 3. The idea is: the server was dead before, but now it
> managed to send a reply back - so it must have been fixed. I would like
> to mark it alive immediately. Is that unreasonable?
>
Similar to 'link flapping' (think OSPF/BGP), you should use heuristics, as things are not just black and white.

If a service simply had two states, "up" and "down", then that probably would be okay, but we also have 'unstable'. Imagine this state coming from:
 * overloaded RADIUS server (or backend DB)
 * link congestion between RADIUS servers

Having a value of three says not just "alive" but also "alive and has been for a while"; this could be further interpreted to mean that the service is stable as well as alive. If the system briefly came back and then died, you would likely have seen a failure on attempt two or three.

Hope I am explaining myself well :)

Cheers

-- 
Alexander Clouter
.sigmonster says: BOFH excuse #256: You need to install an RTFM interface.

- List info/subscribe/unsubscribe?
See http://www.freeradius.org/list/users.html
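
For anyone who wants to see the "alive and has been for a while" idea spelled out, here is a minimal sketch in Python. It is not how FreeRADIUS implements num_answers_to_alive; the class name HomeServerHealth and its methods are made up for illustration, and marking the server dead on a single timeout is a simplifying assumption. It only shows why a threshold of three filters out a flapping server while a threshold of one does not.

```python
from dataclasses import dataclass


@dataclass
class HomeServerHealth:
    """Hypothetical tracker: a dead server must answer N times in a row
    before it is trusted again (illustration only, not FreeRADIUS code)."""

    answers_to_alive: int = 3   # consecutive replies required to mark alive
    alive: bool = False         # assumption: we start from the "dead" state
    streak: int = 0             # consecutive replies seen while dead

    def on_reply(self) -> bool:
        """Record a reply to a status check; return the current alive state."""
        if not self.alive:
            self.streak += 1
            if self.streak >= self.answers_to_alive:
                self.alive = True
        return self.alive

    def on_timeout(self) -> bool:
        """Record a missed reply; assumption: one miss resets everything."""
        self.alive = False
        self.streak = 0
        return self.alive


if __name__ == "__main__":
    # A flapping server: answers once, drops a check, then answers steadily.
    events = ["reply", "timeout", "reply", "reply", "reply"]
    for threshold in (1, 3):
        server = HomeServerHealth(answers_to_alive=threshold)
        states = [
            server.on_reply() if e == "reply" else server.on_timeout()
            for e in events
        ]
        print(threshold, states)
```

With a threshold of 1 the server is marked alive on the very first reply and traffic is sent to it just before it drops out again; with a threshold of 3 it is only marked alive after the three uninterrupted replies at the end.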

