Alan DeKok wrote:
Philip Molter wrote:
Thanks for your patience with this.  I'm migrating from an old RADIUS
platform that supports this behavior to freeradius, and I'm just trying
to make sure I get everything working.

  What behavior?  Failover from one home server to another?  FreeRADIUS
does this already.

  I think what you want is to have the re-transmits switch from one home
server to another *before* the first one has been marked dead.  This is
difficult to do automatically.  Something like "send retransmits to a
backup server" is possible, but can cause other problems.

  But you can use "radmin" to do this manually.

What I really want is just, instead of the request being marked as
failed when one of the home servers doesn't respond, for the proxy
subsystem to just try sending the request to another configured home
server.

  But it already does that.  Run the server, and watch how it behaves.
As I said before, the difficulty is determining *when* to do this failover.

 If the proxy has tried sending a request to every non-zombie
home server in the list and still hasn't gotten anything, then it can
mark the request as failed.

  Sorry, but it takes time to determine that a home server has failed.
By the time this decision has been made for 2-3 home servers, 30 seconds
have usually passed, and the NAS has given up on the request.
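
To make the timing concern concrete, here is a rough arithmetic sketch in Python. The 30-second figure is the one quoted above. The per-server detection times of 10 and 15 seconds are assumed illustrative values, not FreeRADIUS defaults, and the function name is hypothetical.

```python
def servers_tried_before_deadline(detect_seconds: float, nas_deadline_seconds: float) -> int:
    """Count how many home servers can be declared unresponsive, one after
    another, before the NAS gives up on the request."""
    if detect_seconds <= 0:
        raise ValueError("detect_seconds must be positive")
    return int(nas_deadline_seconds // detect_seconds)


if __name__ == "__main__":
    # Assumed detection times per home server, against a 30-second NAS limit.
    for detect in (10, 15):
        print(detect, servers_tried_before_deadline(detect, 30))
```

Under these assumptions only two or three home servers can be tried in sequence before the NAS stops waiting, which is the constraint described above.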

The way I originally thought it was going to work is similar to how
modules are load-balanced.  If I have five SQL servers loaded through 5
named SQL module configs, it will try the first, then the second, then
the third until one of them returns success.  It would be great if the
proxy load-balancing could work the same way.

  Unless I'm really missing something, it already does this.  Just
configure "type = load-balance" in the home server pool.

  Have you done this?

Yes, this is the configuration I'm currently running, and it's not working for me. I have a radclient sending a request, retrying 10 times on a 5-second timer, and after 10 retries, it still hasn't gotten a response. After the second retry, the proxy has marked the server as at least a zombie and started status-checks, but every retransmit after that is getting a cached result of no response.

  What do you expect the proxy to do with requests sent to a home server
that *might* be down?  How should the proxy decide that the home server
is down?  Be specific.  Draw flow diagrams...

This is what I want to happen

client req ->  proxy
               proxy req ->  home server #1
client ret ->  proxy
               proxy ret ->  home server #1
              [proxy fails home server #1 for lack of response]
client ret ->  proxy
               proxy req ->  home server #2
               proxy <- resp home server #2
client <- resp proxy
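
The flow above can be read as "each client transmission goes to the current home server, and after a fixed number of unanswered tries the proxy moves to the next one". A minimal Python sketch of that reading follows. The function name and the `tries_per_server` parameter are hypothetical and are not FreeRADIUS settings.

```python
from typing import Optional


def server_for_transmission(transmission: int, tries_per_server: int,
                            num_servers: int) -> Optional[int]:
    """Return the 1-based home server that should receive a client
    transmission (1 = original request, 2 = first retransmit, ...),
    or None once every home server has used up its tries."""
    if transmission < 1 or tries_per_server < 1:
        raise ValueError("transmission and tries_per_server must be >= 1")
    index = (transmission - 1) // tries_per_server
    return index + 1 if index < num_servers else None


if __name__ == "__main__":
    # Two tries per home server, as in the diagram: req and first ret go to
    # home server #1, the second ret goes to home server #2.
    for k in range(1, 4):
        print(k, server_for_transmission(k, tries_per_server=2, num_servers=2))
```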


This is what is happening with my post-proxy config:

client req ->  proxy
               proxy req -> home server #1
client ret ->  proxy
               proxy ret -> home server #1
              [proxy fails home server #1 for lack of response]
client ret ->  proxy
              [proxy detects retransmit, does nothing]
client ret ->  proxy
              [proxy detects retransmit, does nothing]
client ret ->  proxy
              [proxy detects retransmit, does nothing]
...

This is what happens without a post-proxy config:

client req ->  proxy
               proxy req -> home server #1
client ret ->  proxy
               proxy ret -> home server #1
              [proxy fails home server #1 for lack of response]
client  <- rej proxy

  If you can come up with a better algorithm, then by all means we'll
implement it.  But coming up with an algorithm that works *well* from
limited information is hard.

  The issue with your configuration is that you are trying valiantly to
game the system.  You're setting the timers *way* too low, and
marking the requests as failed too early.  When the NAS retransmits, you
claim you want the proxy to fail over to another server... AFTER you've
already told it to give up on the request.

My config is not marking any request as failed. If I do not configure anything for Post-Proxy-Type, I get back an Access-Reject right when the first home server fails. There is no failover. The comments in proxy.conf make that clear:

#  If the home server doesn't respond to the request within
#  this time, this server will consider the request dead, and
#  respond to the NAS with an Access-Reject.

In other words, if the server the load-balance pool happens to choose doesn't respond to my request, tough luck. I might have 19 other servers configured that are up, but the request I just sent still gets an Access-Reject. The Post-Proxy-Type is just a hack to at least avoid sending back an Access-Reject, which breaks the whole process.

  Your configuration is contradicting your stated needs.  Fix one or the
other so that there is no contradiction.

Okay, so I obviously do not understand how I can tweak response_window and zombie_period to make sure that requests that can be serviced by many possible RADIUS home servers do not return an Access-Reject when one of those home servers does not respond.

Here are my stated needs.

The client sends a request to the proxy. If a home server does not respond within a short period of time to the request, a second home server is chosen. If the second home server does not respond to the same request, then a third is chosen. This continues until all possible home servers are exhausted. At that point, an Access-Reject packet is sent back to the client. Otherwise, the response from the home server is sent back to the client.
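
As a rough sketch of that requirement, written in Python rather than as anything FreeRADIUS-specific: the `send` callable is a hypothetical transport hook that returns the reply, or `None` if the home server stays silent for `timeout` seconds. The server names and the 5-second default are illustrative assumptions.

```python
from typing import Callable, Optional, Sequence

# Hypothetical transport hook: (server, request, timeout) -> reply or None.
SendFn = Callable[[str, str, float], Optional[str]]


def proxy_with_failover(request: str, home_servers: Sequence[str],
                        send: SendFn, timeout: float = 5.0) -> str:
    """Try each home server in order and return the first reply received.
    If none of them answers, fall back to an Access-Reject."""
    for server in home_servers:
        reply = send(server, request, timeout)
        if reply is not None:
            return reply  # pass the home server's response back to the client
    return "Access-Reject"  # every configured home server was exhausted


if __name__ == "__main__":
    def fake_send(server: str, request: str, timeout: float) -> Optional[str]:
        # Only the third server answers in this made-up scenario.
        return "Access-Accept" if server == "hs3" else None

    print(proxy_with_failover("Access-Request", ["hs1", "hs2", "hs3"], fake_send))
```

Note that the worst-case wait in this sketch is the number of home servers multiplied by `timeout`, so it only helps if that total stays within the time the client is willing to wait.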

How do I configure that? It doesn't seem to matter what I set response_window or zombie_period to: once the first home server fails to respond, an Access-Reject (or nothing, if I configure a post-proxy handler) is returned to the client. My client is not going to retry the request if it gets an Access-Reject, so I need the proxy to retry it.

Is that possible?

Philip