Alan DeKok wrote:
> Adam Bultman wrote:
>> How do I change that functionality? I'd *love* it if it didn't zombie
>> their servers for no reason.
>
> No.. it marks the servers zombie for a reason: they're not responding.
> But it may be too aggressive.
>
>> When I do a radiusd -CXXX, I see options I don't see documented for the
>> latest releases of freeradius:
>> - ping_check
>> - ping_interval
>> - num_pings_to_alive
>
> Those are for backwards compatibility with pre-releases of 2.0. They
> should be removed. They are just different names for the status-server
> checks.
>

Excellent; I was wondering if I was somehow not "seeing" something as I
went through the documentation.

>> - max_outstanding (I can't even find what this is for)
>
> You can put a limit on the total number of "outstanding" packets sent
> to a home server. i.e. put it at 256, and if there are 256 packets sent
> without a response, the proxy will *not* use that home server again,
> until it gets at least one response.
>
> This is a way to do load-limiting on home servers.
>
>> As it is, my *.work files are "stuck" (And I've googled for that, and
>> found other list posts regarding that) which seems to indicate that the
>> home servers aren't responding... except that even when my detail.work
>> file is 'stuck' at 24k, and the detail file keeps growing, I'm still
>> sending data to the other side. So something's working, but only sort of..
>
> It's re-transmitting the same packet over and over. If you install
> 2.1.9, you can use "radmin" to see its progress in reading the detail file.
>

After some work getting 2.1.9 and v2.1.x from the git repository up and
running, I had to go back to 2.1.7-7, which is patched (hopefully,
anyway!) for the "zombie" problem via the patch you sent me. The 2.1.9
and 2.1.10 versions would die unexpectedly, right around the time the
"Info: ... ... adding new socket command file /var/run/radiusd/radiusd.sock"
line would scroll through the debug output.
I couldn't figure it out for the life of me, and strace didn't tell me
much: the process would just segfault right around that time. It did the
same on vanilla installs of 2.1.10, so I gave up on it.

At any rate, "radmin" *does* exist for 2.1.7-7 (from the Red Hat source,
which I patched with the patch you gave me), but it's complaining about
permissions on the sock file. The permissions appear to be fine, but
perhaps SELinux is blocking it; I have to take a gander. Once I get that
ironed out, I'll take great pleasure in using radmin and seeing what it
sees.

>> I'm about to shoot an email to them to see if they can explain their 4
>> year old radius software, and perhaps maybe that's part of the problem.
>
> Yup. They can upgrade to a (cough) real radius server. :)
>

Turns out, they were a bit stand-offish. They didn't like their radius
servers being implicated in the mix. "It's working for 30+ clients, so we
have no plans to upgrade".

One thing I also noticed was that it doesn't look like freeradius is
giving a packet very many tries before marking the system down. At
least, that's the way it appears. I don't know Wireshark filters well
enough to find unacked packets, so I have to learn that before I'll be
able to piece it together.

It is also noteworthy that upon ping-scanning their network, I found two
IP addresses that are up, and I'm getting packet loss to them: between 4
and 7 percent. That's not a ton, but it might be enough to cause a
problem if I'm relaying thousands of packets an hour.

Thanks for the help, Alan. I appreciate it.

--
Adam

-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
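
For a rough feel of what 4 to 7 percent loss could mean for the proxy, here is a minimal Python sketch. It assumes each request/reply exchange fails independently with the ping-measured loss rate, which is a simplification. The try counts and the 5000 packets/hour figure are illustrative guesses, not FreeRADIUS settings or measured traffic.

```python
def p_all_tries_fail(loss: float, tries: int) -> float:
    """Probability that every one of `tries` attempts goes unanswered.

    Assumes each attempt is lost independently with probability `loss`
    (here taken from the ping round-trip loss, which is an assumption).
    """
    if not 0.0 <= loss <= 1.0:
        raise ValueError("loss must be between 0 and 1")
    if tries < 1:
        raise ValueError("tries must be at least 1")
    return loss ** tries


def expected_unanswered(packets: int, loss: float, tries: int) -> float:
    """Expected number of packets that never get a reply after all tries."""
    return packets * p_all_tries_fail(loss, tries)


if __name__ == "__main__":
    packets_per_hour = 5000  # hypothetical load, "thousands of packets an hour"
    for loss in (0.04, 0.07):      # loss range seen when pinging the home servers
        for tries in (1, 2, 3):    # hypothetical number of attempts per packet
            n = expected_unanswered(packets_per_hour, loss, tries)
            print(f"loss={loss:.0%} tries={tries}: ~{n:.2f} unanswered/hour")
```

Under these assumptions the number of packets left unanswered drops quickly with each extra try, so if only one or two attempts are made before the home server is marked down, this level of loss alone might be enough to trip the zombie check.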

