Josip Rodin <[email protected]> writes:
> On Tue, Dec 08, 2009 at 10:10:11AM +0100, Bjørn Mork wrote:
>> The symptoms are that all home servers are marked dead/zombie. Typical
>> obfuscated home_server list in this state:
>>
>> server(bjorn) ~ 71$ radmin -e "show home_server list"
>> 192.168.8.120   1812    auth    alive   0
>> 192.168.8.246   1812    auth    alive   0
>> 192.168.8.132   1645    auth    dead    0
>> 192.168.8.132   1645    auth    dead    3
>> 192.168.8.14    1812    auth    alive   0
>> 192.168.8.10    1812    auth    alive   0
>> 192.168.8.210   1812    auth    alive   0
>> 192.168.8.50    1812    auth    zombie  0
>> 192.168.8.20    1812    auth    zombie  0
>>
>> There are a number of servers marked "alive", but these are all servers
>> which have been revived after the fixed period. When used, they will be
>> marked dead/zombie again.
>
> What does the log file say? There should be many messages marked 'Proxy'
> in the v2.1.x branch since a couple of weeks ago, and definitely in your
> case if they keep changing state so often.

Sure. You'll find all "Proxy:" prefixed messages since log rotation at
midnight here:
http://www.mork.no/~bjorn/fr-218-prerelease-proxying.log
(It's 300 kB, so it was a little over the limit for this list)

The addresses have been replaced using the same pattern as for the
"home_server list", so they are directly comparable.

You'll notice that some of the home servers are truly unavailable. This
is unfortunately something we have to live with. There are also some
servers which are unavailable at certain times, but mostly available.

At approximately 08:40 something happens, and a lot of servers are
flagged as dead or zombie. This could of course have been caused by
network problems, but there was no such problem at this time. Proxying
goes over the same interface as the rest of the traffic, and non-proxied
authentication and accounting continued to work without problems. The
home servers are in a number of different networks, and any network
incident taking them all out would be very visible.

The server was restarted at 09:35 and you'll see that only the usual
suspects are logged as zombies after this.

>> But I will test that now, starting with the stable branch from
>> git.freeradius.org, commit d7b4f003477644978f3fefa694305dce9b5dc8bf,
>> which was the last point where things seemed to work
>
> BTW you could probably do a git bisect.

Yes, if I can verify the good versions... As I said, I'm not entirely
sure that there actually was a good version. And it looks like each
positive test will have to take 3+ days. Unless I can find out what
triggers this.


Bjørn
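
Since each test run may take days, it could help to snapshot the home server states periodically and compare them over time. The following is only a minimal sketch in Python: it assumes the five-column layout of the listing quoted above (address, port, type, state, and a trailing counter whose meaning is not interpreted here), and the names `parse_home_servers` and `state_counts` are hypothetical helpers, not part of FreeRADIUS or radmin.

```python
from collections import Counter

# Sample input copied from the obfuscated listing quoted above,
# including the shell prompt line, which the parser should skip.
SAMPLE = """\
server(bjorn) ~ 71$ radmin -e "show home_server list"
192.168.8.120 1812 auth alive 0
192.168.8.246 1812 auth alive 0
192.168.8.132 1645 auth dead 0
192.168.8.132 1645 auth dead 3
192.168.8.14 1812 auth alive 0
192.168.8.10 1812 auth alive 0
192.168.8.210 1812 auth alive 0
192.168.8.50 1812 auth zombie 0
192.168.8.20 1812 auth zombie 0
"""


def parse_home_servers(text):
    """Parse lines assumed to look like: address port type state counter."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip prompts, blank lines, anything unexpected
        address, port, kind, state, counter = fields
        if not (port.isdigit() and counter.isdigit()):
            continue  # not a data row in the assumed layout
        rows.append((address, int(port), kind, state, int(counter)))
    return rows


def state_counts(rows):
    """Count how many listed home servers are in each state."""
    return Counter(state for _, _, _, state, _ in rows)


rows = parse_home_servers(SAMPLE)
counts = state_counts(rows)
for state, n in sorted(counts.items()):
    print(state, n)
```

Run from cron against saved radmin output, a tally like this would at least show when the dead/zombie count jumps, which might narrow down what happens around the time the servers are flagged.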

