Alan DeKok <[email protected]> writes:
> Bjørn Mork wrote:
>> I am now seeing this very same problem, and strongly suspect it to be
>> related to dead proxy home servers.  I was able to provoke the "Exiting
>> normally" on a server with *no* traffic at all, by doing a couple of
>> requests for a realm with dead home servers and then waiting:
>>
>> Wed Nov 25 18:03:56 2009 : Error: PROXY: Marking home server 88.a.b.158
>> port 1812 as zombie (it looks like it is dead).
>> Wed Nov 25 18:04:35 2009 : Error: PROXY: Marking home server 84.c.d.222
>> port 1812 as zombie (it looks like it is dead).
>> Wed Nov 25 19:38:13 2009 : Info: Exiting normally.
>>
>> No requests at all were sent to this server between the two last log
>> lines.
>
> Hmm... the "exiting normally" means that it received a signal to exit
> (internal or external).  Otherwise, it just keeps running.
>
> Try using gdb, and:
>
> (gdb) break event_loop_exit
> (gdb) break radius_signal_self
> (gdb) cond 1 (flag == 2)
>
> (gdb) run
>
> And then when it stops:
>
> (gdb) thread apply all bt full
>
> That *should* catch the stack trace where it exits.

Will do.  Thanks.

>> I was planning to use the 2.1.7 release, but hit the recursive mutex
>> problem.
>
> Ugh.  Some systems don't support recursive mutexes, and even better,
> don't complain when you try to use them!
>
>> Now, adding the two facts, I'm starting to wonder whether the
>> "Exiting normally" bug might be related to the fix for the recursive
>> mutexes?  They are both related to dead home servers.  Makes me
>> suspicious...
>
> Quite possibly, yes.  But the fact that it exits a minute and a half
> after the last packet is odd.

Note that it's an hour and a half.  Which I guess is even more odd.

These are today's events for the server which is in production:

server ~ 1004$ grep Exit log/radius.log
Thu Nov 26 02:08:20 2009 : Info: Exiting normally.
Thu Nov 26 04:16:52 2009 : Info: Exiting normally.
Thu Nov 26 05:52:20 2009 : Info: Exiting normally.
Thu Nov 26 07:40:19 2009 : Info: Exiting normally.

Notice the pattern.  There's 1.5 ~ 2 hours between each restart.


Bjørn
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
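
To put numbers on the pattern, the gaps between the "Exiting normally" lines can be computed directly from the timestamps quoted in this message. This is only a small sketch: the helper name `restart_gaps` and the format string are my own choices for illustration, not anything from FreeRADIUS, and it assumes an English/C locale so that the day and month abbreviations parse.

```python
from datetime import datetime

# Timestamp layout used in the radius.log lines quoted in this message.
# Assumption: the process runs in an English/C locale, so "Thu" and "Nov" parse.
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"

# Timestamps copied from the "Exiting normally" lines of the grep output.
EXIT_TIMES = [
    "Thu Nov 26 02:08:20 2009",
    "Thu Nov 26 04:16:52 2009",
    "Thu Nov 26 05:52:20 2009",
    "Thu Nov 26 07:40:19 2009",
]


def restart_gaps(stamps):
    """Return the time differences between consecutive log timestamps."""
    times = [datetime.strptime(s, LOG_TIME_FORMAT) for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = restart_gaps(EXIT_TIMES)
for gap in gaps:
    print(gap)

# Gap between the last "zombie" line and the exit on the idle test server.
idle_gap = restart_gaps(["Wed Nov 25 18:04:35 2009", "Wed Nov 25 19:38:13 2009"])[0]
print(idle_gap)
```

The production gaps come out at roughly 1.6 to 2.1 hours, and the gap on the idle test server at a little over an hour and a half, which is in line with the figures given above.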

