For some time we've had an intermittent problem where freeradius becomes unresponsive, consuming at least 98% of the CPU. A 'kill -TERM' will sometimes kill the daemon, but usually a 'kill -9' is needed. This always seems to happen right about when we reload the config (with a HUP), or stop and restart the daemon. We're using freeradius 1.0.5 on FreeBSD 5.5-PRERELEASE, but we also saw this problem with FR 1.0.1 on FreeBSD 5.3 and 5.4.
We recently added an authorization step (using LDAP and perl) to our Kerberos-authenticated wireless service, and this might offer some clues about the timing of the problem. A sample from our radius.log is attached.

At 14:12:04, two users (user1 and user2) connect at about the same time, are authorized by rlm_perl, and are then authenticated with Kerberos. Then, at 14:12:06, user3 connects just as the daemon gets the HUP signal. User3 is authorized but never gets authenticated; the daemon reloads, but is unresponsive until I kill and restart it at 14:30:01.

I'm guessing that the daemon can hang if it gets a signal just as the rlm_krb5 module is called. It's marked RLM_TYPE_THREAD_UNSAFE, so it gets a mutex, and attaching gdb to the hung daemon showed:

    [Switching to LWP 100171]
    0x28205940 in pthread_mutexattr_init () from /usr/lib/libpthread.so.1
    (gdb) where
    #0  0x28205940 in pthread_mutexattr_init () from /usr/lib/libpthread.so.1

Is anyone aware of any freeradius thread problems (especially related to the FreeBSD thread libraries) that might explain this? Any suggestions for avoiding the problem or tracking it down in more detail?

Thanks,

--
George C. Kaplan   [EMAIL PROTECTED]
Communication & Network Services   510-643-0496
University of California at Berkeley
Tue Mar 7 14:12:04 2006 : rlm_perl: [user1] EMPLOYEE-TYPE-ACADEMIC OK
Tue Mar 7 14:12:04 2006 : rlm_perl: [user2] STUDENT-TERM-SPRING OK
Tue Mar 7 14:12:04 2006 : Auth: Login OK: [user1] (from client wireless-gw1 port 3002 cli 000e35a87f43)
Tue Mar 7 14:12:04 2006 : Auth: Login OK: [user2] (from client wireless-gw1 port 1002 cli 00904b5de43b)
Tue Mar 7 14:12:06 2006 : rlm_perl: [user3] STUDENT-TERM-SPRING OK
Tue Mar 7 14:12:06 2006 : Info: Reloading configuration files.
Tue Mar 7 14:12:06 2006 : Info: Using deprecated naslist file. Support for this will go away soon.
Tue Mar 7 14:12:06 2006 : Info: rlm_exec: Wait=yes but no output defined. Did you mean output=none?
Tue Mar 7 14:12:06 2006 : Auth: rlm_krb5: krb5_init ok
Tue Mar 7 14:12:06 2006 : Info: rlm_passwd: nfields: 3 keyfield 0(User-Name) listable: no
Tue Mar 7 14:12:06 2006 : Info: rlm_passwd: nfields: 6 keyfield 0(User-Name) listable: no
Tue Mar 7 14:12:06 2006 : Info: rlm_passwd: nfields: 2 keyfield 0(User-Name) listable: no
Tue Mar 7 14:12:07 2006 : Info: Ready to process requests.
Tue Mar 7 14:30:01 2006 : Error: WARNING: Unresponsive child (id 136453120) for request 31420
Tue Mar 7 14:30:03 2006 : Info: Using deprecated naslist file. Support for this will go away soon.
Tue Mar 7 14:30:03 2006 : Info: rlm_exec: Wait=yes but no output defined. Did you mean output=none?
Tue Mar 7 14:30:03 2006 : Auth: rlm_krb5: krb5_init ok
Tue Mar 7 14:30:03 2006 : Info: rlm_passwd: nfields: 3 keyfield 0(User-Name) listable: no
Tue Mar 7 14:30:03 2006 : Info: rlm_passwd: nfields: 6 keyfield 0(User-Name) listable: no
Tue Mar 7 14:30:03 2006 : Info: rlm_passwd: nfields: 2 keyfield 0(User-Name) listable: no
Tue Mar 7 14:30:03 2006 : Info: Ready to process requests.
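
To check whether the pattern in the log excerpt (a user authorized by rlm_perl right before a reload, with no "Login OK" afterwards) shows up at every hang, something like the following rough Python scan of radius.log might help. This is only a sketch: the function names (parse_line, stranded_users) and the 2-second window are made up for illustration, and it assumes the "timestamp : message" layout and the rlm_perl message format seen in the sample, which come from our own perl script and may differ elsewhere.

```python
from datetime import datetime, timedelta
import re

TS_FORMAT = "%a %b %d %H:%M:%S %Y"      # e.g. "Tue Mar 7 14:12:04 2006"
USER_RE = re.compile(r"\[([^\]]+)\]")   # user name in square brackets


def parse_line(line):
    """Split 'timestamp : message' into (datetime, message); None if it does not parse."""
    stamp, sep, message = line.partition(" : ")
    if not sep:
        return None
    try:
        return datetime.strptime(stamp.strip(), TS_FORMAT), message.strip()
    except ValueError:
        return None


def stranded_users(lines, window_seconds=2):
    """Users authorized by rlm_perl within window_seconds before a reload
    who have no 'Login OK' at or after that authorization."""
    events = [e for e in map(parse_line, lines) if e is not None]
    reloads = [ts for ts, msg in events if "Reloading configuration files" in msg]

    authorized = {}  # user -> time of last rlm_perl line
    logged_in = {}   # user -> time of last "Login OK" line
    for ts, msg in events:
        match = USER_RE.search(msg)
        if match is None:
            continue
        user = match.group(1)
        if msg.startswith("rlm_perl:"):
            authorized[user] = ts
        elif msg.startswith("Auth: Login OK"):
            logged_in[user] = ts

    window = timedelta(seconds=window_seconds)
    stranded = []
    for user, ts in authorized.items():
        done = logged_in.get(user)
        if done is not None and done >= ts:
            continue  # authentication completed normally
        if any(timedelta(0) <= reload_ts - ts <= window for reload_ts in reloads):
            stranded.append(user)
    return sorted(stranded)


# A few lines taken from the excerpt above.
SAMPLE = [
    "Tue Mar 7 14:12:04 2006 : rlm_perl: [user1] EMPLOYEE-TYPE-ACADEMIC OK",
    "Tue Mar 7 14:12:04 2006 : Auth: Login OK: [user1] (from client wireless-gw1 port 3002 cli 000e35a87f43)",
    "Tue Mar 7 14:12:06 2006 : rlm_perl: [user3] STUDENT-TERM-SPRING OK",
    "Tue Mar 7 14:12:06 2006 : Info: Reloading configuration files.",
]

print(stranded_users(SAMPLE))  # prints ['user3']
```

Running stranded_users over the lines of a full radius.log (for example, list(open(path)) for whatever path the log lives at) would list the users caught in this state. It only shows the timing correlation; it says nothing about which thread holds the mutex.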