For some time we've had an intermittent problem where freeradius becomes
unresponsive, consuming at least 98% of the CPU.  A 'kill -TERM' will
sometimes kill the daemon, but usually a 'kill -9' is needed.  This
always seems to happen right about when we reload the config (with a
HUP), or stop and restart the daemon.  We're using freeradius 1.0.5 on
FreeBSD 5.5-PRERELEASE, but we also saw this problem with FR 1.0.1 on
FreeBSD 5.3 and 5.4.

We recently added an authorization step (using LDAP and perl) to our
Kerberos-authenticated wireless service, and this might offer some clues
about the timing of the problem.  A sample from our radius.log is
attached.  At 14:12:04, two users (user1 and user2) connect at about the
same time, are authorized by rlm_perl, and then authenticated with
Kerberos.  Then, at 14:12:06, user3 connects just as the daemon gets the
HUP signal.  User3 is authorized, but never gets authenticated; the
daemon reloads, but is unresponsive until I kill and restart it at
14:30:01.

I'm guessing that the daemon can hang if it gets a signal just as the
rlm_krb5 module is called.  It's marked RLM_TYPE_THREAD_UNSAFE, so the
server serializes calls to it with a mutex, and attaching gdb to the
hung daemon showed:

[Switching to LWP 100171]
0x28205940 in pthread_mutexattr_init () from /usr/lib/libpthread.so.1
(gdb) where
#0  0x28205940 in pthread_mutexattr_init () from /usr/lib/libpthread.so.1
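
To make the guess concrete, here is a minimal sketch of the pattern I
have in mind.  It is not FreeRADIUS source: module_lock, on_hup,
call_unsafe_module and reload_if_requested are all made-up names, and
the "module call" is just a counter.  It only illustrates one common
way to keep a reload from racing with a mutex-protected module call:
the signal handler does nothing but set a flag, and the reload itself
takes the same mutex from normal (non-handler) context.

```c
#include <pthread.h>
#include <signal.h>

/* Hypothetical stand-ins; none of these names come from FreeRADIUS. */
static pthread_mutex_t module_lock = PTHREAD_MUTEX_INITIALIZER;
static volatile sig_atomic_t reload_requested = 0;
static int config_generation = 0;

/* Signal handler: only records that a HUP arrived.  It takes no
 * locks and touches no module state. */
void on_hup(int signo)
{
    (void)signo;
    reload_requested = 1;
}

/* Every call into the thread-unsafe module goes through the mutex.
 * Returns the config generation the call ran under. */
int call_unsafe_module(const char *user)
{
    int generation;

    (void)user;
    pthread_mutex_lock(&module_lock);
    generation = config_generation;  /* stands in for the real call */
    pthread_mutex_unlock(&module_lock);
    return generation;
}

/* Called from the main loop, never from the handler.  Taking the
 * same mutex means the reload waits for any in-flight module call
 * instead of changing state underneath it.  Returns 1 if a reload
 * was performed, 0 otherwise. */
int reload_if_requested(void)
{
    if (!reload_requested)
        return 0;
    reload_requested = 0;

    pthread_mutex_lock(&module_lock);
    config_generation++;             /* stands in for the real reload */
    pthread_mutex_unlock(&module_lock);
    return 1;
}
```

A caller would install the handler with signal(SIGHUP, on_hup) and
invoke reload_if_requested() once per pass of its main loop.  Whether
the 1.0.x server already does something equivalent is exactly what I
don't know.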

Is anyone aware of any freeradius thread problems (especially related to
the FreeBSD thread libraries) that might explain this?  Any suggestions
for avoiding the problem or tracking it down in more detail?

Thanks,

-- 
George C. Kaplan                            [EMAIL PROTECTED]
Communication & Network Services            510-643-0496
University of California at Berkeley
Tue Mar  7 14:12:04 2006 : rlm_perl: [user1] EMPLOYEE-TYPE-ACADEMIC OK
Tue Mar  7 14:12:04 2006 : rlm_perl: [user2] STUDENT-TERM-SPRING OK
Tue Mar  7 14:12:04 2006 : Auth: Login OK: [user1] (from client wireless-gw1 port 3002 cli 000e35a87f43)
Tue Mar  7 14:12:04 2006 : Auth: Login OK: [user2] (from client wireless-gw1 port 1002 cli 00904b5de43b)
Tue Mar  7 14:12:06 2006 : rlm_perl: [user3] STUDENT-TERM-SPRING OK
Tue Mar  7 14:12:06 2006 : Info: Reloading configuration files.
Tue Mar  7 14:12:06 2006 : Info: Using deprecated naslist file.  Support for this will go away soon.
Tue Mar  7 14:12:06 2006 : Info: rlm_exec: Wait=yes but no output defined. Did you mean output=none?
Tue Mar  7 14:12:06 2006 : Auth: rlm_krb5: krb5_init ok
Tue Mar  7 14:12:06 2006 : Info: rlm_passwd: nfields: 3 keyfield 0(User-Name) listable: no
Tue Mar  7 14:12:06 2006 : Info: rlm_passwd: nfields: 6 keyfield 0(User-Name) listable: no
Tue Mar  7 14:12:06 2006 : Info: rlm_passwd: nfields: 2 keyfield 0(User-Name) listable: no
Tue Mar  7 14:12:07 2006 : Info: Ready to process requests.
Tue Mar  7 14:30:01 2006 : Error: WARNING: Unresponsive child (id 136453120) for request 31420
Tue Mar  7 14:30:03 2006 : Info: Using deprecated naslist file.  Support for this will go away soon.
Tue Mar  7 14:30:03 2006 : Info: rlm_exec: Wait=yes but no output defined. Did you mean output=none?
Tue Mar  7 14:30:03 2006 : Auth: rlm_krb5: krb5_init ok
Tue Mar  7 14:30:03 2006 : Info: rlm_passwd: nfields: 3 keyfield 0(User-Name) listable: no
Tue Mar  7 14:30:03 2006 : Info: rlm_passwd: nfields: 6 keyfield 0(User-Name) listable: no
Tue Mar  7 14:30:03 2006 : Info: rlm_passwd: nfields: 2 keyfield 0(User-Name) listable: no
Tue Mar  7 14:30:03 2006 : Info: Ready to process requests.
