Re: Freeradius is not restarting properly (fails to quit and becomes a zombie process)
Jason Wittlin-Cohen [EMAIL PROTECTED] wrote:
> I have discovered the root of the problem. When I enable the check_cert_cn = %{User-Name} option in eap.conf and successfully authenticate one user, a restart or stop of the radiusd service leads to a zombie process which needs to be killed with kill -9. If this option is disabled, as is the default setting, radiusd can be restarted normally without issue. This issue does not occur if either a) no users have attempted to authenticate, or b) users have authenticated but were rejected.
>
> Is this a known issue?

No. It's *very* weird. I'll try to take a look at it this week.

Alan DeKok.
--
http://deployingradius.com - The web site of the book
http://deployingradius.com/blog/ - The blog
-
List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: Freeradius is not restarting properly (fails to quit and becomes a zombie process)
Alan DeKok wrote:
> Jason Wittlin-Cohen [EMAIL PROTECTED] wrote:
>> Over the last few days I've been having a recurring problem. Whenever I start FreeRADIUS, either with radiusd in a terminal or as a service in Debian, I cannot restart/kill radiusd properly if it's authenticated any clients. Restarting the service says it's successful but the radius log states that port 1812 is already in use. "top" shows 100% CPU usage
>
> It looks like http://bugs.freeradius.org/show_bug.cgi?id=365
>
> The solution is to not re-initialize the modules on HUP. It works in *most* cases, because the code handling the HUP tries to wait until all of the modules have stopped. But if your back-end DBs are slow, it doesn't have much choice but to proceed with handling the HUP. Most people don't see it because the modules respond quickly.
>
> I'd say the first step to a work-around is to make sure none of the modules you're using are blocking the server.
>
> Alan DeKok.

I have discovered the root of the problem. When I enable the "check_cert_cn = %{User-Name}" option in eap.conf and successfully authenticate one user, a restart or stop of the radiusd service leads to a zombie process which needs to be killed with "kill -9". If this option is disabled, as is the default setting, radiusd can be restarted normally without issue. This issue does not occur if either a) no users have attempted to authenticate, or b) users have authenticated but were rejected.

Is this a known issue?

Jason Wittlin-Cohen
Re: Freeradius is not restarting properly (fails to quit and becomes a zombie process)
Jason Wittlin-Cohen [EMAIL PROTECTED] wrote:
> Over the last few days I've been having a recurring problem. Whenever I start FreeRADIUS, either with radiusd in a terminal or as a service in Debian, I cannot restart/kill radiusd properly if it's authenticated any clients. Restarting the service says it's successful but the radius log states that port 1812 is already in use. top shows 100% CPU usage

It looks like http://bugs.freeradius.org/show_bug.cgi?id=365

The solution is to not re-initialize the modules on HUP. It works in *most* cases, because the code handling the HUP tries to wait until all of the modules have stopped. But if your back-end DBs are slow, it doesn't have much choice but to proceed with handling the HUP. Most people don't see it because the modules respond quickly.

I'd say the first step to a work-around is to make sure none of the modules you're using are blocking the server.

Alan DeKok.
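The "wait until all of the modules have stopped, then proceed anyway" behaviour described in this message can be illustrated with a small Python sketch. This is not FreeRADIUS code: `wait_for_workers` and `slow_backend` are hypothetical names, and threads stand in for modules blocked on a slow back-end database. It only shows why a bounded wait can run out while a worker is still busy.

```python
import threading
import time


def wait_for_workers(workers, timeout):
    """Wait up to `timeout` seconds in total for all worker threads.

    Returns the workers that are still running once the deadline passes.
    The caller then has to decide whether to proceed anyway.
    """
    deadline = time.monotonic() + timeout
    for worker in workers:
        remaining = deadline - time.monotonic()
        if remaining > 0:
            # Block on this worker only for the time left in the budget
            worker.join(remaining)
    return [w for w in workers if w.is_alive()]


def slow_backend(seconds, stop_event):
    """Hypothetical stand-in for a module blocked on a slow database."""
    # Returns after `seconds`, or earlier if stop_event is set
    stop_event.wait(seconds)
```

If `wait_for_workers` returns a non-empty list, some worker is still in the middle of a request, which is the situation the message describes for slow back ends.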
Re: Freeradius is not restarting properly (fails to quit and becomes a zombie process)
Jason Wittlin-Cohen wrote:
> Over the last few days I've been having a recurring problem. Whenever I start FreeRADIUS, either with radiusd in a terminal or as a service in Debian, I cannot restart/kill radiusd properly if it's authenticated any clients. Restarting the service says it's successful but the radius log states that port 1812 is already in use. top shows 100% CPU usage after I attempt to restart radiusd. In addition, kill will not work. I need to use kill -9. No errors are thrown when I try to kill it in debug mode either. It just says exiting and sits there but doesn't die.

Howdy Jason,

Might you get any useful info by running radiusd with strace?

Cheers,
--
James Wakefield, Unix Administrator, Information Technology Services Division
Deakin University, Geelong, Victoria 3217 Australia.
Phone: 03 5227 8690 International: +61 3 5227 8690
Fax: 03 5227 8866 International: +61 3 5227 8866
E-mail: [EMAIL PROTECTED]
Website: http://www.deakin.edu.au
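The manual sequence reported above (a plain kill that has no effect, followed by kill -9) can be sketched in Python as a stop routine that escalates after a grace period. This is only an illustration under stated assumptions: `stop_process` is a hypothetical helper, it works only on a child started through `subprocess.Popen`, and it is not how the Debian init script or radiusd itself handles shutdown.

```python
import subprocess


def stop_process(proc, grace=5.0):
    """Ask a child process to exit, and force it if it does not.

    Sends SIGTERM first (what a plain `kill` does), waits up to `grace`
    seconds, then falls back to SIGKILL (`kill -9`).
    Returns "terminated" or "killed" to show which step was needed.
    """
    proc.terminate()  # SIGTERM on POSIX systems
    try:
        proc.wait(timeout=grace)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX systems
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"
```

A return value of "killed" corresponds to the case in this thread, where the process stays alive after the normal termination request.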
Re: Freeradius is not restarting properly (fails to quit and becomes a zombie process)
select(5, [3 4], NULL, NULL, {6, 0}) = 1 (in [3], left {5, 992000})
time(NULL) = 1159497421
recvfrom(3, \1\1\0\227\247\326\245\\\207\222(\352H\305\311\213\300..., 4096, 0, {sa_family=AF_INET, sin_port=htons(2054), sin_addr=inet_addr(192.168.0.1)}, [16]) = 151
write(1, rad_recv: Access-Request packet ..., 77rad_recv: Access-Request packet from host 192.168.0.1:2054, id=1, length=151 ) = 77
time(NULL) = 1159497421
write(1, \tUser-Name = \Jason Wittlin-Cohe..., 35User-Name = Jason Wittlin-Cohen ) = 35
write(1, \tNAS-IP-Address = 192.168.0.1\n, 30 NAS-IP-Address = 192.168.0.1 ) = 30
write(1, \tCalled-Station-Id = \00160112eb..., 36 Called-Station-Id = 00160112ebda ) = 36
write(1, \tCalling-Station-Id = \00095b934..., 37 Calling-Station-Id = 00095b93459e ) = 37
write(1, \tNAS-Identifier = \00160112ebda\..., 33 NAS-Identifier = 00160112ebda ) = 33
write(1, \tNAS-Port = 8\n, 14 NAS-Port = 8 ) = 14
write(1, \tFramed-MTU = 1400\n, 19Framed-MTU = 1400 ) = 19
write(1, \tState = 0x8570d74429dcf8507949a..., 44 State = 0x8570d74429dcf8507949ae638bd52940 ) = 44
write(1, \tNAS-Port-Type = Wireless-802.11..., 33 NAS-Port-Type = Wireless-802.11 ) = 33
write(1, \tEAP-Message = 0x020800060d00\n, 30 EAP-Message = 0x020800060d00 ) = 30
write(1, \tMessage-Authenticator = 0xb781d..., 60 Message-Authenticator = 0xb781dd8563450fa51bff3ce9be35dac3 ) = 60
time(NULL) = 1159497421
write(1, Processing the authorize secti..., 51 Processing the authorize section of radiusd.conf ) = 51
time(NULL) = 1159497421
write(1, modcall: entering group authoriz..., 48modcall: entering group authorize for request 8 ) = 48
time(NULL) = 1159497421
write(1, modcall[authorize]: module \pr..., 67 modcall[authorize]: module preprocess returns ok for request 8 ) = 67
time(NULL) = 1159497421
write(1, modcall[authorize]: module \ch..., 63 modcall[authorize]: module chap returns noop for request 8 ) = 63
time(NULL) = 1159497421
write(1, modcall[authorize]: module \ms..., 65 modcall[authorize]: module mschap returns noop for request 8 ) = 65
time(NULL) = 1159497421
write(1, rlm_realm: No \'@\' in User-Na..., 82rlm_realm: No '@' in User-Name = Jason Wittlin-Cohen, looking up realm NULL ) = 82
time(NULL) = 1159497421
time(NULL) = 1159497421
write(1, rlm_realm: No such realm \NU..., 36rlm_realm: No such realm NULL ) = 36
time(NULL) = 1159497421
write(1, modcall[authorize]: module \su..., 65 modcall[authorize]: module suffix returns noop for request 8 ) = 65
time(NULL) = 1159497421
write(1, rlm_eap: EAP packet type respo..., 50 rlm_eap: EAP packet type response id 8 length 6 ) = 50
time(NULL) = 1159497421
write(1, rlm_eap: No EAP Start, assumin..., 68 rlm_eap: No EAP Start, assuming it's an on-going EAP conversation ) = 68
time(NULL) = 1159497421
write(1, modcall[authorize]: module \ea..., 65 modcall[authorize]: module eap returns updated for request 8 ) = 65
time(NULL) = 1159497421
write(1, users: Matched entry Jason W..., 56users: Matched entry Jason Wittlin-Cohen at line 96 ) = 56
time(NULL) = 1159497421
write(1, modcall[authorize]: module \fi..., 62 modcall[authorize]: module files returns ok for request 8 ) = 62
time(NULL) = 1159497421
write(1, modcall: leaving group authorize..., 65modcall: leaving group authorize (returns updated) for request 8 ) = 65
time(NULL) = 1159497421
write(1, rad_check_password: Found Aut..., 43 rad_check_password: Found Auth-Type EAP ) = 43
time(NULL) = 1159497421
write(1, auth: type \EAP\\n, 17auth: type EAP ) = 17
time(NULL) = 1159497421
write(1, Processing the authenticate se..., 54 Processing the authenticate section of radiusd.conf ) = 54
time(NULL) = 1159497421
write(1, modcall: entering group authenti..., 51modcall: entering group authenticate for request 8 ) = 51
time(NULL) = 1159497421
write(1, rlm_eap: Request found, releas..., 49 rlm_eap: Request found, released from the list ) = 49
time(NULL) = 1159497421
write(1, rlm_eap: EAP/tls\n, 19 rlm_eap: EAP/tls ) = 19
time(NULL) = 1159497421
write(1, rlm_eap: processing type tls\n, 31 rlm_eap: processing type tls ) = 31
time(NULL) = 1159497421
write(1, rlm_eap_tls: Authenticate\n, 28 rlm_eap_tls: Authenticate ) = 28
time(NULL)