AW: AW: AW: EAP-TLS Failed in handler question
Hi!

Phil, thanks again for your help. According to Extreme, the bug has been fixed in summitX-15.2.2.7-patch1-2:

PD4-3163943281: 802.1x re-authentication fails when EAP ID reaches 255.

This version also fixes a bug we reported, which is related to 802.1x:

PD4-3271740739: While using Dot1x and MAC-based netlogin on the same port, the MAC reauthentication timer should stop after the client is authenticated with dot1x credentials.

-----Original Message-----
From: freeradius-users-bounces+robert.penz=tirol.gv...@lists.freeradius.org [mailto:freeradius-users-bounces+robert.penz=tirol.gv...@lists.freeradius.org] On Behalf Of PENZ Robert
Sent: Tuesday, 11 December 2012 16:30
To: FreeRadius users mailing list
Subject: AW: AW: AW: EAP-TLS Failed in handler question

Hi! Phil,

Really BIG THANKS for your help! I'll talk to Extreme Networks.

Robert

- List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
Re: AW: AW: EAP-TLS Failed in handler question
On 12/10/2012 08:00 PM, PENZ Robert wrote:
> @PhilMayers: Did you get the Mail with the full logfile? do you need more?

I did, but honestly I prioritise personal help emails lower than ones to the list, sorry. I'll see if I have time to look today.
Re: AW: AW: EAP-TLS Failed in handler question
On 10/12/12 20:00, PENZ Robert wrote:
> @PhilMayers: Did you get the Mail with the full logfile? do you need more?

Ok, your NAS is buggy I'm afraid. In some small percentage of cases, it is not handling the wrapping of EAP id values from 255 to 0.

The following sequence of (redacted) packets shows the problem (see line ~2389268 in your debug for this example, but there are lots of others in there):

Access-Request packet from host NAS port 54217, id=183, length=151
    User-Name = host/blah
    EAP-Message = 0x02ff...
    NAS-IP-Address = NAS
    Service-Type = Login-User
    Calling-Station-Id = MAC
    NAS-Port-Id = x:y
    NAS-Port = x00y
    NAS-Port-Type = Ethernet
    Message-Authenticator = 0x26710066ee2e161ba4979519e82cde59
...
[eap] EAP packet type response id 255 length 33
...
+- entering group EAP {...}
[eap] EAP Identity
[eap] processing type tls
[tls] Requiring client certificate
[tls] Initiate
[tls] Start returned 1
...
Sending Access-Challenge of id 183 to 10.15.132.5 port 54217
    EAP-Message = 0x01060d20
    Message-Authenticator = 0x
    State = 0xe043a0c1e043ad9227375e26b2f8cb62

Note that the access-request contains an EAP response with id=255, and we return an EAP request with id=0, having wrapped around. The NAS follows up with:

Access-Request packet from host 10.15.132.5 port 54217, id=184, length=241
    User-Name = host/blah
    EAP-Message = 0x02ff...
    NAS-IP-Address = NAS
    Service-Type = Login-User
    Calling-Station-Id = MAC
    NAS-Port-Id = x:y
    NAS-Port = x00y
    NAS-Port-Type = Ethernet
    State = 0xe043a0c1e043ad9227375e26b2f8cb62
    Message-Authenticator = 0x03a814fd68371689281f1e66a4728614
...
[eap] EAP packet type response id 255 length 105
...
rlm_eap: No EAP session matching the State variable.

That is - we send an Access-Challenge containing an EAP request id=0, the client responds with an Access-Request containing EAP response id=255. This is obviously wrong.

FreeRADIUS mixes certain data into the State value with a xor including the EAP id - that's why you're getting that particular error message, but the underlying problem is that the NAS is not always handling EAP id value wrap correctly.

I'm curious as to why the EAP id values are so large - I don't think most NASes do this, they start from id=1 on every conversation, but I don't know if it's legal.

The ID wrapping seems to work in other cases; I'm not certain, but it *may* be that it only fails if the sequence is:

C: access-request EAP-response id=255 EAP-Identity
S: access-challenge EAP-request id=0 PEAP-start
C: access-request EAP-response id=255 PEAP-data

i.e. if the initial EAP-identity is the one with id=255.

But anyway - I think your NAS is buggy. There's no way you can solve this in FreeRADIUS - you obviously can't rewrite the EAP id, so I think you'll need to open a bug report with the vendor.

There is one thing you *might* be able to do which *might* work, but it's dependent on what the NAS does - if I'm right and it's only Identity packets that don't wrap properly, you might be able to detect EAP identity packets and modify the ID and *maybe* the Extreme switch will reply in-sequence. Like so:

authorize {
    if (%{EAP-Message[0]} =~ /^0x02ff()01(.+)/) {
        # we have an EAP-identity packet id=255, see if we can force a wrap
        update request {
            EAP-Message := 0x0201%{1}01%{2}
        }
    }
}

However - I have no idea if this syntax will even work, and to be honest I'm extremely dubious that, if it does, the Extreme would respond properly.

Cheers,
Phil
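To make the id mismatch described above concrete, here is a minimal Python sketch. It assumes the server picks the next EAP identifier by adding one modulo 256 (which is what the trace suggests, but is not taken from the FreeRADIUS source), and it relies on the EAP rule that a Response must echo the Identifier of the Request it answers. The function names are made up for illustration.

```python
def next_eap_id(current: int) -> int:
    """Identifier after `current`, wrapping 255 -> 0 (the field is one octet)."""
    return (current + 1) % 256


def eap_code_and_id(eap_message_hex: str) -> tuple:
    """Return (code, identifier) from the first two octets of an EAP-Message."""
    raw = bytes.fromhex(eap_message_hex.removeprefix("0x"))
    return raw[0], raw[1]


def answers_request(request_id: int, response_hex: str) -> bool:
    """True if the payload is an EAP Response (code 2) echoing request_id."""
    code, ident = eap_code_and_id(response_hex)
    return code == 2 and ident == request_id


# The client's identity response carried id=255, so the server's next
# request wraps to id=0 (assumption: increment modulo 256).
challenge_id = next_eap_id(255)

# The follow-up response still starts 0x02ff (response, id=255, length 105),
# so it does not answer the id=0 request.
print(challenge_id, answers_request(challenge_id, "0x02ff0069"))  # prints: 0 False
```

A correctly behaving NAS would relay a response starting with 0x0200 here, for which `answers_request(0, ...)` returns True.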
AW: AW: AW: EAP-TLS Failed in handler question
Hi! Phil, Really BIG THANKS for your help! I'll talk to Extreme Networks. Robert - List info/subscribe/unsubscribe? See http://www.freeradius.org/list/users.html
RE: AW: AW: EAP-TLS Failed in handler question
@PhilMayers: Did you get the Mail with the full logfile? do you need more?

Kind regards,
Robert Penz

Dipl. Inf. Robert Penz
DVT-Daten-Verarbeitung-Tirol GmbH
Adamgasse 22, 6020 Innsbruck
Tel: +43 512 508 3334 / Fax: +43 512 508 3355
eMail: robert.p...@tirol.gv.at

From: freeradius-users-bounces+robert.penz=tirol.gv...@lists.freeradius.org [freeradius-users-bounces+robert.penz=tirol.gv...@lists.freeradius.org] On Behalf Of PENZ Robert [robert.p...@tirol.gv.at]
Sent: Wednesday, December 05, 2012 8:32 AM
To: FreeRadius users mailing list
Subject: AW: AW: AW: EAP-TLS Failed in handler question

>> There is no other packet between these two and only 5 seconds between them; the server has not been restarted.
> Weird. But we need the *full* debug please!

Some special option, or the full log file? I'll send you the latter in a private mail.

Robert
AW: AW: EAP-TLS Failed in handler question
Hi!

I still wasn't able to get a trace on the client side, but I believe these debug log entries should help. This time I got the start packet; the second packet reaches the RADIUS server within a few seconds, and the State variable seems to be the same.

Ready to process requests.
rad_recv: Access-Request packet from host 10.xx.xx.5 port 54217, id=11, length=152
    User-Name = host/x.local
    EAP-Message = 0x02ff002101686f73742f4456542d303039363832322e7469726f6c2e6c6f63616c
    NAS-IP-Address = 10.xx.xx.5
    Service-Type = Login-User
    Calling-Station-Id = xx-xx-xx-xx-xx-xx
    NAS-Port-Id = 1:29
    NAS-Port = 1029
    NAS-Port-Type = Ethernet
    Message-Authenticator = 0xd080844ef3e47a9bc21e8c848b5a8548
..
[eap] EAP packet type response id 255 length 33
[eap] No EAP Start, assuming it's an on-going EAP conversation
+++[eap] returns updated
++- else else returns updated
Found Auth-Type = EAP
# Executing group from file /etc/raddb/sites-enabled/default
+- entering group EAP {...}
[eap] EAP Identity
[eap] processing type tls
[tls] Requiring client certificate
[tls] Initiate
[tls] Start returned 1
..
Sending Access-Challenge of id 11 to 10.xx.xx.5 port 54217
    EAP-Message = 0x01060d20
    Message-Authenticator = 0x
    State = 0x642534cc642539e20b4be1e3ae0328c0
Finished request 62603.
Going to the next request
Waking up in 4.9 seconds.
rad_recv: Access-Request packet from host 10.xx.xx.5 port 54217, id=12, length=242
    User-Name = host/x.tirol.local
    EAP-Message = 0x02ff00690d80005f160301005a0156030150bd9377fb696c9f5eaedc568220f9aa35ab65930cf2232f4131c054b056295418002f00350005000ac013c014c009c00a00320038001300040115ff0100010a0006000400170018000b00020100
    NAS-IP-Address = 10.xx.xx.5
    Service-Type = Login-User
    Calling-Station-Id = xx-xx-xx-xx-xx-xx
    NAS-Port-Id = 1:29
    NAS-Port = 1029
    NAS-Port-Type = Ethernet
    State = 0x642534cc642539e20b4be1e3ae0328c0
    Message-Authenticator = 0xeada93f9da1ca47a6f0325e8ad0414a9
...
[eap] EAP packet type response id 255 length 105
[eap] No EAP Start, assuming it's an on-going EAP conversation
+++[eap] returns updated
++- else else returns updated
Found Auth-Type = EAP
# Executing group from file /etc/raddb/sites-enabled/default
+- entering group EAP {...}
rlm_eap: No EAP session matching the State variable.
[eap] Either EAP-request timed out OR EAP-response to an unknown EAP-request
[eap] Failed in handler
++[eap] returns invalid

There is no other packet between these two and only 5 seconds between them; the server has not been restarted.

Robert

-----Original Message-----
From: freeradius-users-bounces+robert.penz=tirol.gv...@lists.freeradius.org [mailto:freeradius-users-bounces+robert.penz=tirol.gv...@lists.freeradius.org] On Behalf Of PENZ Robert
Sent: Tuesday, 27 November 2012 17:38
To: FreeRadius users mailing list
Subject: AW: AW: EAP-TLS Failed in handler question

>> With first packet I meant first packet the radius server saw in some time ... the switch forces a re-authentication every 2h
> A re-auth is a fresh EAP session. So even on a re-auth, the first packet would not have a State attribute, absent software bugs.

ok

>>> It *could* be that the client just got stuck and is responding (very) late. But I'm quite surprised the NAS didn't timeout the EAP auth before that.
>> We're running Extreme Networks Switches with following timers set:
>> configure netlogin dot1x timers quiet-period 30
>> configure netlogin dot1x timers reauth-period 7200
> We run SummitX edge, and when I've tested dot1x netlogin in the past, I haven't seen this issue. We've never widely deployed it, however, so it's possible there's an XOS bug where a small percentage of re-auths erroneously re-use the State. You'd need to get a packet capture to be sure.

ok ... will try to get one .. is not easy ...

>> but reject means the switch sets the port to the guest vlan, and therefore the PC loses the connections ... is there a way to request a new full eap/tls handshake from the client?
> You're not understanding, or I'm not making myself clear.
> Suggestion: fire up wireshark, and take a careful look at a normal EAP authentication. You'll see that the first packet is an EAP-Identity without a State attribute, which the server responds to with an Access-Challenge containing the default eap type start payload, and a State attribute.
> Are you *absolutely sure* that these packets are really the first RADIUS packet in the auth/re-auth?

will check again and get back to you

> If you're sure, your problem seems to be that the correct first packet isn't being sent; the switch is just jumping straight in with the EAP payload *and* a State attribute
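The observation above (same State in the challenge and the follow-up request, yet no matching session) can be checked mechanically across a large debug log. The following Python sketch scans `radiusd -X` style text for an Access-Challenge followed by an Access-Request that reflects the same State but carries a different EAP id. It is a rough illustration only: the function names are invented, it pairs adjacent packets and so ignores interleaved conversations from other clients, and the sample challenge payload is written out as a full six-octet EAP-TLS start with id=0, which is an assumption rather than a copy of the logged value.

```python
import re

PACKET_START = re.compile(r"rad_recv: Access-Request|Sending Access-Challenge")
STATE = re.compile(r"\bState = (0x[0-9a-fA-F]+)")
# First two octets of the EAP payload: code, then identifier.
EAP_HEADER = re.compile(r"\bEAP-Message = 0x([0-9a-fA-F]{2})([0-9a-fA-F]{2})")


def packets(log_text):
    """Split the log at packet boundaries and pull out State and EAP id."""
    starts = list(PACKET_START.finditer(log_text))
    result = []
    for i, match in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(log_text)
        chunk = log_text[match.start():end]
        state = STATE.search(chunk)
        eap = EAP_HEADER.search(chunk)
        result.append({
            "kind": "request" if match.group().startswith("rad_recv") else "challenge",
            "state": state.group(1) if state else None,
            "eap_id": int(eap.group(2), 16) if eap else None,
        })
    return result


def id_mismatches(log_text):
    """List (state, challenge_id, response_id) where the ids disagree."""
    found = []
    seq = packets(log_text)
    for prev, cur in zip(seq, seq[1:]):
        if (prev["kind"] == "challenge" and cur["kind"] == "request"
                and prev["state"] is not None
                and prev["state"] == cur["state"]
                and prev["eap_id"] != cur["eap_id"]):
            found.append((prev["state"], prev["eap_id"], cur["eap_id"]))
    return found


# Sample modelled on the log above; the challenge payload is an assumption.
sample = """
rad_recv: Access-Request packet from host 10.0.0.5 port 54217, id=11, length=152
    EAP-Message = 0x02ff0021
Sending Access-Challenge of id 11 to 10.0.0.5 port 54217
    EAP-Message = 0x010000060d20
    State = 0x642534cc642539e20b4be1e3ae0328c0
rad_recv: Access-Request packet from host 10.0.0.5 port 54217, id=12, length=242
    EAP-Message = 0x02ff0069
    State = 0x642534cc642539e20b4be1e3ae0328c0
"""
print(id_mismatches(sample))
```

On a real log one would read the file contents and pass them to `id_mismatches`; an empty list means no adjacent challenge/request pair showed the symptom.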
Re: AW: AW: EAP-TLS Failed in handler question
On 12/04/2012 03:59 PM, PENZ Robert wrote:
> There is no other packet between these two and only 5 seconds between them; the server has not been restarted.

Weird. But we need the *full* debug please!
AW: AW: AW: EAP-TLS Failed in handler question
>> There is no other packet between these two and only 5 seconds between them; the server has not been restarted.
> Weird. But we need the *full* debug please!

Some special option, or the full log file? I'll send you the latter in a private mail.

Robert
AW: AW: EAP-TLS Failed in handler question
>> With first packet I meant first packet the radius server saw in some time ... the switch forces a re-authentication every 2h
> A re-auth is a fresh EAP session. So even on a re-auth, the first packet would not have a State attribute, absent software bugs.

ok

>>> It *could* be that the client just got stuck and is responding (very) late. But I'm quite surprised the NAS didn't timeout the EAP auth before that.
>> We're running Extreme Networks Switches with following timers set:
>> configure netlogin dot1x timers quiet-period 30
>> configure netlogin dot1x timers reauth-period 7200
> We run SummitX edge, and when I've tested dot1x netlogin in the past, I haven't seen this issue. We've never widely deployed it, however, so it's possible there's an XOS bug where a small percentage of re-auths erroneously re-use the State. You'd need to get a packet capture to be sure.

ok ... will try to get one .. is not easy ...

>> but reject means the switch sets the port to the guest vlan, and therefore the PC loses the connections ... is there a way to request a new full eap/tls handshake from the client?
> You're not understanding, or I'm not making myself clear.
> Suggestion: fire up wireshark, and take a careful look at a normal EAP authentication. You'll see that the first packet is an EAP-Identity without a State attribute, which the server responds to with an Access-Challenge containing the default eap type start payload, and a State attribute.
> Are you *absolutely sure* that these packets are really the first RADIUS packet in the auth/re-auth?

will check again and get back to you

> If you're sure, your problem seems to be that the correct first packet isn't being sent; the switch is just jumping straight in with the EAP payload *and* a State attribute. I am curious to know where it's getting that State attribute. The server source code assumes that a State attribute will be valid. There's no setting to just accept it.
> Interestingly, I see the RADIUS RFC does actually allow clients to send a previous State if you send an Access-Accept with:
> Termination-Action = RADIUS-request
> You're not doing that, are you?

no, I'm not

> No. As above, re-auths start new EAP sessions. You would only reject any EAP sessions that were in the *middle* of performing an auth, as the state would be lost across restarts. But this is a very narrow window.

So would it be best to set iptables to drop requests for 1 min, then restart the RADIUS server and remove the iptables rules? Or can I put FreeRADIUS in a mode where it does not accept new sessions, and restart it after 2 minutes, so that the switch is forced onto the other switch? Or what is the best practice to never have false rejects?
AW: EAP-TLS Failed in handler question
Hi!

First, thanks for your response.

>> My first question is, how can I decode a EAP-Message from the debug log to check if the request is itself ok.
> Wireshark, or read the EAP RFC and decode it manually (see below)

ok, I believe I got lucky and got a tcpdump trace on a client yesterday ... I need to check it, and if it is the same problem I'll provide more info.

>> Here is first packet from
> No, this is *not* the first packet, because it has a State attribute, which is only present in 2nd and subsequent packets of the EAP exchange.

With first packet I meant first packet the radius server saw in some time ... the switch forces a re-authentication every 2h

> The reason you're getting the error message is that the State attribute is unknown, so FR can't proceed with the EAP session and has no choice but to drop it. Check you haven't reduced the timer_expire value in eap.conf to a too-low value.

# A list is maintained to correlate EAP-Response
# packets with EAP-Request packets. After a
# configurable length of time, entries in the list
# expire, and are deleted.
#
timer_expire = 120

The default was 60 .. I doubled it some weeks ago, as I saw "No EAP session matching the State variable" entries in the log.

> How many FR servers do you have serving this NAS? Is it possible the NAS is sending packets in a round-robin fashion (which is bad) which is why you're seeing a packet for which you don't have State?

In this case it is only one .. we're running in pre-production with the IT department clients (about 100 clients) to make sure it is stable before rollout. But in production it will be more than one ... good point, we need to check that too before going into production.

> I guess it's possible something is mangling the State attribute from the previous packet (which is *actually* the first packet). Otherwise, the client or NAS is doing something odd.
> It *could* be that the client just got stuck and is responding (very) late. But I'm quite surprised the NAS didn't timeout the EAP auth before that.

We're running Extreme Networks Switches with following timers set:

configure netlogin dot1x timers quiet-period 30
configure netlogin dot1x timers reauth-period 7200

The following other timers are set to their default values:

server-timeout      Configure RADIUS server timeout for 802.1X
supp-resp-timeout   Configure supplicant response timeout

>> rad_recv: Access-Request packet from host 10.xxx.xxx.4 port 44519, id=151, length=244
>>     User-Name = host/x.tirol.local
>>     EAP-Message = 0x02ff00690d80005f160301005a01
> Ok so this says:
> 02 - eap response
> ff - eap ID 255 - bit odd..
> 0069 - length in hex
> 0d - eap type 13 (EAP-TLS)
> 80 - eap TLS flags = length included
> 005f - tls length
> 160301 - TLS packet 0x16==22==handshake record, version 3,1 (TLS 1.0)
> 005a - record length
> 01 - handshake=client hello

cool !!

> etc. etc. So, it's the start of an EAP-TLS exchange, but as above, it's *not* the first packet.
> If you start a tcpdump on the server, you'll see how this works:
> C: Access-Request, no state, EAP-Identity=abc
> S: Access-Challenge, state=, EAP-TLS blah
> C: Access-Request, state=, EAP-TLS blah

ok

> i.e. the NAS has to reflect the State back to FreeRADIUS on each packet. Something is interfering with that, or erasing the State at your end (a timer or restart).
>> rlm_eap: No EAP session matching the State variable
> See?

But I didn't see a reason for it ;-)

>> Invalid means I return a reject ... should I return something else?
> No.

but reject means the switch sets the port to the guest vlan, and therefore the PC loses the connections ... is there a way to request a new full eap/tls handshake from the client?

>> Is this a client problem or a misconfiguration on my part?
> It's probably a client or NAS problem, unless you've set timer_expire too low. However: I guess this could also happen right after the server is restarted. Could that be it - is a cron job restarting it maybe?

No, the server has been running for 10 days. But if I restarted the server, I would then reject all clients to the guest vlan on their next re-authentication ... that can't be the designed way.

Robert
Re: AW: EAP-TLS Failed in handler question
On 21/11/12 12:00, PENZ Robert wrote:
> With first packet I meant first packet the radius server saw in some time ... the switch forces a reauthentification every 2h

A re-auth is a fresh EAP session. So even on a re-auth, the first packet would not have a State attribute, absent software bugs.

>> It *could* be that the client just got stuck and is responding (very) late. But I'm quite surprised the NAS didn't timeout the EAP auth before that.
> We're running Extreme Networks Switches with following timers set:
> configure netlogin dot1x timers quiet-period 30
> configure netlogin dot1x timers reauth-period 7200

We run SummitX edge, and when I've tested dot1x netlogin in the past, I haven't seen this issue. We've never widely deployed it, however, so it's possible there's an XOS bug where a small percentage of re-auths erroneously re-use the State. You'd need to get a packet capture to be sure.

> but reject means the switch sets the port to the guest vlan, and therefor the PC loses the connections ... is there a way to request a new full eap/tls handshake from the client?

You're not understanding, or I'm not making myself clear.

Suggestion: fire up wireshark, and take a careful look at a normal EAP authentication. You'll see that the first packet is an EAP-Identity without a State attribute, which the server responds to with an Access-Challenge containing the default eap type start payload, and a State attribute.

Are you *absolutely sure* that these packets are really the first RADIUS packet in the auth/re-auth?

If you're sure, your problem seems to be that the correct first packet isn't being sent; the switch is just jumping straight in with the EAP payload *and* a State attribute. I am curious to know where it's getting that State attribute. The server source code assumes that a State attribute will be valid. There's no setting to just accept it.

Interestingly, I see the RADIUS RFC does actually allow clients to send a previous State if you send an Access-Accept with:

Termination-Action = RADIUS-request

You're not doing that, are you?

>>> Is this a client problem or a misconfiguration on my part?
>> It's probably a client or NAS problem, unless you've set timer_expire too low. However: I guess this could also happen right after the server is restarted. Could that be it - is a cron job restarting it maybe?
> no the server is running for 10 days but if I would restart the server I would reject all clients to the guest vlan on reauthentication after that ... that can't be the designed way.

No. As above, re-auths start new EAP sessions. You would only reject any EAP sessions that were in the *middle* of performing an auth, as the state would be lost across restarts. But this is a very narrow window.
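The points about timer_expire and restarts can be illustrated with a toy model of a State-keyed session table. This Python sketch is not how rlm_eap is implemented; the class and method names are invented, and the only behaviour it borrows from the thread is that an in-progress EAP session is looked up by the State the NAS reflects back, that entries expire after `timer_expire` seconds, and that the table lives in memory and so does not survive a restart.

```python
import os
import time


class EapSessionTable:
    """Toy in-memory table of in-progress EAP sessions, keyed by State."""

    def __init__(self, timer_expire=60.0, clock=time.monotonic):
        self.timer_expire = timer_expire
        self._clock = clock
        self._sessions = {}

    def start(self, identity):
        """First packet (no State): remember the session, hand out a State."""
        state = os.urandom(16)
        self._sessions[state] = (identity, self._clock() + self.timer_expire)
        return state

    def resume(self, state):
        """Later packet: return the identity, or None if State is unknown.

        The entry is consumed; a real server would issue a fresh State with
        its next challenge.
        """
        entry = self._sessions.pop(state, None)
        if entry is None or entry[1] < self._clock():
            return None  # "No EAP session matching the State variable"
        return entry[0]


# A fake clock makes the expiry case easy to demonstrate.
now = [0.0]
table = EapSessionTable(timer_expire=60.0, clock=lambda: now[0])

state = table.start("host/abc")
print(table.resume(state))             # reflected in time: host/abc

late = table.start("host/abc")
now[0] = 61.0
print(table.resume(late))              # reflected after timer_expire: None

print(EapSessionTable().resume(late))  # fresh table, as after a restart: None
```

Only a conversation that is mid-handshake when the table is lost is affected; a re-auth that begins afterwards calls `start` again and gets a new State, which is the narrow window described above.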
EAP-TLS Failed in handler question
Hi!

I have 802.1x (EAP-TLS) enabled on a wired network, and it works 99% of the time. Only a few authentications fail, and some minutes later the same client authenticates without a problem. As it happens only once every few days, and always with a different client, I cannot put a sniffer between the PC and the switch, since I don't know which client will be next. But I enabled debug logging on the FreeRADIUS server. The clients are Windows 7 PCs and I'm running freeradius2-2.1.12-3.el5 on RHEL5.

My first question is: how can I decode an EAP-Message from the debug log to check whether the request itself is OK? Here is the first packet from this client in some time, and it already generates the error. But the same client worked before and after it for days without a problem:

rad_recv: Access-Request packet from host 10.xxx.xxx.4 port 44519, id=151, length=244
    User-Name = host/x.tirol.local
    EAP-Message = 0x02ff00690d80005f160301005a0156030150a6115ee4ca2d9456a7fa7edad2fb1c7b221fc747eb78eb4d789ff077c48ef818002f00350005000ac013c014c009c00a00320038001300040115ff0100010a0006000400170018000b00020100
    NAS-IP-Address = 10.xxx.xxx.4
    Service-Type = Login-User
    Calling-Station-Id = xx-xx-xx-xx-xx-xx
    NAS-Port-Id = 2:3
    NAS-Port = 2003
    NAS-Port-Type = Ethernet
    State = 0x8df2b5f98df2b8eb6e43e372671f4335
    Message-Authenticator = 0x6822006f5e7cf03d00a08b04869d19d8

and the relevant other log lines:

++? if (!EAP-Message)
? Evaluating !(EAP-Message) - FALSE
++? if (!EAP-Message) - FALSE
++- entering else else {...}
[eap] EAP packet type response id 255 length 105
[eap] No EAP Start, assuming it's an on-going EAP conversation
+++[eap] returns updated
++- else else returns updated
Found Auth-Type = EAP
# Executing group from file /etc/raddb/sites-enabled/default
+- entering group EAP {...}
rlm_eap: No EAP session matching the State variable.
[eap] Either EAP-request timed out OR EAP-response to an unknown EAP-request
[eap] Failed in handler
++[eap] returns invalid

Invalid means I return a reject ... should I return something else?

Is this a client problem or a misconfiguration on my part?

Thanks for your help!

Kind regards,
Robert Penz

--
Dipl.Inf. Robert Penz
DVT - Daten-Verarbeitung-Tirol GmbH
Adamgasse 22, 6020 Innsbruck
Tel: +43 (0)512 508 3334 / Fax: +43 (0)512 508 3355
E-Mail: robert.p...@tirol.gv.at
Re: EAP-TLS Failed in handler question
On 11/19/2012 08:23 AM, PENZ Robert wrote:
> My first question is, how can I decode a EAP-Message from the debug log to check if the request is itself ok.

Wireshark, or read the EAP RFC and decode it manually (see below)

> Here is first packet from this client in some time, and it already generates the error. But the same client worked before and after it for days without a problem:

No, this is *not* the first packet, because it has a State attribute, which is only present in 2nd and subsequent packets of the EAP exchange.

The reason you're getting the error message is that the State attribute is unknown, so FR can't proceed with the EAP session and has no choice but to drop it. Check you haven't reduced the timer_expire value in eap.conf to a too-low value.

How many FR servers do you have serving this NAS? Is it possible the NAS is sending packets in a round-robin fashion (which is bad) which is why you're seeing a packet for which you don't have State?

I guess it's possible something is mangling the State attribute from the previous packet (which is *actually* the first packet). Otherwise, the client or NAS is doing something odd.

It *could* be that the client just got stuck and is responding (very) late. But I'm quite surprised the NAS didn't timeout the EAP auth before that.

> rad_recv: Access-Request packet from host 10.xxx.xxx.4 port 44519, id=151, length=244
>     User-Name = host/x.tirol.local
>     EAP-Message = 0x02ff00690d80005f160301005a01

Ok so this says:

02 - eap response
ff - eap ID 255 - bit odd..
0069 - length in hex
0d - eap type 13 (EAP-TLS)
80 - eap TLS flags = length included
005f - tls length
160301 - TLS packet 0x16==22==handshake record, version 3,1 (TLS 1.0)
005a - record length
01 - handshake=client hello

etc. etc. So, it's the start of an EAP-TLS exchange, but as above, it's *not* the first packet.

If you start a tcpdump on the server, you'll see how this works:

C: Access-Request, no state, EAP-Identity=abc
S: Access-Challenge, state=, EAP-TLS blah
C: Access-Request, state=, EAP-TLS blah

i.e. the NAS has to reflect the State back to FreeRADIUS on each packet. Something is interfering with that, or erasing the State at your end (a timer or restart).

> rlm_eap: No EAP session matching the State variable

See?

> Invalid means I return a reject ... should I return something else?

No.

> Is this a client problem or a misconfiguration on my part?

It's probably a client or NAS problem, unless you've set timer_expire too low. However: I guess this could also happen right after the server is restarted. Could that be it - is a cron job restarting it maybe?
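The manual decode above can be reproduced with a few lines of Python. This is a minimal sketch, not a full parser: the function name is made up, it only looks at the EAP header, the EAP-TLS flags and the first TLS record header, and it follows the EAP-TLS layout in which the optional TLS Message Length is a four-octet field. The hex strings as they appear in this archive look as if runs of zeros were lost somewhere along the way, so they will not parse byte-for-byte; the sample below is therefore a hand-built payload carrying the same field values as the breakdown (an assumption, not a copy of the logged packet).

```python
import struct

EAP_CODES = {1: "request", 2: "response", 3: "success", 4: "failure"}
EAP_TYPE_TLS = 13
TLS_HANDSHAKE = 22


def decode_eap_tls_prefix(eap_message_hex):
    """Decode the leading fields of an EAP-Message hex string into a dict."""
    raw = bytes.fromhex(eap_message_hex.removeprefix("0x"))
    code, ident, length = struct.unpack_from("!BBH", raw, 0)
    info = {"code": EAP_CODES.get(code, str(code)), "id": ident, "length": length}
    if len(raw) < 6:
        return info  # Success/Failure or a truncated packet: header only
    info["type"] = raw[4]
    if info["type"] != EAP_TYPE_TLS:
        return info

    flags = raw[5]
    info["length_included"] = bool(flags & 0x80)
    info["more_fragments"] = bool(flags & 0x40)
    info["start"] = bool(flags & 0x20)

    offset = 6
    if info["length_included"] and len(raw) >= offset + 4:
        (info["tls_length"],) = struct.unpack_from("!I", raw, offset)
        offset += 4

    # TLS record header: content type, version (major, minor), record length.
    if len(raw) >= offset + 5:
        rec_type, major, minor, rec_len = struct.unpack_from("!BBBH", raw, offset)
        info["record_type"] = rec_type
        info["record_version"] = (major, minor)
        info["record_length"] = rec_len
        offset += 5
        if rec_type == TLS_HANDSHAKE and len(raw) > offset:
            info["handshake_type"] = raw[offset]  # 1 = ClientHello
    return info


# Hand-built prefix: response, id 255, length 105, EAP-TLS, length-included
# flag, TLS length 95, handshake record, TLS 1.0, record length 90, ClientHello.
sample = "0x02ff00690d800000005f160301005a01"
for field, value in decode_eap_tls_prefix(sample).items():
    print(f"{field}: {value}")
```

The lengths are self-consistent: 5 octets of EAP header and type, 1 flag octet, 4 octets of TLS length and 95 octets of TLS data add up to the EAP length of 105, and the 95 octets are a 5-octet record header plus the 90-octet record. For anything beyond a quick look, Wireshark remains the easier route, as suggested above.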