Fabrice,

Thank you for the tip.  A nicely commented configuration option.  I'll make
a mental note to scroll through pf.conf.defaults before asking if a
configuration option exists.

wait_for_redirect=20 seems to be ideal for hostapd and the speediest of
fingers on the speediest of smartphones using WISPr.

The progress bar kicks in nicely and the transistion was smooth.

I will note for the list that this impacts all aspects of the WebAPI,
including the GUI unregistration, so we have removed the liklihood of any
race condition there as well with recently authorized devices being
de-registered.

I am also going to open up a ticket with hostapd upstream to see if they
can change the CoA behavior to always initiate an Access-Request after CoA
so we can get back to the instant activation which gives the impression of
'faster/better connectivity' for the end user.

cheers,
Ian



On Tue, Mar 6, 2018 at 9:46 PM, Durand fabrice via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> Hello Ian,
>
> there is a configuration parameter that can be change to raise the time
> between the deauth order and the CoA request.
>
> It's Config{'fencing'}{'wait_for_redirect'} , so in pf.conf:
>
> [fencing]
> #
> # fencing.wait_for_redirect
> #
> # How many seconds should the WebAPI sleep before actually triggering the
> VLAN change.
> # This is meant to give the device enough time to fetch the redirection
> page before
> # switching VLAN.
> wait_for_redirect = 31
>
> Regards
>
> Fabrice
>
>
>
> Le 2018-03-06 à 13:54, Ian MacDonald via PacketFence-users a écrit :
>
> Well,
>
> We easily narrowed this down to a timing issue related to the
> Access-Accept(2) messages and how long an authorization is considered valid
> by hostapd.
>
> In short, we need to make sure that the CoA from portal activation is NOT
> sent to the AP within 30 seconds of the initial connection to the
> registration VLAN.
>
> We don't think the redirect or network_detection delay is suitable for
> this as that happens after the CoA.
>
> *How do we bake in a delay between Access-Accept and CoA messages from the
> radius server for speedy WISPr clients? *
>
> Is there a portal setting that can help us here?
>
> Ideally, a "minimum_coa_delay" setting could allow packetfence to only
> issue a CoA after a minimum time period since the last Access-Request or
> Access-Accept was processed.
>
> Additionally, we are going to investigate configuration of hostapd to
> change the 30s to something smaller (which would solve most our use cases)
> and/or, having the 30s timer reset on receipt of a CoA.  Both these seem
> non-trivial vs adding this intelligence to packetfence which may solve
> similar race conditions in the wide variety of radius clients supported.
>
> Details follow,
>
> cheers,
> Ian
>
>
> Hostapd considers the last authorization valid for 30 seconds, and if a
> host disconnects and reconnects within that window, it does not check back
> with the radius server for a new authorization.   This holds true even if a
> CoA is received from the radius server;  The latter which I think is a
> behavior that could be enhanced by hostapd - after all it is called "Change
> of Authorization" request.
>
> How this is resulting the double prompt for some users, is that if a user
> is able to complete the portal activation within 30 seconds of the device
> connecting to the registration VLAN, hostapd does not issue another
> Access-Request, but instead considers the current authorization to be
> valid, and the devices remain on the registration VLAN evem after the CoA
> is sent  that should switch them to the Normal VLAN.
>
> This means you can actually re-Sign-In multiple times within that 30
> second window if you are quick (my email auto-populates on my phone making
> this easy). Each time the portal generates a CoA but returns the device to
> the registration VLAN.  At 31 seconds after the initial device connnection
> to the registration VLAN, any subsequent CoA results in the correct VLAN
> being assigned.
>
> The 30s authorization sort of makes sense in order to deal with a client
> that is authorized and disconnects and reconnects quickly a bunch of times,
> you avoid some unnecessary radius requests back and forth, but the benefit
> is minimized since the Radius-Accounting packets are still processed with
> each disconnect.
>
> This 30 second delay also impacts de-registration.  We were able to
> confirm that if we try and de-register a client from the PF GUI within 30
> seconds of them being activated, they remain on the Normal VLAN.   So there
> is some impact here as well -> And we remember seeing this little bug a
> long time ago, so we think this has always been there.  Often testing the
> captive portals we de-register devices quickly to "try them again" and we
> used to have to hit "re-evaluate" after save .. but it turns out, only if
> we do it under 30 seconds.
>
>
>
>
> On Mon, Mar 5, 2018 at 9:21 PM, Ian MacDonald <i...@netstatz.com> wrote:
>
>> We have packetfence 7.4 instances out-of-band running on Debian 8, and
>> use the captive portal with hostapd for WiFi client access.
>>
>> Our clients register using an email source for activation.
>>
>> Our configuration uses all captive_portal defaults except for the network
>> detection IP, shown below.
>>
>> network_detection_initial_delay = 5s
>> network_detection_retry_delay = 2s
>> network_redirect_delay = 20s
>> secure_redirect=enabled
>> detection_mecanism_bypass = disabled
>> wispr_redirection = enabled
>>
>> At some point in the past I recall our mobile clients used to see the
>> scrollbar for many seconds (10+?), which I believe was redirect_delay being
>> interrupted by network_detection; this would occur immediately after
>> submitting an email for activation.
>>
>> Now on PF 7.4.0, hostapd 17.01.4 and with Samsung S8 clients, the delay
>> between submitting the email and activation onto the user/normal VLAN is
>> instantaneous (about 2 seconds according to our logs).   I suspect this
>> speedy interaction has something to do with WiSPr and CoA working together
>> quickly, whereas previously we were maybe only doing captive portal.
>>
>> The problem is that intermittently, the Android clients, after
>> activation, are returned to the Registration VLAN instead of the
>> Normal/User VLAN, which causes those clients to pop up the "Sign Into
>> Network" WiSPR prompt a second time.  Clicking on it again loads a quick PF
>> Network Error page which disappears as quickly as it appears, and a second
>> later the client is dumped onto the Normal/User VLAN as it should have been
>> originally.
>>
>> This is confusing for end users, and most make it through. But not
>> everyone is happy.
>>
>> Below is the working scenario, as seen in terms of tcpdump of radius
>> packets on hostapd.
>>
>> PF server is 10.2.1.2 and the hostapd daemon is 10.2.1.11
>>
>> Client Connects to WiFi Network and Prestented with Email Activation in
>> Portal
>> 12:03:58.157726 10.2.1.11.49281 > 10.2.1.2.1812: Access-Request (1)
>> 12:03:58.211386 10.2.1.2.1812 > 10.2.1.11.49281: Access-Accept (2)
>> 12:03:58.279846 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
>> 12:03:58.325525 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
>> ~30s later Client hits Activation button after entering email
>> 12:04:28.735459 10.2.1.2.60410 > 10.2.1.11.3799: Disconnect-Request (40)
>> 12:04:28.736847 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
>> 12:04:28.737166 10.2.1.11.3799 > 10.2.1.2.60410: Disconnect-ACK (41)
>> 12:04:28.762991 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
>> 12:04:30.717764 10.2.1.11.49281 > 10.2.1.2.1812: Access-Request (1)
>> 12:04:30.782158 10.2.1.2.1812 > 10.2.1.11.49281: Access-Accept (2)
>> 12:04:30.906401 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
>> 12:04:30.943903 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
>> ~2s later Client is on normal/user VLAN with "Connected" message.
>>
>> Corresponding log entries from the pf radius log showing Registration
>> (81) and Normal (76) vlans in concert with the Access-Request packets:
>>
>> Mar  5 12:03:58 pf4 auth[1284]: [mac:a0:cc:2b:15:20:97] Accepted user:
>> and returned VLAN 81
>> Mar  5 12:03:58 pf4 auth[1284]: (89) Login OK: [a0cc2b152097] (from
>> client 10.2.1.11 port 0 cli a0:cc:2b:15:20:97)
>> Mar  5 12:04:30 pf4 auth[1284]: [mac:a0:cc:2b:15:20:97] Accepted user:
>> and returned VLAN 76
>> Mar  5 12:04:30 pf4 auth[1284]: (90) Login OK: [a0cc2b152097] (from
>> client 10.2.1.11 port 0 cli a0:cc:2b:15:20:97)
>>
>> The hostapd radius logs align with PF and show the disconnect is followed
>> by the new vlan received in the Access-Accept (2) message
>>
>> Mon Mar  5 12:03:58 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: VLAN ID 81
>> Mon Mar  5 12:03:58 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: starting accounting session 628C7DBAF341039F
>> Mon Mar  5 12:04:28 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: stopped accounting session 628C7DBAF341039F
>> Mon Mar  5 12:04:30 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: VLAN ID 76
>> Mon Mar  5 12:04:30 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: starting accounting session 628C7DBAF341039F
>>
>>
>> Here is what it looks like in the buggy scenario where the user is
>> prompted twice to "Sign into Network".  It looks like after the
>> Disconnect-Request (40) to change to the normal VLAN, the hostapd does not
>> send the Access-Request (1) but just uses the VLAN from the previous
>> Access-Accept(2).
>>
>> Client Connects to WiFi Network and Prestented with Email Activation in
>> Portal
>> 15:11:35.269960 10.2.1.11.49281 > 10.2.1.2.1812: Access-Request (1)
>> 15:11:35.326104 10.2.1.2.1812 > 10.2.1.11.49281: Access-Accept (2)
>> 15:11:35.386984 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
>> 15:11:35.427652 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
>> ~22s later Client hits Activation button after entering email
>> 15:11:57.417798 10.2.1.2.58842 > 10.2.1.11.3799: Disconnect-Request (40)
>> 15:11:57.420730 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
>> 15:11:57.424759 10.2.1.11.3799 > 10.2.1.2.58842: Disconnect-ACK (41)
>> 15:11:57.446361 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
>> 15:11:59.243604 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
>> 15:11:59.283332 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
>> ~29s later Client clicks on the second occurrence of "Sign into Network"
>> 15:12:28.554417 10.2.1.2.41995 > 10.2.1.11.3799: Disconnect-Request (40)
>> 15:12:28.557488 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
>> 15:12:28.561537 10.2.1.11.3799 > 10.2.1.2.41995: Disconnect-ACK (41)
>> 15:12:28.580379 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
>> 15:12:30.262645 10.2.1.11.49281 > 10.2.1.2.1812: Access-Request (1)
>> 15:12:30.322381 10.2.1.2.1812 > 10.2.1.11.49281: Access-Accept (2)
>> 15:12:30.446732 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
>> 15:12:30.483542 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
>>
>> Again the pf radius logs also show no autz request after the initial
>> disconnect request.
>>
>> Mar  5 15:11:35 pf4 auth[1284]: [mac:a0:cc:2b:15:20:97] Accepted user:
>> and returned VLAN 81
>> Mar  5 15:11:35 pf4 auth[1284]: (93) Login OK: [a0cc2b152097] (from
>> client 10.2.1.11 port 0 cli a0:cc:2b:15:20:97)
>> Mar  5 15:12:30 pf4 auth[1284]: [mac:a0:cc:2b:15:20:97] Accepted user:
>> and returned VLAN 76
>> Mar  5 15:12:30 pf4 auth[1284]: (94) Login OK: [a0cc2b152097] (from
>> client 10.2.1.11 port 0 cli a0:cc:2b:15:20:97)
>>
>> Additionally the hostapd logs show that the registration vlan 81 seemed
>> to be automatically assigned a second time after the initial
>> Disconnect-Request:
>>
>> Mon Mar  5 15:11:35 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: VLAN ID 81
>> Mon Mar  5 15:11:35 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: starting accounting session 27395E82D8E75E24
>> Mon Mar  5 15:11:57 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: stopped accounting session 27395E82D8E75E24
>> Mon Mar  5 15:11:59 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: VLAN ID 81
>> Mon Mar  5 15:11:59 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: starting accounting session 27395E82D8E75E24
>> Mon Mar  5 15:12:28 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: stopped accounting session 27395E82D8E75E24
>> Mon Mar  5 15:12:30 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: VLAN ID 76
>> Mon Mar  5 15:12:30 2018 daemon.info hostapd: wlan0: STA
>> a0:cc:2b:15:20:97 RADIUS: starting accounting session 27395E82D8E75E24
>>
>> My gut thinks that
>> 1) maybe hostapd has a problem, so shortly after this email we are going
>> to go back and stage a legacy hostapd/15.05 AP to rule out recent hostapd
>> issues and figure out how to get more debug out of hostapd logging.
>>
>> 2)  Speedy Android 7 WiSPr may be a problem for hostapd.  Try some slower
>> clients with captive portal detection only and long redirect delay (like we
>> used to see)
>>
>> Posting here for some additional insights and see if anyone else has seen
>> similar problems or has some workaround ideas.
>>
>> cheers,
>> Ian
>>
>
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>
>
>
> _______________________________________________
> PacketFence-users mailing 
> listPacketFence-users@lists.sourceforge.nethttps://lists.sourceforge.net/lists/listinfo/packetfence-users
>
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> PacketFence-users mailing list
> PacketFence-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/packetfence-users
>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
PacketFence-users mailing list
PacketFence-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/packetfence-users

Reply via email to