Re: [PacketFence-users] Hostapd causing duplicate Sign-Into-Network prompts intermittently

2018-03-06 Thread Durand fabrice via PacketFence-users

Hello Ian,

there is a configuration parameter that can be changed to increase the
time between the deauth order and the CoA request.


It's Config{'fencing'}{'wait_for_redirect'}, so in pf.conf:

[fencing]
#
# fencing.wait_for_redirect
#
# How many seconds should the WebAPI sleep before actually triggering the
# VLAN change.
# This is meant to give the device enough time to fetch the redirection
# page before switching VLAN.
wait_for_redirect = 31

Regards

Fabrice



On 2018-03-06 at 13:54, Ian MacDonald via PacketFence-users wrote:

Well,

We easily narrowed this down to a timing issue related to the 
Access-Accept(2) messages and how long an authorization is considered 
valid by hostapd.


In short, we need to make sure that the CoA from portal activation is 
NOT sent to the AP within 30 seconds of the initial connection to the 
registration VLAN.


We don't think the redirect or network_detection delay is suitable for 
this as that happens after the CoA.


*How do we bake in a delay between Access-Accept and CoA messages from
the radius server for speedy WISPr clients?*

Is there a portal setting that can help us here?

Ideally, a "minimum_coa_delay" setting could allow packetfence to only 
issue a CoA after a minimum time period since the last Access-Request 
or Access-Accept was processed.


Additionally, we are going to investigate configuring hostapd to
change the 30s to something smaller (which would solve most of our use
cases) and/or to have the 30s timer reset on receipt of a CoA. Both
of these seem non-trivial compared with adding this intelligence to
packetfence, which may also solve similar race conditions across the
wide variety of radius clients supported.


Details follow,

cheers,
Ian


Hostapd considers the last authorization valid for 30 seconds, and if
a host disconnects and reconnects within that window, it does not
check back with the radius server for a new authorization. This
holds true even if a CoA is received from the radius server, which I
think is a behavior that hostapd could improve; after all, it is
called a "Change of Authorization" request.


This results in the double prompt for some users because, if a user
completes the portal activation within 30 seconds of the device
connecting to the registration VLAN, hostapd does not issue another
Access-Request but instead considers the current authorization to be
valid, and the device remains on the registration VLAN even after the
CoA that should switch it to the Normal VLAN is sent.


This means you can actually re-Sign-In multiple times within that 30
second window if you are quick (my email auto-populates on my phone,
making this easy). Each time, the portal generates a CoA but the
device returns to the registration VLAN. At 31 seconds after the
initial device connection to the registration VLAN, any subsequent CoA
results in the correct VLAN being assigned.


The 30s authorization sort of makes sense for dealing with a client
that is authorized and then disconnects and reconnects quickly a
bunch of times: you avoid some unnecessary radius requests back and
forth. The benefit is limited, though, since the Radius-Accounting
packets are still processed with each disconnect.


This 30-second window also impacts de-registration. We were able to
confirm that if we try to de-register a client from the PF GUI within
30 seconds of it being activated, it remains on the Normal VLAN, so
there is some impact here as well. We remember seeing this little bug
a long time ago, so we think it has always been there. When testing
the captive portals we often de-register devices quickly to "try them
again", and we used to have to hit "re-evaluate" after saving; it
turns out that is only needed if we do it in under 30 seconds.





On Mon, Mar 5, 2018 at 9:21 PM, Ian MacDonald wrote:


We have packetfence 7.4 instances out-of-band running on Debian 8,
and use the captive portal with hostapd for WiFi client access.

Our clients register using an email source for activation.

Our configuration uses all captive_portal defaults except for the
network detection IP, shown below.

network_detection_initial_delay = 5s
network_detection_retry_delay = 2s
network_redirect_delay = 20s
secure_redirect=enabled
detection_mecanism_bypass = disabled
wispr_redirection = enabled

At some point in the past I recall our mobile clients used to see
the scrollbar for many seconds (10+?), which I believe was
redirect_delay being interrupted by network_detection; this would
occur immediately after submitting an email for activation.

Now on PF 7.4.0, hostapd 17.01.4 and with Samsung S8 clients, the
delay between submitting the email and activation onto the
user/normal VLAN is instantaneous (about 2 seconds according to
our logs).   I suspect this speedy interaction has something to do
with 

Re: [PacketFence-users] Hostapd causing duplicate Sign-Into-Network prompts intermittently

2018-03-06 Thread Ian MacDonald via PacketFence-users
Well,

We easily narrowed this down to a timing issue related to the
Access-Accept(2) messages and how long an authorization is considered valid
by hostapd.

In short, we need to make sure that the CoA from portal activation is NOT
sent to the AP within 30 seconds of the initial connection to the
registration VLAN.

We don't think the redirect or network_detection delay is suitable for this
as that happens after the CoA.

*How do we bake in a delay between Access-Accept and CoA messages from the
radius server for speedy WISPr clients?*

Is there a portal setting that can help us here?

Ideally, a "minimum_coa_delay" setting could allow packetfence to only
issue a CoA after a minimum time period since the last Access-Request or
Access-Accept was processed.
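
To make the idea concrete, here is a minimal sketch (in Python, purely for illustration) of what such a setting might compute. Both "minimum_coa_delay" and the helper coa_wait_seconds are hypothetical names; neither is an existing PacketFence setting or function as far as we know.

```python
def coa_wait_seconds(last_accept_ts, now_ts, minimum_coa_delay=31.0):
    """Return how many seconds to hold a CoA so that it is sent no earlier
    than minimum_coa_delay seconds after the last Access-Accept.

    All names here are hypothetical; timestamps are plain seconds.
    """
    elapsed = now_ts - last_accept_ts
    # Never return a negative wait: if enough time has passed, send at once.
    return max(0.0, minimum_coa_delay - elapsed)
```

For a client that activates 2 seconds after connecting, coa_wait_seconds(0, 2) asks for a 29-second hold; for one that takes a minute, it returns 0 and the CoA would go out immediately.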

Additionally, we are going to investigate configuring hostapd to change
the 30s to something smaller (which would solve most of our use cases)
and/or to have the 30s timer reset on receipt of a CoA. Both of these seem
non-trivial compared with adding this intelligence to packetfence, which may
also solve similar race conditions across the wide variety of radius clients
supported.

Details follow,

cheers,
Ian


Hostapd considers the last authorization valid for 30 seconds, and if a
host disconnects and reconnects within that window, it does not check back
with the radius server for a new authorization. This holds true even if a
CoA is received from the radius server, which I think is a behavior that
hostapd could improve; after all, it is called a "Change of Authorization"
request.

This results in the double prompt for some users because, if a user
completes the portal activation within 30 seconds of the device connecting
to the registration VLAN, hostapd does not issue another Access-Request but
instead considers the current authorization to be valid, and the device
remains on the registration VLAN even after the CoA that should switch it
to the Normal VLAN is sent.

This means you can actually re-Sign-In multiple times within that 30 second
window if you are quick (my email auto-populates on my phone, making this
easy). Each time, the portal generates a CoA but the device returns to the
registration VLAN. At 31 seconds after the initial device connection to
the registration VLAN, any subsequent CoA results in the correct VLAN being
assigned.
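
The behaviour we observed can be summarised as a small model, sketched here in Python. It only restates our observations and is not taken from the hostapd source; the function name vlan_after_coa is invented for illustration, and how hostapd treats a CoA at exactly 30 seconds is an assumption, since we only tested "within 30 seconds" and "at 31 seconds".

```python
AUTH_VALID_SECONDS = 30  # window observed in our tests, not read from hostapd

def vlan_after_coa(connect_ts, coa_ts, window=AUTH_VALID_SECONDS):
    """Model which VLAN a device lands on after a CoA sent at coa_ts,
    given that it joined the registration VLAN at connect_ts (seconds)."""
    if coa_ts - connect_ts <= window:
        # Earlier authorization is still considered valid, so no new
        # Access-Request is issued and the device stays where it was.
        return "registration"
    # Window has expired: the reconnect triggers a fresh Access-Request.
    return "normal"
```

With this model, vlan_after_coa(0, 2) gives "registration" (the double prompt case) and vlan_after_coa(0, 31) gives "normal".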

The 30s authorization sort of makes sense for dealing with a client that is
authorized and then disconnects and reconnects quickly a bunch of times:
you avoid some unnecessary radius requests back and forth. The benefit is
limited, though, since the Radius-Accounting packets are still processed
with each disconnect.

This 30-second window also impacts de-registration. We were able to confirm
that if we try to de-register a client from the PF GUI within 30 seconds of
it being activated, it remains on the Normal VLAN, so there is some impact
here as well. We remember seeing this little bug a long time ago, so we
think it has always been there. When testing the captive portals we often
de-register devices quickly to "try them again", and we used to have to hit
"re-evaluate" after saving; it turns out that is only needed if we do it in
under 30 seconds.




On Mon, Mar 5, 2018 at 9:21 PM, Ian MacDonald wrote:

> We have packetfence 7.4 instances out-of-band running on Debian 8, and use
> the captive portal with hostapd for WiFi client access.
>
> Our clients register using an email source for activation.
>
> Our configuration uses all captive_portal defaults except for the network
> detection IP, shown below.
>
> network_detection_initial_delay = 5s
> network_detection_retry_delay = 2s
> network_redirect_delay = 20s
> secure_redirect=enabled
> detection_mecanism_bypass = disabled
> wispr_redirection = enabled
>
> At some point in the past I recall our mobile clients used to see the
> scrollbar for many seconds (10+?), which I believe was redirect_delay being
> interrupted by network_detection; this would occur immediately after
> submitting an email for activation.
>
> Now on PF 7.4.0, hostapd 17.01.4 and with Samsung S8 clients, the delay
> between submitting the email and activation onto the user/normal VLAN is
> instantaneous (about 2 seconds according to our logs).   I suspect this
> speedy interaction has something to do with WISPr and CoA working together
> quickly, whereas previously we were maybe only doing captive portal.
>
> The problem is that intermittently, the Android clients, after activation,
> are returned to the Registration VLAN instead of the Normal/User VLAN,
> which causes those clients to pop up the "Sign Into Network" WISPr prompt a
> second time.  Clicking on it again loads a quick PF Network Error page
> which disappears as quickly as it appears, and a second later the client is
> dumped onto the Normal/User VLAN as it should have been originally.
>
> This is confusing for end users; most make it through, but not everyone
> is happy.
>
> Below is the