Re: [PacketFence-users] Interruption in upgrade from 11.0 to 12.1 Results in radius-acct service binding issue

2023-01-28 Thread Ian MacDonald via PacketFence-users
To close this thread off,

The package issues are being handled in
https://github.com/inverse-inc/packetfence/issues/7473

On the 1813 bindings: it looks like I had simply not noticed that there
are now two accounting options, pfacct and radius-acct. Both are present,
but radius-acct is only used when needed for heavy lifting in special use
cases with EAP or non-standard RADIUS commands; otherwise it has been
replaced by the newer, lightweight pfacct, which suits my use case. I must
have decided to turn radius-acct on in my test environment, thinking I was
using it (at some point in the past, before 11, I was). In any case, it
doesn't start up a listener right away other than on localhost. Just user
error.


On Sun, Jan 22, 2023 at 4:41 PM Ian MacDonald  wrote:

> In an upgrade of a PF environment from 11.0 to 12.1, following the upgrade
> instructions, we stumbled on a bug due to some package dependencies.
>
> These redundant packages are being tracked in a bug here
> https://github.com/inverse-inc/packetfence/issues/7246 however they cause
> an upgrade from 11.0 to 12.1 to fail.
>
> From an up-to-date 11.0 instance running on Debian 11.6, we ran the upgrade
> script, following prompts etc.
>
> # /usr/local/pf/addons/full-upgrade/run-upgrade.sh
>
> The process ended due to the package conflict, but we were able to resolve
> the conflicts, getting apt sorted out, and then resume the upgrade script,
> which completed the DB upgrades and appeared to exit cleanly.
>
> # dpkg --force-depends -r packetfence-captive-portal-javascript
> # apt --fix-broken install
> # /usr/local/pf/addons/full-upgrade/run-upgrade.sh
>
> However it did not leave us with a working configuration.   The
> radius-acct binding is not working properly; we are not really sure what
> else might be a problem, so likely we have to restage to 11.0 and try again
> once we figure out an upgrade strategy.
>
> Previously we had the same issue in another environment going from 11.0
> ->11.1 ->11.2 ->12.1 where the upgrade process was interrupted by a problem
> with our firewall causing timeouts during the docker pull.   In that case,
> there was no package issue, but we were restarting the upgrade script.  The
> result was the same binding issue, and in that case we just decided to
> restage to 12.1 rather than figure it out, as our goal was to complete
> testing of the captive portal on 12.1.
>
> Our hypothesis is that interruption of the upgrade wrapper script, in one
> case due to a timeout on docker image pulls, and in this latest case from
> package dependencies, led to a 12.1 environment with this service binding
> issue.
>
> Our next step will be to restage 11.0 -> 12.1 without interruption by
> starting with
>
> # dpkg --force-depends -r packetfence-captive-portal-javascript
>
> to see if we can get to 12.1 with radius-acct/pfacct bindings in proper
> working order.
>
> Below are some details, maybe if there is an obvious quick fix that might
> be useful for anyone going through the upgrade process with similar
> results, or save us the time to restage later this month.
>
> cheers,
> Ian
>
> We did some quick comparisons to try and find the difference in systemctl
> packetfence units and configuration between the working and non-working 12.1
> systems.  We grepped config and systemctl units for "1813" and did not find
> any differences, but we know it's there somewhere.
>
> On the working PF 12.1 instance, staged directly to 12.1, the binding
> looks like the following, where 10.2.1.2 is the management network where
> switches connect to radius-acct.
>
> pf4:/usr/local/pf/conf# netstat -tunap | grep ":1813"
> udp        0      0 10.2.1.2:1813           0.0.0.0:*                           42375/pfacct
>
> On the non-working instance the binding seems to be to the localhost;
>
> pf3:/usr/local/pf/conf# netstat -tunap | grep ":1813"
> udp        0      0 127.0.0.1:1813          0.0.0.0:*                           1942/pfacct
>
>
> We can see radius-acct complaining in the logs as well and working hard to
> restart what we believe is a docker container with radius running inside
> that the actual binding should be pointed at.
>
> radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]:
> Sun Jan 22 15:48:16 2023 : Error: Failed binding to acct address * port
> 1813 bound to server packetfence: Address already in use
> radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]:
> Sun Jan 22 15:48:16 2023 : Error: /usr/local/pf/raddb/acct.conf[8]: Error
> binding to port for 0.0.0.0 port 1813
> radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]:
> Error: No such container: radiusd-acct
> radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]:
> Error: No such container: 1
>
> If we stop the pfacct service only
>
> systemctl stop packetfence-pfacct.service
>
> this allows the docker binding to work, but now the pfacct process cannot
> be restarted, and the binding appears to then be directly with freeradius,
> 

Re: [PacketFence-users] iPhones trapped on reload of Captive Portal Log In Page after switch to Default network

2023-01-27 Thread Ian MacDonald via PacketFence-users
roblem.
>>>>
>>>> All our other clients that we have tested seem to be okay with the
>>>> current javascript implementation.  We just need to get the iPhones fixed
>>>> until RFC8908 is supported. I can see it has been discussed
>>>> <https://github.com/inverse-inc/packetfence/issues/7040> but it seems
>>>> what used to work in IOS 13/14 using the RFC7710bis
>>>> <https://github.com/inverse-inc/packetfence/issues/5638> introduced in
>>>> PF10 isn't working any longer.   The clients do not use the network
>>>> detection IP or image embedded in the captive portal javascript as seen in
>>>> the captures we posted.   They simply 'test' if they are still trapped
>>>> based on reachability to the captive portal URL.
>>>>
>>>> I believe if we can somehow separate the ConfNet.PortalFQDN  used by
>>>> the captive portal redirect from the one used in email activation, we can
>>>> use our Default network local DNS to make the current RFC7710bis
>>>> implementation work with iPhones.  If you know how this could be done, it
>>>> would be a second to workaround our iPhone network detection problem.
>>>>
>>>>
>>>>> El mar, 24 ene 2023 a las 19:59, Ian MacDonald ()
>>>>> escribió:
>>>>> >
>>>>> > Quick inline response to your questions;  Thank you for having a
>>>>> peek.
>>>>> >
>>>>> > On Tue, Jan 24, 2023 at 5:45 PM Enrique Gross via PacketFence-users <
>>>>> packetfence-users@lists.sourceforge.net> wrote:
>>>>> >>
>>>>> >> Regarding DNS, domain resolves to your public address? is that
>>>>> >> correct? And that is the same domain as captive portal?
>>>>> >
>>>>> > Yes, as seen from the Default network, and the Internet, the
>>>>> packetfence server hostname resolves using public DNS, and lands on the
>>>>> portal listener attached to the management network interface on the 
>>>>> server.
>>>>> >
>>>>> >>
>>>>> >> On your topology, port 80/443 redirected to “PF redirection URL”?
>>>>> >
>>>>> >
>>>>> > Yes, in hindsight, that detail should have been removed, as it is
>>>>> confusing in that it is unrelated to the issue, and that redirection rule
>>>>> is not active in this test environment.
>>>>> >
>>>>> > In production, there was a redirection URL in the Captive Portal
>>>>> configuration that goes to a web site;  Provided by the ISP that is
>>>>> providing the free Internet.
>>>>> > A similar redirection, if you happen to point your web browser at
>>>>> the Default network gateway, also goes to that same location.  The 
>>>>> thinking
>>>>> here was if you are surfing in the Public space, and get curious and do a
>>>>> myipaddress.com look up and then go to that IP in your browser, you
>>>>> hit the same landing page as the captive portal redirection.
>>>>> >
>>>>> > It is not active in this environment, so I should have purged it
>>>>> from the topology snapshot
>>>>> >
>>>>> >>
>>>>> >> Enrique
>>>>> >>
>>>>> >> El mar, 24 ene 2023 a las 8:19, James Andrewartha via
>>>>> >> PacketFence-users ()
>>>>> >> escribió:
>>>>> >> >
>>>>> >> > Hi Ian,
>>>>> >> >
>>>>> >> > So looking through the registration PCAP, one thing I notice is
>>>>> that there's three requests for
>>>>> http://captive.apple.com/hotspot-detect.html before it tries to
>>>>> follow the redirect that page returns. Also your DNS is returning the same
>>>>> IP (66.70.255.147) for captive.apple.com as for pf4.dotto-one.com.
>>>>> Are you doing DNS enforcement on the portal? Then on the default network,
>>>>> you return 104.244.196.73 for pf4.dotto-one.com. I don't think that's
>>>>> wrong per se but just wanted to be clear.
>>>>> >> >
>>>>> >> > I see some accesses to https://pf4.dotto-one.com/rfc7710 after
>>>>> it joins the default network, can you see what content is returned? Since
>>>>> it tri

Re: [PacketFence-users] iPhones trapped on reload of Captive Portal Log In Page after switch to Default network

2023-01-26 Thread Ian MacDonald via PacketFence-users
 know how this could be done, it would be a
>> second to workaround our iPhone network detection problem.
>>
>>
>>> El mar, 24 ene 2023 a las 19:59, Ian MacDonald ()
>>> escribió:
>>> >
>>> > Quick inline response to your questions;  Thank you for having a peek.
>>> >
>>> > On Tue, Jan 24, 2023 at 5:45 PM Enrique Gross via PacketFence-users <
>>> packetfence-users@lists.sourceforge.net> wrote:
>>> >>
>>> >> Regarding DNS, domain resolves to your public address? is that
>>> >> correct? And that is the same domain as captive portal?
>>> >
>>> > Yes, as seen from the Default network, and the Internet, the
>>> packetfence server hostname resolves using public DNS, and lands on the
>>> portal listener attached to the management network interface on the server.
>>> >
>>> >>
>>> >> On your topology, port 80/443 redirected to “PF redirection URL”?
>>> >
>>> >
>>> > Yes, in hindsight, that detail should have been removed, as it is
>>> confusing in that it is unrelated to the issue, and that redirection rule
>>> is not active in this test environment.
>>> >
>>> > In production, there was a redirection URL in the Captive Portal
>>> configuration that goes to a web site;  Provided by the ISP that is
>>> providing the free Internet.
>>> > A similar redirection, if you happen to point your web browser at the
>>> Default network gateway, also goes to that same location.  The thinking
>>> here was if you are surfing in the Public space, and get curious and do a
>>> myipaddress.com look up and then go to that IP in your browser, you hit
>>> the same landing page as the captive portal redirection.
>>> >
>>> > It is not active in this environment, so I should have purged it from
>>> the topology snapshot
>>> >
>>> >>
>>> >> Enrique
>>> >>
>>> >> El mar, 24 ene 2023 a las 8:19, James Andrewartha via
>>> >> PacketFence-users ()
>>> >> escribió:
>>> >> >
>>> >> > Hi Ian,
>>> >> >
>>> >> > So looking through the registration PCAP, one thing I notice is
>>> that there's three requests for
>>> http://captive.apple.com/hotspot-detect.html before it tries to follow
>>> the redirect that page returns. Also your DNS is returning the same IP
>>> (66.70.255.147) for captive.apple.com as for pf4.dotto-one.com. Are you
>>> doing DNS enforcement on the portal? Then on the default network, you
>>> return 104.244.196.73 for pf4.dotto-one.com. I don't think that's wrong
>>> per se but just wanted to be clear.
>>> >> >
>>> >> > I see some accesses to https://pf4.dotto-one.com/rfc7710 after it
>>> joins the default network, can you see what content is returned? Since it
>>> tries that first before going to the captive portal URL on the default
>>> network. Short of that, could you remove option 114 and 160 from both
>>> registration and default network DHCP scopes? My feeling is that it's
>>> holding onto the URL from the registration network and re-using that on the
>>> default network instead of looking at the cappport:unrestricted value
>>> returned on the default network.
>>> >> >
>>> >> > So was the iPhone not re-DHCPing problem solved by very short lease
>>> times on the registration network?
>>> >> >
>>> >> > Thanks,
>>> >> >
>>> >> > --
>>> >> > James Andrewartha
>>> >> > Network & Projects Engineer
>>> >> > Christ Church Grammar School
>>> >> > Claremont, Western Australia
>>> >> > Ph. (08) 9442 1757
>>> >> > Mob. 0424 160 877
>>> >> >
>>> >> > On 24/1/23 06:53, Ian MacDonald via PacketFence-users wrote:
>>> >> >
>>> >> > Okay,
>>> >> >
>>> >> > We have, again, scoped down our issue further to iPhones not
>>> properly detecting they are no longer behind the Packetfence Captive
>>> Portal.  I am going to frame it up once again to see if anyone has any new
>>> insights.
>>> >> >
>>> >> > Problem:  The iPhone is holding on to the captive portal page it
>>> learns on the Registration network, and whe

Re: [PacketFence-users] iPhones trapped on reload of Captive Portal Log In Page after switch to Default network

2023-01-25 Thread Ian MacDonald via PacketFence-users
tal?
> >
> > Yes, as seen from the Default network, and the Internet, the packetfence
> server hostname resolves using public DNS, and lands on the portal listener
> attached to the management network interface on the server.
> >
> >>
> >> On your topology, port 80/443 redirected to “PF redirection URL”?
> >
> >
> > Yes, in hindsight, that detail should have been removed, as it is
> confusing in that it is unrelated to the issue, and that redirection rule
> is not active in this test environment.
> >
> > In production, there was a redirection URL in the Captive Portal
> configuration that goes to a web site;  Provided by the ISP that is
> providing the free Internet.
> > A similar redirection, if you happen to point your web browser at the
> Default network gateway, also goes to that same location.  The thinking
> here was if you are surfing in the Public space, and get curious and do a
> myipaddress.com look up and then go to that IP in your browser, you hit
> the same landing page as the captive portal redirection.
> >
> > It is not active in this environment, so I should have purged it from
> the topology snapshot
> >
> >>
> >> Enrique
> >>
> >> El mar, 24 ene 2023 a las 8:19, James Andrewartha via
> >> PacketFence-users ()
> >> escribió:
> >> >
> >> > Hi Ian,
> >> >
> >> > So looking through the registration PCAP, one thing I notice is that
> there's three requests for http://captive.apple.com/hotspot-detect.html
> before it tries to follow the redirect that page returns. Also your DNS is
> returning the same IP (66.70.255.147) for captive.apple.com as for
> pf4.dotto-one.com. Are you doing DNS enforcement on the portal? Then on
> the default network, you return 104.244.196.73 for pf4.dotto-one.com. I
> don't think that's wrong per se but just wanted to be clear.
> >> >
> >> > I see some accesses to https://pf4.dotto-one.com/rfc7710 after it
> joins the default network, can you see what content is returned? Since it
> tries that first before going to the captive portal URL on the default
> network. Short of that, could you remove option 114 and 160 from both
> registration and default network DHCP scopes? My feeling is that it's
> holding onto the URL from the registration network and re-using that on the
> default network instead of looking at the cappport:unrestricted value
> returned on the default network.
> >> >
> >> > So was the iPhone not re-DHCPing problem solved by very short lease
> times on the registration network?
> >> >
> >> > Thanks,
> >> >
> >> > --
> >> > James Andrewartha
> >> > Network & Projects Engineer
> >> > Christ Church Grammar School
> >> > Claremont, Western Australia
> >> > Ph. (08) 9442 1757
> >> > Mob. 0424 160 877
> >> >
> >> > On 24/1/23 06:53, Ian MacDonald via PacketFence-users wrote:
> >> >
> >> > Okay,
> >> >
> >> > We have, again, scoped down our issue further to iPhones not properly
> detecting they are no longer behind the Packetfence Captive Portal.  I am
> going to frame it up once again to see if anyone has any new insights.
> >> >
> >> > Problem:  The iPhone is holding on to the captive portal page it
> learns on the Registration network, and when it gets to the Default
> network, it fails at detecting it is on the Internet, and it returns to the
> Captive Portal page and traps the user there in the iphone's Log In
> interface.
> >> >
> >> > If we block the iPhone from the Packetfence portal listener, after it
> is on the Default network, it works and believes it is no longer Captive.
> Unfortunately this also blocks registration activation links sent via
> Email, so it doesn't quite qualify as a workaround unless we can separate
> the hostname used for Email Activation from the hostname used for the
> Captive Portal and block the latter with DNS overrides on our Default
> network.
> >> >
> >> > It seems like the correct configuration would have Packetfence
> instruct the iPhones to not use the Captive Portal URL reachability as
> network detection, and possibly we have no control over this OR possibly it
> can be done somehow through the Captive Portal API TBD.
> >> >
> >> > Help on how to do either of these things in Packetfence config
> appreciated.
> >> >
> >> > Here is our lab v12.1 setup.
> >> >
> >> >
> >> > As we moved through our testing we have made a

Re: [PacketFence-users] iPhones trapped on reload of Captive Portal Log In Page after switch to Default network

2023-01-24 Thread Ian MacDonald via PacketFence-users
Quick inline response to your questions;  Thank you for having a peek.

On Tue, Jan 24, 2023 at 5:45 PM Enrique Gross via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> Regarding DNS, domain resolves to your public address? is that
> correct? And that is the same domain as captive portal?
>
Yes, as seen from the Default network, and the Internet, the packetfence
server hostname resolves using public DNS, and lands on the portal listener
attached to the management network interface on the server.


> On your topology, port 80/443 redirected to “PF redirection URL”?
>

Yes, in hindsight, that detail should have been removed, as it is confusing
in that it is unrelated to the issue, and that redirection rule is not
active in this test environment.

In production, there was a redirection URL in the Captive Portal
configuration that goes to a web site;  Provided by the ISP that is
providing the free Internet.
A similar redirection, if you happen to point your web browser at the
Default network gateway, also goes to that same location.  The thinking
here was if you are surfing in the Public space, and get curious and do a
myipaddress.com look up and then go to that IP in your browser, you hit the
same landing page as the captive portal redirection.

It is not active in this environment, so I should have purged it from the
topology snapshot


> Enrique
>
> El mar, 24 ene 2023 a las 8:19, James Andrewartha via
> PacketFence-users ()
> escribió:
> >
> > Hi Ian,
> >
> > So looking through the registration PCAP, one thing I notice is that
> there's three requests for http://captive.apple.com/hotspot-detect.html
> before it tries to follow the redirect that page returns. Also your DNS is
> returning the same IP (66.70.255.147) for captive.apple.com as for
> pf4.dotto-one.com. Are you doing DNS enforcement on the portal? Then on
> the default network, you return 104.244.196.73 for pf4.dotto-one.com. I
> don't think that's wrong per se but just wanted to be clear.
> >
> > I see some accesses to https://pf4.dotto-one.com/rfc7710 after it joins
> the default network, can you see what content is returned? Since it tries
> that first before going to the captive portal URL on the default network.
> Short of that, could you remove option 114 and 160 from both registration
> and default network DHCP scopes? My feeling is that it's holding onto the
> URL from the registration network and re-using that on the default network
> instead of looking at the cappport:unrestricted value returned on the
> default network.
> >
> > So was the iPhone not re-DHCPing problem solved by very short lease
> times on the registration network?
> >
> > Thanks,
> >
> > --
> > James Andrewartha
> > Network & Projects Engineer
> > Christ Church Grammar School
> > Claremont, Western Australia
> > Ph. (08) 9442 1757
> > Mob. 0424 160 877
> >
> > On 24/1/23 06:53, Ian MacDonald via PacketFence-users wrote:
> >
> > Okay,
> >
> > We have, again, scoped down our issue further to iPhones not properly
> detecting they are no longer behind the Packetfence Captive Portal.  I am
> going to frame it up once again to see if anyone has any new insights.
> >
> > Problem:  The iPhone is holding on to the captive portal page it learns
> on the Registration network, and when it gets to the Default network, it
> fails at detecting it is on the Internet, and it returns to the Captive
> Portal page and traps the user there in the iphone's Log In interface.
> >
> > If we block the iPhone from the Packetfence portal listener, after it is
> on the Default network, it works and believes it is no longer Captive.
> Unfortunately this also blocks registration activation links sent via
> Email, so it doesn't quite qualify as a workaround unless we can separate
> the hostname used for Email Activation from the hostname used for the
> Captive Portal and block the latter with DNS overrides on our Default
> network.
> >
> > It seems like the correct configuration would have Packetfence instruct
> the iPhones to not use the Captive Portal URL reachability as network
> detection, and possibly we have no control over this OR possibly it can be
> done somehow through the Captive Portal API TBD.
> >
> > Help on how to do either of these things in Packetfence config
> appreciated.
> >
> > Here is our lab v12.1 setup.
> >
> >
> > As we moved through our testing we have made a few changes, none of
> which seem to impact the expected outcome.  We have enabled proxy
> interception, changed our network detection to a local IP,  modified the
> detection delay (30s) so that it starts after the fencing del

[PacketFence-users] Interruption in upgrade from 11.0 to 12.1 Results in radius-acct service binding issue

2023-01-22 Thread Ian MacDonald via PacketFence-users
In an upgrade of a PF environment from 11.0 to 12.1, following the upgrade
instructions, we stumbled on a bug due to some package dependencies.

These redundant packages are being tracked in a bug here
https://github.com/inverse-inc/packetfence/issues/7246 however they cause
an upgrade from 11.0 to 12.1 to fail.

From an up-to-date 11.0 instance running on Debian 11.6, we ran the upgrade
script, following prompts etc.

# /usr/local/pf/addons/full-upgrade/run-upgrade.sh

The process ended due to the package conflict, but we were able to resolve
the conflicts, getting apt sorted out, and then resume the upgrade script,
which completed the DB upgrades and appeared to exit cleanly.

# dpkg --force-depends -r packetfence-captive-portal-javascript
# apt --fix-broken install
# /usr/local/pf/addons/full-upgrade/run-upgrade.sh

However it did not leave us with a working configuration.   The radius-acct
binding is not working properly; we are not really sure what else might be
a problem, so likely we have to restage to 11.0 and try again once we
figure out an upgrade strategy.

Previously we had the same issue in another environment going from 11.0
->11.1 ->11.2 ->12.1 where the upgrade process was interrupted by a problem
with our firewall causing timeouts during the docker pull.   In that case,
there was no package issue, but we were restarting the upgrade script.  The
result was the same binding issue, and in that case we just decided to
restage to 12.1 rather than figure it out, as our goal was to complete
testing of the captive portal on 12.1.

Our hypothesis is that interruption of the upgrade wrapper script, in one
case due to a timeout on docker image pulls, and in this latest case from
package dependencies, led to a 12.1 environment with this service binding
issue.

Our next step will be to restage 11.0 -> 12.1 without interruption by
starting with

# dpkg --force-depends -r packetfence-captive-portal-javascript

to see if we can get to 12.1 with radius-acct/pfacct bindings in proper
working order.

Below are some details, maybe if there is an obvious quick fix that might
be useful for anyone going through the upgrade process with similar
results, or save us the time to restage later this month.

cheers,
Ian

We did some quick comparisons to try and find the difference in systemctl
packetfence units and configuration between the working and non-working 12.1
systems.  We grepped config and systemctl units for "1813" and did not find
any differences, but we know it's there somewhere.

On the working PF 12.1 instance, staged directly to 12.1, the binding looks
like the following, where 10.2.1.2 is the management network where switches
connect to radius-acct.

pf4:/usr/local/pf/conf# netstat -tunap | grep ":1813"
udp        0      0 10.2.1.2:1813           0.0.0.0:*                           42375/pfacct

On the non-working instance the binding seems to be to the localhost;

pf3:/usr/local/pf/conf# netstat -tunap | grep ":1813"
udp        0      0 127.0.0.1:1813          0.0.0.0:*                           1942/pfacct


We can see radius-acct complaining in the logs as well and working hard to
restart what we believe is a docker container with radius running inside
that the actual binding should be pointed at.

radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]: Sun
Jan 22 15:48:16 2023 : Error: Failed binding to acct address * port 1813
bound to server packetfence: Address already in use
radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]: Sun
Jan 22 15:48:16 2023 : Error: /usr/local/pf/raddb/acct.conf[8]: Error
binding to port for 0.0.0.0 port 1813
radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]:
Error: No such container: radiusd-acct
radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]:
Error: No such container: 1

If we stop the pfacct service only

systemctl stop packetfence-pfacct.service

this allows the docker binding to work, but now the pfacct process cannot
be restarted, and the binding then appears to be held directly by freeradius
rather than pfacct, and on all interfaces rather than just the registration
VLAN.

pf3:/usr/local/pf/logs# netstat -tunap | grep ":1813"
udp        0      0 0.0.0.0:1813            0.0.0.0:*                           93957/freeradius
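
For anyone comparing bindings across several hosts, the three cases above (management address, localhost, all interfaces) can be told apart mechanically. What follows is only a minimal sketch, assuming the usual `netstat -tunap` column order with the local address in the fourth column; the helper name `classify_bind` is made up for illustration and is not part of PacketFence.

```python
import ipaddress


def classify_bind(netstat_line: str) -> str:
    """Describe where one `netstat -tunap` line is listening.

    Assumes whitespace-separated columns with the local address in the
    fourth field and PID/program name in the last field.
    """
    fields = netstat_line.split()
    local = fields[3]
    # Split on the last colon so the port is separated from the address.
    host, _, port = local.rpartition(":")
    addr = ipaddress.ip_address(host)
    if addr.is_unspecified:
        scope = "all interfaces"
    elif addr.is_loopback:
        scope = "loopback only"
    else:
        scope = "single interface"
    # Last field looks like "42375/pfacct"; keep only the program name.
    program = fields[-1].split("/", 1)[-1]
    return f"{program} on port {port}: {scope} ({host})"


# The three bindings reported in this message.
samples = [
    "udp 0 0 10.2.1.2:1813 0.0.0.0:* 42375/pfacct",
    "udp 0 0 127.0.0.1:1813 0.0.0.0:* 1942/pfacct",
    "udp 0 0 0.0.0.0:1813 0.0.0.0:* 93957/freeradius",
]

for line in samples:
    print(classify_bind(line))
```

Feeding it the output of `netstat -tunap | grep ":1813"` from each host gives a one-line summary per listener, which makes the working and non-working instances easy to diff.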

Without more knowledge of the difference between the PF Go (pfacct) and
FreeRADIUS processes and how the new docker container bindings work, it
looks like restaging and retrying the upgrade is probably the next step for us.
___
PacketFence-users mailing list
PacketFence-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/packetfence-users


Re: [PacketFence-users] Default Network Detection failure after RADIUS Disconnect from Captive Portal

2023-01-20 Thread Ian MacDonald via PacketFence-users
>
>
> Do you happen to have the
> https://www.packetfence.org/doc/PacketFence_Installation_Guide.html#_dhcp_remote_sensor
>  installed
> on your DHCP server?
> On a side note, what is the DHCP max lease time in your registration
> network ?
>
We do not use this sensor. It looks like it just puts the DHCP packets
into the PF server accounting; I don't think it changes the nature of the
DHCP requests, which we can see.

We did shorten the registration VLAN DHCP default and maximum lease times
from 30s to 15s, but it did not appear to have any impact on how the iPhone
handles things.

[10.2.2.0]
netmask=255.255.255.0
type=vlan-registration
split_network=disabled
fake_mac_enabled=disabled
coa=disabled
netflow_accounting_enabled=disabled
dns=10.2.2.2
named=enabled
gateway=10.2.2.2
dhcp_end=10.2.2.246
nat_enabled=disabled
dhcp_default_lease_time=15
dhcpd=enabled
dhcp_max_lease_time=15
nat_dns=disabled
domain-name=vlan-registration.test.com
dhcp_start=10.2.2.10
pool_backend=memory
id=10.2.2.0


>
> Thanks,
>
> *Ludovic Zammit*
> *Product Support Engineer Principal Lead*
> *Cell:* +1.613.670.8432
> Akamai Technologies - Inverse
> 145 Broadway
> Cambridge, MA 02142
> Connect with Us: <https://community.akamai.com> <http://blogs.akamai.com>
> <https://twitter.com/akamai> <http://www.facebook.com/AkamaiTechnologies>
> <http://www.linkedin.com/company/akamai-technologies>
> <http://www.youtube.com/user/akamaitechnologies?feature=results_main>
>
> On Jan 19, 2023, at 12:37 AM, Ian MacDonald via PacketFence-users <
> packetfence-users@lists.sourceforge.net> wrote:
>
> Well,
>
> I guess we are stuck without more insight on how to get the iPhone to
> update its IP after the WiFi disconnect/reconnect to the Default VLAN.
>
> The only scenario I can think of, is that the iPhone believes it is on the
> same network, and behaves like it is moving between different  ESSIDs on a
> mesh network which would not initiate a DHCP request; which isn't the case
> here.
>
> I wonder if we plugged the iPhone into a USB dongle and tried the same
> thing on a wired connection, if it would behave the same way. I'm not going
> to test this as it doesn't solve anything.
>
> Ideas
>
> - shorten registration DHCP to inside the captive portal delay seems like
> a good option to try
> - I am wondering if there is any way I can get our DHCP server to respond
> to the request we saw from the iPhone on the old registration VLAN .. that
> may have actually been due to the lease expiring; at 30s as you suggested
> it may be configured. Hopefully expiring at 10s before the disconnect
> creates a DISCOVER rather than request for refresh on old registration VLAN
> subnet to a DHCP server that is unreachable
> - I am wondering if there is any signal PF can send after registration via
> captive portal URL to help - like tell it to behave like an old device
> - I am thinking maybe including a forced redirect URL in the connection
> profile might help the trigger - I don't quite understand how it is passed,
> but must be from the portal server URL to exist after the VLAN switch
> - double check there are no authoritative settings for DHCP on my Default
> network that would allow it to signal to the iPhone when it sees any DHCP
> traffic, like the frame you saw requesting a URL.
> - is there a wifi feature we can turn off at a lower layer that would
> ensure all wifi disconnects are treated like an interface up/down - like
> disabling roaming features on the AP
>
> Maybe the PF team are aware of this scenario; it would be nice to know we
> are near the end of the road.
>
> Any help appreciated
>
> Cheers,
> Ian
>
>
> On Wed., Jan. 18, 2023, 10:57 p.m. Diego Garcia del Rio, <
> garc...@gmail.com> wrote:
>
>> Hi Ian
>>
>> thanks for the extremely thorough troubleshooting.
>>
>> It's very weird that the client is not requesting DHCP after
>> re-connecting. On the AP side, it seems like it's disconnecting the client,
>> so I'm surprised it's not doing a new DHCP request.
>>
>> An option you might have is to change the DHCP lease time of the
>> registration server. By default, it's set up for 30 seconds, but it _might_
>> be editable. I would try setting it to maybe 10 seconds?
>>
>> the client is requesting the dhcp option 114 (you can see the mention to
>> "URL" in the dhcp capture below)
>>Parameter-Request Option 55, length 9:
>>  Subnet-Mask, Classless-Static-Route, Default-Gateway,
>> Domain-Name-Server
>>  Domain-Name, Option 108, *URL*, Option 119
>>  Option 252
>>
>>
>> And your reply is correct... (with the urn saying no por

Re: [PacketFence-users] Newer Model iPhones and Android Devices showing MAC:0 in Captive Portal

2023-01-17 Thread Ian MacDonald via PacketFence-users
 pf4 httpd.aaa-docker-wrapper[2464]: httpd.aaa(8) INFO:
[mac:f2:ef:bb:22:8c:62] PID: "endu...@gmail.com", Status: reg Returned
VLAN: (undefined), Role: guest (pf::role::fetchRoleForNode)
Jan 17 15:37:45 pf4 httpd.aaa-docker-wrapper[2464]: httpd.aaa(8) INFO:
[mac:f2:ef:bb:22:8c:62] (10.2.1.11) Added VLAN 83 to the returned RADIUS
Access-Accept (pf::Switch::returnRadiusAccessAccept)
Jan 17 15:37:45 pf4 httpd.aaa-docker-wrapper[2464]: httpd.aaa(8) INFO:
[mac:f2:ef:bb:22:8c:62] Updating locationlog from accounting request
(pf::api::handle_accounting_metadata)
Jan 17 15:37:45 pf4 httpd.aaa-docker-wrapper[2464]: httpd.aaa(8) WARN:
[mac:f2:ef:bb:22:8c:62] Cannot find any combination ID in any schemas
(fingerbank::Source::LocalDB::_getCombinationID)
Jan 17 15:37:45 pf4 httpd.aaa-docker-wrapper[2464]: httpd.aaa(8) INFO:
[mac:f2:ef:bb:22:8c:62] Upstream is configured and unable to fullfil an
exact match locally. Will ignore result from local database
(fingerbank::Source::LocalDB::match)
Jan 17 15:37:45 pf4 httpd.aaa-docker-wrapper[2464]: httpd.aaa(8) INFO:
[mac:f2:ef:bb:22:8c:62] Successfully interrogate upstream Fingerbank
project for matching. Got device : 264
(fingerbank::Source::Collector::match)
Jan 17 15:37:45 pf4 httpd.aaa-docker-wrapper[2464]: httpd.aaa(8) WARN:
[mac:f2:ef:bb:22:8c:62] Unable to pull accounting history for device
f2:ef:bb:22:8c:62. The history set doesn't exist yet.
(pf::accounting_events_history::latest_mac_history)
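
The httpd.aaa lines above share one shape: a syslog prefix, a level, a `[mac:...]` tag, the message, and the Perl function in parentheses. As a minimal sketch, assuming each record has first been unwrapped onto a single line, the following Python pulls out the level, the MAC and any VLAN returned in the Access-Accept. The regular expressions and the function name `parse_aaa_line` are illustrative and are not part of PacketFence.

```python
import re

# Hypothetical pattern for one unwrapped httpd.aaa log line as quoted above.
LINE_RE = re.compile(
    r"httpd\.aaa\(\d+\) (?P<level>[A-Z]+): "
    r"\[mac:(?P<mac>[0-9a-f:]{17})\] (?P<msg>.*)"
)
VLAN_RE = re.compile(r"Added VLAN (\d+) to the returned RADIUS Access-Accept")


def parse_aaa_line(line):
    """Return (level, mac, vlan_or_None) for one log line, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    vlan = VLAN_RE.search(m.group("msg"))
    return m.group("level"), m.group("mac"), int(vlan.group(1)) if vlan else None


# One record from the log above, joined back onto a single line.
sample = (
    "Jan 17 15:37:45 pf4 httpd.aaa-docker-wrapper[2464]: httpd.aaa(8) INFO: "
    "[mac:f2:ef:bb:22:8c:62] (10.2.1.11) Added VLAN 83 to the returned RADIUS "
    "Access-Accept (pf::Switch::returnRadiusAccessAccept)"
)
print(parse_aaa_line(sample))
```

Filtering on the VLAN field makes it easier to confirm that the registered device was actually handed VLAN 83 at the time of the RADIUS exchange.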



On Wed, Jan 11, 2023 at 3:09 PM Diego Garcia del Rio 
wrote:

> I'm guessing it might be related to the rfc7710bis / RFC 8910 portal support.
>
> This means that, via DHCP, the client is provided with a URL they can use
> to check the status of the device in the portal (whether they are still
> jailed or not).
>
> Normally this information is served on the same interface as the portal, if
> I'm not mistaken. You might want to check pf.log or the
> haproxy-portal log for URLs matching "/rfc7710".
>
> If so, it might be that the clients are too fast re-accessing that URL
> and determining they are still locked.
>
> In my case, forcing a disconnect via the CoA will cause the client to
> re-issue a DHCP request, and thus a new portal request?
>
>
>
>
> On Wed, Jan 11, 2023 at 2:36 PM Ian MacDonald via PacketFence-users <
> packetfence-users@lists.sourceforge.net> wrote:
>
>> Daniel,
>>
>> The random MAC would seem like an obvious culprit, but it is not.
>>
>> On an iPhone, if you click on the little "i" for information next to each
>> connection, you see that the OS uses the same [random] MAC per SSID, so it
>> will never change for a given WiFi network after it has connected.  It will
>> be different for each SSID/network.
>>
>> Since it does not change per SSID, the MAC can be used for auth, but
>> obviously the OUI will no longer indicate it is an Apple device.
>>
>> cheers,
>> Ian
>>
>>
>> On Wed, Jan 11, 2023 at 11:48 AM Daniel Silva  wrote:
>>
>>> Good afternoon,
>>>
>>> We are having the same problem, in the new version people have captured
>>> random macs, and they redirect to a page with information on how to disable
>>> random macs. Would that really be the best way to solve it? I don't know, I
>>> just know that it generates complaints, tickets in our environment. If you
>>> have any idea of how to work around this situation, please send it to me at
>>> dan...@unifor.br, thank you in advance.
>>>
>>>
>>>
>>>
>>> *Daniel Ricardo*
>>>
>>> Infrastructure Analyst
>>> NATI - Núcleo de Aplicação em Tecnologia da Informação
>>>
>>> Universidade de Fortaleza
>>>
>>> Tel.: (85) 3477.3302
>>>
>>>
>>>
>>> On Wed, Jan 11, 2023 at 10:52, Ian MacDonald via
>>> PacketFence-users wrote:
>>>
>>>> Hi Packetfence,
>>>>
>>>> We have been struggling with some newer model mobile devices with our
>>>> WiFi captive portal implementation using Packetfence, and have not seen any
>>>> change in the behavior on 11.0 thru 12.1 with our current connection
>>>> profile.
>>>>
>>>> We do not use an inline configuration, and we are now upgraded to 12.1
>>>> on Debian 11. We have not seen any changelog entries for specific
>>>> device enumeration related to our issue, so we believe there are some new
>>>> capabilities in how these platforms handle WiFi login that we are missing
>>>> configuration for.
>>>>
>>>> A bit more about our environment.
>>>>
>>

Re: [PacketFence-users] New PF 12.1 Installation on Debian 11.6 Bullseye

2023-01-17 Thread Ian MacDonald via PacketFence-users
It seems that the firewall protecting our management portal was responsible
for the issue we were having.

As noted in another thread, we were experiencing timeout issues pulling the
PF images during the installation process.  We were able to resolve this
issue with our firewall.  What is not absolutely clear to us is whether the
problems with the Configurator, which used the same firewall, were caused
by possible session issues on the firewall, or by corruption in the images
possibly created during the restarted installation image pulling process.

During our various attempts at v11.2, 12.0 and 12.1, we saw some strange
GUI artifacts.  For example, the pulldowns that allow selection of 'ssl' for
SMTP or 'timezone' for system clock were showing empty lists, but appeared
populated after some GUI Previous/Next operations, which suggests to us the
firewall and not image corruption was the root cause of our issues.

A freshly deployed 12.1, without any timeouts during image pull, is now
launching the Configurator with interface detection as we expect.

This issue is resolved from our perspective, and seemed to be squarely in
our court from the start.

cheers,
Ian

On Mon, Jan 16, 2023 at 4:40 PM Ian MacDonald  wrote:

> Hello Packetfence Users,
>
> We tested a fresh install of v12.1 on a freshly spun up Debian 11.6
> today.
>
>
> packetfence_12.1.0+20230116163629+748667390+0011+maintenance~12~1+bullseye1_all.deb
>
> Prior to installing packetfence, we deployed some basic packages, listed
> here as part of our default staging script.  I do not see any reason any of
> these would cause detection scripts for the interfaces to fail, and both
> ifconfig and ip are providing valid output on the CLI.
>
> apt install gnupg arptables dnsutils unzip pigz mtr-tiny less vim screen
> curl iperf3 wget tcpdump dialog subnetcalc vlan bridge-utils ethtool iftop
> iotop deborphan apt-show-versions ethtool pv systemd-timesyncd
>
> The configurator however did not detect any interfaces.  The simple
> interface configuration is shown below, for the Management, Registration
> and Isolation (eth0, eth1, eth2 respectively).
>
> ip addr output is below.  The configurator web page is stuck at step 1,
> with no interfaces shown to select for the Management network.  Our next
> step will be to add our interfaces into the configuration manually via CLI
> and see if the configurator picks them up, or possibly revert to  v11.1 and
> see if it happens there too.
>
> Any insights here on why this might be happening appreciated.
>
> pf5:~# ip addr
> 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group
> default qlen 1000
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> inet 127.0.0.1/8 scope host lo
>valid_lft forever preferred_lft forever
> inet6 ::1/128 scope host
>valid_lft forever preferred_lft forever
> 2: eth0:  mtu 1500 qdisc mq state UP
> group default qlen 1000
> link/ether 00:16:3e:dc:7d:fd brd ff:ff:ff:ff:ff:ff
> inet 10.2.1.2/24 brd 10.2.1.255 scope global eth0
>valid_lft forever preferred_lft forever
> inet6 fe80::216:3eff:fedc:7dfd/64 scope link
>valid_lft forever preferred_lft forever
> 3: eth1:  mtu 1500 qdisc mq state UP
> group default qlen 1000
> link/ether 00:16:3e:dc:7d:fe brd ff:ff:ff:ff:ff:ff
> inet 10.2.2.2/24 brd 10.2.2.255 scope global eth1
>valid_lft forever preferred_lft forever
> inet6 fe80::216:3eff:fedc:7dfe/64 scope link
>valid_lft forever preferred_lft forever
> 4: eth2:  mtu 1500 qdisc mq state UP
> group default qlen 1000
> link/ether 00:16:3e:dc:7d:ff brd ff:ff:ff:ff:ff:ff
> inet 10.2.3.2/24 brd 10.2.3.255 scope global eth2
>valid_lft forever preferred_lft forever
> inet6 fe80::216:3eff:fedc:7dff/64 scope link
>valid_lft forever preferred_lft forever
> 153: docker0:  mtu 1500 qdisc noqueue
> state DOWN group default
> link/ether 02:42:e7:4c:45:ca brd ff:ff:ff:ff:ff:ff
> inet 100.64.0.1/24 brd 100.64.0.255 scope global docker0
>valid_lft forever preferred_lft forever
>
___
PacketFence-users mailing list
PacketFence-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/packetfence-users


Re: [PacketFence-users] Strange Timeout Errors during image download for PF 11 and 12 Upgrades

2023-01-17 Thread Ian MacDonald via PacketFence-users
After ruling out DNS and IPv6, we were able to isolate the timeout issues
to our firewall.

We are no longer seeing issues related to the pulling of the images,
similar to this one below.

error pulling image configuration: Get
"https://ghcr.io/v2/inverse-inc/packetfence/pfpki/blobs/sha256:02d3d13b0a52bd4a460a944ad0b960b45ec71c35b8c89ae282b9a68142e2cb1d":
dial tcp: lookup ghcr.io on 8.8.8.8:53: read udp 10.2.1.2:51197->8.8.8.8:53:
i/o timeout
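
The pull errors quoted in this thread come in two shapes: a DNS lookup that times out against the resolver on port 53, and a TCP dial to an IPv6 address that is unreachable. A rough Python sketch for telling them apart when going through a saved upgrade log follows. The patterns and the name `classify_pull_error` are illustrative assumptions based only on the two messages quoted here, not an exhaustive list of Docker error formats.

```python
import re

# Illustrative patterns for the two failure shapes quoted in this thread.
DNS_TIMEOUT_RE = re.compile(
    r"lookup (?P<host>\S+) on (?P<resolver>[\d.]+)\s*:53: .*i/o timeout"
)
UNREACHABLE_RE = re.compile(
    r"dial tcp \[(?P<addr>[0-9a-fA-F:]+)\]:(?P<port>\d+): "
    r"connect: network is unreachable"
)


def classify_pull_error(text):
    """Label a pull error as a DNS timeout, an IPv6 dial failure, or other."""
    flat = " ".join(text.split())  # undo mail line wrapping
    m = DNS_TIMEOUT_RE.search(flat)
    if m:
        return ("dns-timeout", m.group("host"), m.group("resolver"))
    m = UNREACHABLE_RE.search(flat)
    if m:
        return ("ipv6-unreachable", m.group("addr"), m.group("port"))
    return ("other", None, None)


dns_err = (
    "dial tcp: lookup ghcr.io on 8.8.8.8:53: "
    "read udp 10.2.1.2:51197->8.8.8.8:53: i/o timeout"
)
print(classify_pull_error(dns_err))
```

Counting how often each label appears across several upgrade attempts would show whether the failures cluster on the resolver path or on the IPv6 path.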



On Thu, Jan 12, 2023 at 12:21 PM Ian MacDonald  wrote:

> An update,
>
> A maintenance release on 12.1 available today provided an opportunity to
> catch this timeout issue again, so we tcpdumped our DNS traffic during the
> apt upgrade process.
>
> The issue did not occur.  All we saw were what appeared to be normal
> requests for ghcr.io and pkg-containers.githubusercontent.com and we did
> not experience any timeouts.
>
> Will report back if we see this issue again,
>
> cheers,
> Ian
>
> 12:06:17.484308 IP 105.244.196.73.37145 > 105.244.196.155.53: 23286+ AAAA?
> ghcr.io. (25)
> 12:06:17.484357 IP 105.244.196.73.43760 > 105.244.196.155.53: 14626+ A?
> ghcr.io. (25)
> 12:06:17.484516 IP 105.244.196.155.53 > 105.244.196.73.37145: 23286 0/1/0
> (109)
> 12:06:17.484562 IP 105.244.196.155.53 > 105.244.196.73.43760: 14626 1/0/0
> A 140.82.114.34 (41)
> 12:06:17.649272 IP 105.244.196.73.35596 > 105.244.196.155.53: 37537+ A?
> pkg-containers.githubusercontent.com. (54)
> 12:06:17.649312 IP 105.244.196.73.59271 > 105.244.196.155.53: 34028+ AAAA?
> pkg-containers.githubusercontent.com. (54)
> 12:06:17.649490 IP 105.244.196.155.53 > 105.244.196.73.35596: 37537 4/0/0
> A 185.199.111.154, A 185.199.110.154, A 185.199.108.154, A 185.199.109.154
> (118)
> 12:06:17.649562 IP 105.244.196.155.53 > 105.244.196.73.59271: 34028 4/0/0
> AAAA 2606:50c0:8000::154, AAAA 2606:50c0:8002::154, AAAA
> 2606:50c0:8003::154, AAAA 2606:50c0:8001::154 (166)
>
> Thu Jan 12 12:02:55 EST 2023 - Images detected:
> - proxysql
> - haproxy-portal
> - pfsso
> - radiusd-eduroam
> - httpd.aaa
> - radiusd-cli
> - pfconfig
> - fingerbank-db
> - pfcmd
> - radiusd-load-balancer
> - httpd.admin_dispatcher
> - radiusd-acct
> - pfpki
> - httpd.portal
> - httpd.dispatcher
> - pfcron
> - pfconnector
> - httpd.webservices
> - radiusd-auth
> - haproxy-admin
> - pfqueue
> - api-frontend
> - pfperl-api
> ghcr.io/inverse-inc/packetfence/proxysql:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/haproxy-portal:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/pfsso:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/radiusd-eduroam:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/httpd.aaa:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/radiusd-cli:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/pfconfig:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/fingerbank-db:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/pfcmd:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/radiusd-load-balancer:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/httpd.admin_dispatcher:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/radiusd-acct:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/pfpki:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/httpd.portal:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/httpd.dispatcher:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/pfcron:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/pfconnector:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/httpd.webservices:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/radiusd-auth:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/haproxy-admin:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/pfqueue:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/api-frontend:maintenance-12-1
> ghcr.io/inverse-inc/packetfence/pfperl-api:maintenance-12-1
> Thu Jan 12 12:11:47 EST 2023 - Pull of images finished
> Thu Jan 12 12:11:49 EST 2023 - Tag of images finished
>
> On Tue, Jan 10, 2023 at 3:27 PM Ian MacDonald  wrote:
>
>> Hey PF Users,
>>
>> For recent versions (I believe 11.1, 12.0 and now 12.1, and possibly 11.0;
>> fairly certain, since the images below were downloaded from Inverse repos
>> all at once during the installation or upgrade process), we have been having
>> to restart the upgrade process due to timeout-related errors.  Often 1-5
>> re-executions are required to complete the upgrade process.
>>
>> - proxysql
>> - haproxy-portal
>> - pfsso
>> - radiusd-eduroam
>> - httpd.aaa
>> - radiusd-cli
>> - pfconfig
>> - fingerbank-db
>> - pfcmd
>> - radiusd-load-balancer
>> - httpd.admin_dispatcher
>> - radiusd-acct
>> - pfpki
>> - httpd.portal
>> - httpd.dispatcher
>> - pfcron
>> - pfconnector
>> - httpd.webservices
>> - radiusd-auth
>> - haproxy-admin
>> - pfqueue
>> - api-frontend
>> - pfperl-api
>>
>> We do not really understand why, as there does not appear to be any
>> connectivity or DNS lookup issues that would cause this type of behavior.
>>  Below are some of the output lines captured during our installation
>> 

Re: [PacketFence-users] New PF 12.1 Installation on Debian 11.6 Bullseye

2023-01-16 Thread Ian MacDonald via PacketFence-users
We restaged our environment.

https://github.com/inverse-inc/packetfence/issues/7403 describes some
similar symptoms, so I have added some additional debug output below from:
cat /etc/network/interfaces
/usr/local/pf/sbin/pfperl-api get /api/v1/config/interfaces | jq
ip -br a
docker container ls

Following some of the triage steps in 7403, I additionally enabled debug on
pfperl-api, restarted the service and hit the Wizard Step 1 again, and
while I was capturing output for this email (below), the interfaces all of
a sudden appeared. The log output is also below from the process
restart, and I note there are some WARN level messages related to an ip
command exiting with a non-zero value for interfaces that do not appear in
my interface list.

packetfence.log:Jan 17 00:33:19 pf5 pfperl-api-docker-wrapper[69063]:
pfperl-api(15) WARN: [mac:[undef]] Problem trying to run command: LANG=C
sudo ip -4 -o addr show veth17036df called from (eval). Child exited with
non-zero value 1 (pf::util::pf_run)
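
The WARN above names veth17036df, which looks like one end of a Docker-created veth pair. Such links come and go as containers restart, so a plausible (unconfirmed) reading is that the interface disappeared between being listed and being queried. As a minimal sketch, assuming the brief `ip -br a` output format, this Python filters a listing down to the interfaces one would expect to offer in the Configurator. The prefix list and the name `candidate_interfaces` are assumptions for illustration, and the `@if152` suffix in the sample is made up.

```python
# Assumed prefixes for loopback and container links that would normally be skipped.
VIRTUAL_PREFIXES = ("lo", "veth", "docker", "br-")


def candidate_interfaces(ip_br_output):
    """Return names from `ip -br a` output that are not loopback/container links."""
    names = []
    for line in ip_br_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0].split("@")[0]  # veth names appear as e.g. veth17036df@if152
        if not name.startswith(VIRTUAL_PREFIXES):
            names.append(name)
    return names


# Illustrative listing built from the addresses posted in this thread.
sample = """\
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             10.2.1.2/24 fe80::216:3eff:fedc:7dfd/64
eth1             UP             10.2.2.2/24 fe80::216:3eff:fedc:7dfe/64
eth2             UP             10.2.3.2/24 fe80::216:3eff:fedc:7dff/64
docker0          DOWN           100.64.0.1/24
veth17036df@if152 UP
"""
print(candidate_interfaces(sample))
```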

It seemed odd, so I reverted the DEBUG log to INFO, and rebooted the system
to see if I could reproduce the behavior.  Sure enough, sitting on the
wizard step 1, as I believe I had done previously, yielded no
interfaces in the Configurator and no WARN messages in the log.

It seemed that the service restart somehow resulted in the GUI being
updated.  So we restarted the service, and sure enough the interfaces
populated in the Configurator.  There was no sign of the WARN messages in the
log either, which seemed to be suppressed when set to INFO.

I re-ran the dump of the interfaces via API, and it gave the following
strange result

pf5:~# /usr/local/pf/sbin/pfperl-api get /api/v1/config/interfaces | jq
Device "veth7629818" does not exist.
parse error: Invalid numeric literal at line 1, column 19
Device "veth7629818" does not exist.
Device "veth6d6535d" does not exist.
Device "veth6d6535d" does not exist.
Unable to flush stdout: Broken pipe
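
The jq parse error suggests that the `Device "..." does not exist.` messages are reaching jq together with the JSON body. As a sketch under that assumption, and assuming the stray messages each occupy their own line, the following Python drops them before parsing. The function name `extract_json` and the JSON shape in the sample are hypothetical; the real API response was not captured in this thread.

```python
import json


def extract_json(mixed_output):
    """Drop the stray 'Device ...' lines and parse what remains as JSON."""
    kept = [
        line for line in mixed_output.splitlines()
        if not line.startswith("Device ")  # assumed prefix of the stray messages
    ]
    return json.loads("\n".join(kept))


# Hypothetical output: the JSON shape here is invented for illustration only.
mixed = (
    'Device "veth7629818" does not exist.\n'
    '{"items": [{"id": "eth0"}]}\n'
    'Device "veth6d6535d" does not exist.\n'
)
print(extract_json(mixed))
```

If the messages turn out to be written to stderr instead, redirecting stderr away from the pipe would be the simpler fix and no filtering would be needed.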

cheers,
Ian

packetfence.log:Jan 17 00:31:15 pf5 pfperl-api-docker-wrapper[69046]:
Running with args --sig-proxy=true --rm --name=pfperl-api
--add-host=containers-gateway.internal:host-gateway -h pf5  -v
/var/lib/mysql:/var/lib/mysql -v /etc/sudoers:/etc/sudoers -v
/etc/sudoers.d/:/etc/sudoers.d/ -v
/usr/local/fingerbank/conf:/usr/local/fingerbank/conf -v
/usr/local/fingerbank/db:/usr/local/fingerbank/db -v
/usr/local/pf/var/run:/usr/local/pf/var/run -ePF_UID=996 -e PF_GID=996
-eFINGERBANK_UID=997 -e FINGERBANK_GID=997 -eIS_A_CLASSIC_PF_CONTAINER=yes
-v /etc/localtime:/etc/localtime:ro -v
/usr/local/pf/conf:/usr/local/pf/conf -v
/usr/local/pf/raddb/certs:/usr/local/pf/raddb/certs --privileged -v
/run/systemd/system:/run/systemd/system -v
/var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket -v
/usr/local/fingerbank/conf:/usr/local/fingerbank/conf -v
/etc/sysconfig/:/etc/sysconfig -v /etc/network:/etc/network -v
/etc/resolv.conf:/etc/resolv.conf --network=host -e HOST_OS
-v/usr/local/pf/var/conf/:/usr/local/pf/var/conf/
-v/usr/local/pf/html/captive-portal/profile-templates:/usr/local/pf/html/captive-portal/profile-templates
packetfence.log:Jan 17 00:31:15 pf5 pfperl-api-docker-wrapper[69053]:
Error: No such container: pfperl-api
packetfence.log:Jan 17 00:31:17 pf5 pfperl-api-docker-wrapper[69063]:
pfperl-api(8) DEBUG: [mac:[undef]] invalid IP:  from cluster::__ANON__
(pf::util::valid_ip)
packetfence.log:Jan 17 00:31:19 pf5 pfperl-api-docker-wrapper[69063]:
pfperl-api(8) DEBUG: [mac:[undef]] cache get for namespace='configfiles',
key='/usr/local/pf/conf/roles.conf', cache='Redis:l1_cache', time='0ms':
MISS (not in cache) (CHI::Driver::_log_get_result)
packetfence.log:Jan 17 00:31:19 pf5 pfperl-api-docker-wrapper[69063]:
pfperl-api(8) DEBUG: [mac:[undef]] cache get for namespace='Default',
key='HASH(0x55f9c40983b0)', cache='RawMemory', time='0ms': MISS (not in
cache) (CHI::Driver::_log_get_result)
packetfence.log:Jan 17 00:31:19 pf5 pfperl-api-docker-wrapper[69063]:
pfperl-api(8) DEBUG: [mac:[undef]] cache set for namespace='Default',
key='{"encoding":null,"reconnect":"60","server":"containers-gateway.internal:6379"}',
size=1, expires='never', cache='RawMemory', time='0ms'
(CHI::Driver::_log_set_result)
packetfence.log:Jan 17 00:31:19 pf5 pfperl-api-docker-wrapper[69063]:
pfperl-api(8) DEBUG: [mac:[undef]] cache get for namespace='configfiles',
key='/usr/local/pf/conf/roles.conf', cache='Redis', time='1ms': HIT
(CHI::Driver::_log_get_result)
packetfence.log:Jan 17 00:31:19 pf5 pfperl-api-docker-wrapper[69063]:
pfperl-api(8) DEBUG: [mac:[undef]] cache set for namespace='configfiles',
key='/usr/local/pf/conf/roles.conf', size=1, expires='never',
cache='Redis:l1_cache', time='0ms' (CHI::Driver::_log_set_result)
packetfence.log:Jan 17 00:31:20 pf5 pfperl-api-docker-wrapper[69063]:
pfperl-api(8) DEBUG: [mac:[undef]] cache get for namespace='configfiles',
key='/usr/local/pf/conf/switches.conf', cache='Redis:l1_cache', time='0ms':
MISS (not in cache) (CHI::Driver::_log_get_result)

[PacketFence-users] New PF 12.1 Installation on Debian 11.6 Bullseye

2023-01-16 Thread Ian MacDonald via PacketFence-users
Hello Packetfence Users,

We tested a fresh install of v12.1 on a freshly spun up Debian 11.6 today.

packetfence_12.1.0+20230116163629+748667390+0011+maintenance~12~1+bullseye1_all.deb

Prior to installing packetfence, we deployed some basic packages, listed
here as part of our default staging script.  I do not see any reason any of
these would cause detection scripts for the interfaces to fail, and both
ifconfig and ip are providing valid output on the CLI.

apt install gnupg arptables dnsutils unzip pigz mtr-tiny less vim screen
curl iperf3 wget tcpdump dialog subnetcalc vlan bridge-utils ethtool iftop
iotop deborphan apt-show-versions ethtool pv systemd-timesyncd

The configurator however did not detect any interfaces.  The simple
interface configuration is shown below, for the Management, Registration
and Isolation (eth0, eth1, eth2 respectively).

ip addr output is below.  The configurator web page is stuck at step 1,
with no interfaces shown to select for the Management network.  Our next
step will be to add our interfaces into the configuration manually via CLI
and see if the configurator picks them up, or possibly revert to  v11.1 and
see if it happens there too.

Any insights here on why this might be happening appreciated.

pf5:~# ip addr
1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group
default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
   valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
   valid_lft forever preferred_lft forever
2: eth0:  mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether 00:16:3e:dc:7d:fd brd ff:ff:ff:ff:ff:ff
inet 10.2.1.2/24 brd 10.2.1.255 scope global eth0
   valid_lft forever preferred_lft forever
inet6 fe80::216:3eff:fedc:7dfd/64 scope link
   valid_lft forever preferred_lft forever
3: eth1:  mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether 00:16:3e:dc:7d:fe brd ff:ff:ff:ff:ff:ff
inet 10.2.2.2/24 brd 10.2.2.255 scope global eth1
   valid_lft forever preferred_lft forever
inet6 fe80::216:3eff:fedc:7dfe/64 scope link
   valid_lft forever preferred_lft forever
4: eth2:  mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether 00:16:3e:dc:7d:ff brd ff:ff:ff:ff:ff:ff
inet 10.2.3.2/24 brd 10.2.3.255 scope global eth2
   valid_lft forever preferred_lft forever
inet6 fe80::216:3eff:fedc:7dff/64 scope link
   valid_lft forever preferred_lft forever
153: docker0:  mtu 1500 qdisc noqueue
state DOWN group default
link/ether 02:42:e7:4c:45:ca brd ff:ff:ff:ff:ff:ff
inet 100.64.0.1/24 brd 100.64.0.255 scope global docker0
   valid_lft forever preferred_lft forever


[PacketFence-users] RADIUS Accounting service binding issue

2023-01-12 Thread Ian MacDonald via PacketFence-users
Hey PF Users,

Our environment is PF 12.1 on Debian 11.  It is in our lab.  Our production
is still on 11.0.

We recently upgraded this lab from 11.0, right through to 12.1.   We did
some light testing of our captive portal at each stage to make sure the
basic functions were working properly, and now that we have arrived at 12.1
we found some issues.

It's been a while since we debugged PF, so we are jumping in after a lot of
changes.

The first error we are dealing with is what looks like some sort of failure
to bind to port 1813.

Jan 11 14:03:15 pf4 radiusd-acct-docker-wrapper[1106167]: Wed Jan 11
14:03:15 2023 : Error: Failed binding to acct address * port 1813 bound to
server packetfence: Address already in use
Yet, we see that there is a localhost binding from process pfacct

pf4:/usr/local/pf/logs# netstat -tunap | grep 1813
udp        0      0 127.0.0.1:1813          0.0.0.0:*                           1741/pfacct
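
The two observations fit together: a UDP socket already bound to 127.0.0.1:1813 is enough to make a later bind to the wildcard address on the same port fail with "Address already in use", at least on Linux when neither socket sets SO_REUSEADDR or SO_REUSEPORT. A small Python sketch of that behaviour follows. It uses a kernel-assigned port instead of 1813 so it can run unprivileged, and the function name is illustrative.

```python
import errno
import socket


def wildcard_bind_conflicts():
    """Bind UDP on 127.0.0.1:<port>, then try 0.0.0.0:<port>; return errno or None."""
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        first.bind(("127.0.0.1", 0))        # kernel picks a free port
        port = first.getsockname()[1]
        try:
            second.bind(("0.0.0.0", port))  # same port, wildcard address
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


print(wildcard_bind_conflicts() == errno.EADDRINUSE)
```

In other words, the error from radiusd-acct does not require a second listener on every address; the localhost listener held by pfacct is sufficient to block the `*` bind.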

It looks like the service endlessly respawns with a similar set of errors,
shown in the logs below (Info/Warn lines removed).

It seems like PacketFence is somehow interfering with its own RADIUS binding,
or not detecting it properly, as it looks like pfacct is the PacketFence
radiusd-acct instance.  It's not clear to us what is going on here and why
it is respawning constantly.

Jan 12 18:03:57 pf4 radiusd-acct-docker-wrapper[54467]: Thu Jan 12 18:03:57
2023 : Error: Failed binding to acct address * port 1813 bound to server
packetfence: Address already in use
Jan 12 18:03:57 pf4 radiusd-acct-docker-wrapper[54467]: Thu Jan 12 18:03:57
2023 : Error: /usr/local/pf/raddb/acct.conf[8]: Error binding to port for
0.0.0.0 port 1813
Jan 12 18:07:13 pf4 radiusd-acct-docker-wrapper[5922]: Error: No such
container: radiusd-acct
Jan 12 18:07:13 pf4 radiusd-acct-docker-wrapper[5922]: Error: No such
container: 1
Jan 12 18:07:17 pf4 radiusd-acct-docker-wrapper[5936]: Failed to create
stream fd: No such file or directory

Browsing the journal for other error-level logs after startup, we found
some pfstats errors, that we do not think are related

pfstats.log:Jan 12 18:06:44 pf4 pfstats[1064]: t=2023-01-12T18:06:44-0500
lvl=eror msg="Error while creating statsd client: write udp 127.0.0.1:56358
->127.0.0.1:8125: write: connection refused" pid=1064
pfstats.log:Jan 12 18:06:46 pf4 pfstats[1064]: t=2023-01-12T18:06:46-0500
lvl=eror msg="Seek error on /usr/local/pf/logs/mariadb.log: seek
/usr/local/pf/logs/mariadb.log: invalid argument" pid=1064
pfstats.log:Jan 12 18:06:46 pf4 pfstats[1064]: t=2023-01-12T18:06:46-0500
lvl=eror msg="Error while processing metric config: Could not parse
duration: " pid=1064
pfstats.log:Jan 12 18:07:10 pf4 pfstats[1064]: t=2023-01-12T18:07:10-0500
lvl=eror msg="API error: Proxy Error" pid=1064
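
The pfstats records above are `key=value` pairs with double-quoted values where needed. A minimal Python sketch for splitting the part after the syslog prefix into a dictionary is shown below. It leans on `shlex` for the quoting, which is an approximation: a message containing an unbalanced quote or apostrophe would need a dedicated parser. The name `parse_logfmt` is illustrative.

```python
import shlex


def parse_logfmt(record):
    """Split a key=value record with optional double-quoted values into a dict."""
    fields = {}
    for token in shlex.split(record):
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens without '='
            fields[key] = value
    return fields


record = 't=2023-01-12T18:07:10-0500 lvl=eror msg="API error: Proxy Error" pid=1064'
print(parse_logfmt(record))
```

Grouping the parsed records by `msg` gives a quick count of which pfstats errors recur after startup.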


cheers,
Ian


Re: [PacketFence-users] Strange Timeout Errors during image download for PF 11 and 12 Upgrades

2023-01-12 Thread Ian MacDonald via PacketFence-users
An update,

A maintenance release on 12.1 available today provided an opportunity to
catch this timeout issue again, so we tcpdumped our DNS traffic during the
apt upgrade process.

The issue did not occur.  All we saw were what appeared to be normal
requests for ghcr.io and pkg-containers.githubusercontent.com and we did
not experience any timeouts.

Will report back if we see this issue again,

cheers,
Ian

12:06:17.484308 IP 105.244.196.73.37145 > 105.244.196.155.53: 23286+ AAAA?
ghcr.io. (25)
12:06:17.484357 IP 105.244.196.73.43760 > 105.244.196.155.53: 14626+ A?
ghcr.io. (25)
12:06:17.484516 IP 105.244.196.155.53 > 105.244.196.73.37145: 23286 0/1/0
(109)
12:06:17.484562 IP 105.244.196.155.53 > 105.244.196.73.43760: 14626 1/0/0 A
140.82.114.34 (41)
12:06:17.649272 IP 105.244.196.73.35596 > 105.244.196.155.53: 37537+ A?
pkg-containers.githubusercontent.com. (54)
12:06:17.649312 IP 105.244.196.73.59271 > 105.244.196.155.53: 34028+ AAAA?
pkg-containers.githubusercontent.com. (54)
12:06:17.649490 IP 105.244.196.155.53 > 105.244.196.73.35596: 37537 4/0/0 A
185.199.111.154, A 185.199.110.154, A 185.199.108.154, A 185.199.109.154
(118)
12:06:17.649562 IP 105.244.196.155.53 > 105.244.196.73.59271: 34028 4/0/0
AAAA 2606:50c0:8000::154, AAAA 2606:50c0:8002::154, AAAA
2606:50c0:8003::154, AAAA 2606:50c0:8001::154 (166)
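
For a longer capture, pairing queries with responses by transaction ID is a quick way to spot a lookup that never got an answer. A minimal Python sketch follows, assuming each tcpdump record has been joined onto one line and that the resolver is whichever endpoint uses port 53. The regular expression and the name `unanswered_queries` are illustrative; in the sample, the response to the third query has been left out on purpose to show a miss.

```python
import re

# Matches the tcpdump DNS lines shown above once each record is on one line.
DNS_RE = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>\S+): (?P<id>\d+)(?P<query>\+)? (?P<rest>.*)"
)


def unanswered_queries(lines):
    """Return DNS transaction IDs sent to port 53 that never got a response."""
    pending = set()
    for line in lines:
        m = DNS_RE.search(line)
        if m is None:
            continue
        if m.group("dst").endswith(".53"):
            pending.add(m.group("id"))       # client -> resolver
        elif m.group("src").endswith(".53"):
            pending.discard(m.group("id"))   # resolver -> client
    return sorted(pending)


capture = [
    "12:06:17.484357 IP 105.244.196.73.43760 > 105.244.196.155.53: 14626+ A? ghcr.io. (25)",
    "12:06:17.484562 IP 105.244.196.155.53 > 105.244.196.73.43760: 14626 1/0/0 A 140.82.114.34 (41)",
    "12:06:17.649272 IP 105.244.196.73.35596 > 105.244.196.155.53: 37537+ A? pkg-containers.githubusercontent.com. (54)",
]
print(unanswered_queries(capture))
```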

Thu Jan 12 12:02:55 EST 2023 - Images detected:
- proxysql
- haproxy-portal
- pfsso
- radiusd-eduroam
- httpd.aaa
- radiusd-cli
- pfconfig
- fingerbank-db
- pfcmd
- radiusd-load-balancer
- httpd.admin_dispatcher
- radiusd-acct
- pfpki
- httpd.portal
- httpd.dispatcher
- pfcron
- pfconnector
- httpd.webservices
- radiusd-auth
- haproxy-admin
- pfqueue
- api-frontend
- pfperl-api
ghcr.io/inverse-inc/packetfence/proxysql:maintenance-12-1
ghcr.io/inverse-inc/packetfence/haproxy-portal:maintenance-12-1
ghcr.io/inverse-inc/packetfence/pfsso:maintenance-12-1
ghcr.io/inverse-inc/packetfence/radiusd-eduroam:maintenance-12-1
ghcr.io/inverse-inc/packetfence/httpd.aaa:maintenance-12-1
ghcr.io/inverse-inc/packetfence/radiusd-cli:maintenance-12-1
ghcr.io/inverse-inc/packetfence/pfconfig:maintenance-12-1
ghcr.io/inverse-inc/packetfence/fingerbank-db:maintenance-12-1
ghcr.io/inverse-inc/packetfence/pfcmd:maintenance-12-1
ghcr.io/inverse-inc/packetfence/radiusd-load-balancer:maintenance-12-1
ghcr.io/inverse-inc/packetfence/httpd.admin_dispatcher:maintenance-12-1
ghcr.io/inverse-inc/packetfence/radiusd-acct:maintenance-12-1
ghcr.io/inverse-inc/packetfence/pfpki:maintenance-12-1
ghcr.io/inverse-inc/packetfence/httpd.portal:maintenance-12-1
ghcr.io/inverse-inc/packetfence/httpd.dispatcher:maintenance-12-1
ghcr.io/inverse-inc/packetfence/pfcron:maintenance-12-1
ghcr.io/inverse-inc/packetfence/pfconnector:maintenance-12-1
ghcr.io/inverse-inc/packetfence/httpd.webservices:maintenance-12-1
ghcr.io/inverse-inc/packetfence/radiusd-auth:maintenance-12-1
ghcr.io/inverse-inc/packetfence/haproxy-admin:maintenance-12-1
ghcr.io/inverse-inc/packetfence/pfqueue:maintenance-12-1
ghcr.io/inverse-inc/packetfence/api-frontend:maintenance-12-1
ghcr.io/inverse-inc/packetfence/pfperl-api:maintenance-12-1
Thu Jan 12 12:11:47 EST 2023 - Pull of images finished
Thu Jan 12 12:11:49 EST 2023 - Tag of images finished
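
Since the images are pulled one by one, a transient timeout on a single image does not have to mean re-running the whole upgrade. The sketch below wraps `docker pull` in a retry loop with a doubling delay. It is an illustration only and not how the PacketFence upgrade script works; the function name, the attempt count and the delays are assumptions.

```python
import subprocess
import time


def pull_with_retry(image, attempts=5, delay=2.0, run=subprocess.run, sleep=time.sleep):
    """Run `docker pull IMAGE`, retrying with a doubling delay; return attempts used."""
    for attempt in range(1, attempts + 1):
        result = run(["docker", "pull", image])
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            sleep(delay)  # back off before the next try
            delay *= 2
    raise RuntimeError(f"pull of {image} failed after {attempts} attempts")


# Example call (not executed here):
# pull_with_retry("ghcr.io/inverse-inc/packetfence/pfpki:maintenance-12-1")
```

The `run` and `sleep` parameters exist so the loop can be exercised without Docker or real waiting.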

On Tue, Jan 10, 2023 at 3:27 PM Ian MacDonald  wrote:

> Hey PF Users,
>
> For recent versions (I believe 11.1, 12.0 and now 12.1, and possibly 11.0;
> fairly certain, since the images below were downloaded from Inverse repos
> all at once during the installation or upgrade process), we have been having
> to restart the upgrade process due to timeout-related errors.  Often 1-5
> re-executions are required to complete the upgrade process.
>
> - proxysql
> - haproxy-portal
> - pfsso
> - radiusd-eduroam
> - httpd.aaa
> - radiusd-cli
> - pfconfig
> - fingerbank-db
> - pfcmd
> - radiusd-load-balancer
> - httpd.admin_dispatcher
> - radiusd-acct
> - pfpki
> - httpd.portal
> - httpd.dispatcher
> - pfcron
> - pfconnector
> - httpd.webservices
> - radiusd-auth
> - haproxy-admin
> - pfqueue
> - api-frontend
> - pfperl-api
>
> We do not really understand why, as there does not appear to be any
> connectivity or DNS lookup issues that would cause this type of behavior.
>  Below are some of the output lines captured during our installation
> process during a recent upgrade from 11.1 to 12.0 and then again from 12.0
> to 12.1.
>
> In a minor 12.0 upgrade we saw this one referencing an IPv6 github
> address, yet the system is IPv4, so no idea why it is attempting IPv6
> error pulling image configuration: Get "
> https://pkg-containers.githubusercontent.com/ghcr1/blobs/sha256:c31d236d97e3beb137f8c2b02bfbe88d0093b5592d9f181935c9c03a0132a142?se=2023-01-10T14%3A40%3A00Z=%2B
> HBahj6l0521Bm%2FB40v51MhZmNHztLYxzxBgJlsefEE%3D=r=https=b=2019-12-12":
> dial tcp [2606:50c0:8001::154]:443: connect: network is unreachable
>
> In another 12.0 upgrade attempt, we saw this one, which looks like a
> timeout to our DNS recursor, but to which there are no I/O 

Re: [PacketFence-users] Strange Timeout Errors during image download for PF 11 and 12 Upgrades

2023-01-11 Thread Ian MacDonald via PacketFence-users
Zammit,

No proxy.  The management VLAN operates the portal listener behind an
OpenWRT firewall, which forwards 80/443 to the PacketFence portal.   iPerf
shows no bandwidth issues whatsoever in our lab, and we have never seen
any other connection issues during apt upgrades, inbound portal
connections for authentication/login, etc.  Qualify that with the fact it
is a lab environment, and only sees real use when upgrades and testing are
underway.

Unfortunately, we have not upgraded our Prod yet (not until we get this other
pesky mobile login / detection issue from my other thread sorted).  That
upgrade will tell us if we see the same thing in our twinned Prod setup, which
is on a geographically separate connection and location but has all the same
pieces.  If it repeats there, that would further rule out anything network
related; if it is fine, we would be looking squarely at the specific DNS and
NAT pieces involved and the version differences between our lab and prod
bits, which amount to a Xen Hypervisor, OpenWRT firewall and PowerDNS recursor
beyond physical L2 connectivity.

The IPv6 lookup had me a bit baffled, and I am wondering if there is
something unique about the DNS lookups for the image pulls.  The
timeout appears to happen on DNS frames in most cases (port 53), which
might point us at an actual issue with timeouts at the DNS resolver.  That
resolver is ours, running PowerDNS and doing about 40-50 queries per second
for other IPv4-only hosts, which are not reporting any DNS issues.

Since the retry attempts eventually work, and upgrades only happen once in a
while, we have not worked very hard to figure it out.  There is probably more
debugging we could do to observe an actual DNS timeout by isolating
that traffic during the image pull phase of the upgrade.  We would
probably be able to do that going from 11.1-12.0-12.1, if we see this
during our 11.0-11.1 upgrade of Prod, after we fix the mobile device
captive portal on this 12.1 instance.
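
One cheap way to look for intermittent resolver timeouts outside of an upgrade is to resolve the registry hostname repeatedly and record failures and the slowest lookup. The Python sketch below does that with `socket.getaddrinfo`, restricted to IPv4. The function name and defaults are illustrative, and a local cache (nscd, systemd-resolved or similar) could mask problems at the upstream recursor.

```python
import socket
import time


def probe_resolver(host, attempts=20, resolve=socket.getaddrinfo, clock=time.monotonic):
    """Resolve HOST repeatedly; return (failures, slowest_seconds)."""
    failures = 0
    slowest = 0.0
    for _ in range(attempts):
        start = clock()
        try:
            resolve(host, 443, socket.AF_INET)  # IPv4 only
        except OSError:                         # socket.gaierror is a subclass
            failures += 1
        slowest = max(slowest, clock() - start)
    return failures, slowest


# Example call (not executed here, as it needs network access):
# print(probe_resolver("ghcr.io"))
```

Running it in a loop during the image pull phase would show whether lookups fail at the same moments the pulls do.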

cheers,
Ian

pf4:~# iperf3 -c 105.244.196.67
Connecting to host 105.244.196.67, port 5201
[  5] local 10.2.1.2 port 46870 connected to 105.244.196.67 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   284 MBytes  2.38 Gbits/sec    0   1.56 MBytes
[  5]   1.00-2.00   sec   309 MBytes  2.59 Gbits/sec    0   1.56 MBytes
[  5]   2.00-3.00   sec   314 MBytes  2.63 Gbits/sec    0   1.56 MBytes

pf4:~# iperf3 -R -c 105.244.196.67
Connecting to host 105.244.196.67, port 5201
Reverse mode, remote host 105.244.196.67 is sending
[  5] local 10.2.1.2 port 51046 connected to 105.244.196.67 port 5201
[ ID] Interval   Transfer Bitrate
[  5]   0.00-1.00   sec   280 MBytes  2.35 Gbits/sec
[  5]   1.00-2.00   sec   333 MBytes  2.80 Gbits/sec
[  5]   2.00-3.00   sec   315 MBytes  2.64 Gbits/sec



On Tue, Jan 10, 2023 at 4:06 PM Zammit, Ludovic  wrote:

> Hello Ian,
>
> Is your PF server behind an HTTP proxy?
>
> Thanks,
>
>
> *Ludovic Zammit*
> *Product Support Engineer Principal Lead*
> *Cell:* +1.613.670.8432
> Akamai Technologies - Inverse
> 145 Broadway
> Cambridge, MA 02142

Re: [PacketFence-users] Newer Model iPhones and Android Devices showing MAC:0 in Captive Portal

2023-01-11 Thread Ian MacDonald via PacketFence-users
Daniel,

The random MAC would seem like an obvious culprit, but it is not.

On an iPhone, if you click on the little "i" for information next to each
connection, you see that the OS uses the same [random] MAC per SSID, so it
will never change for a given WiFi network after it has connected.  It will
be different for each SSID/network.

Since it does not change per SSID, the MAC can be used for auth, but
obviously the OUI will no longer indicate it is an Apple device.
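
As a quick illustration of why the OUI stops being useful: randomized
(private) WiFi MACs set the "locally administered" bit in the first octet,
so the first three octets are not a vendor-assigned prefix.  A minimal
sketch of the check follows; is_locally_administered is a hypothetical
helper name, and the da:a1:19 address in the comment is made up.

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the MAC has the locally administered (U/L) bit set.

    Randomized WiFi MACs set this bit, so their first three octets are not
    a vendor-assigned OUI.
    """
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_octet & 0x02)


# 50:d4:f7:6b:87:0e -> False (vendor-assigned prefix)
# da:a1:19:00:00:01 -> True  (made-up example of a randomized address)
```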

cheers,
Ian


On Wed, Jan 11, 2023 at 11:48 AM Daniel Silva  wrote:

> Good afternoon,
>
> We are having the same problem. In the new version, people with random
> MACs are captured and redirected to a page with information on how to
> disable random MACs. Would that really be the best way to solve it? I don't
> know; I just know that it generates complaints and tickets in our
> environment. If you have any idea how to work around this situation, please
> send it to me at dan...@unifor.br. Thank you in advance.
>
>
>
>
> *Daniel Ricardo*
>
> Infrastructure Analyst
> NATI - Núcleo de Aplicação em Tecnologia da Informação
>
> Universidade de Fortaleza
>
> Tel.: (85) 3477.3302

[PacketFence-users] Newer Model iPhones and Android Devices showing MAC:0 in Captive Portal

2023-01-11 Thread Ian MacDonald via PacketFence-users
Hi Packetfence,

We have been struggling with some newer model mobile devices with our WiFi
captive portal implementation using Packetfence, and have not seen any
change in the behavior on 11.0 thru 12.1 with our current connection
profile.

We do not use an inline configuration, and we are now upgraded to 12.1 on
Debian 11.  We have not seen any changelog entries for device enumeration
related to our issue, so we believe there are some new capabilities in how
these platforms handle WiFi login that we are missing configuration for.

A bit more about our environment.

Our switch groups are OpenWRT/hostapd based with CoA configured for
Registration/Isolation/Management VLANs connected to our server, and a
Default local VLAN for Internet access that varies depending on the switch
location.

Based on MAC, devices connect, and receive Internet access for a short
period of time, before completing email-based activation to grant them a
longer access window.

All devices detect the WiFi login request on the registration network and
prompt users for an email to complete authentication.  When the email is
sent,  PF completes radius accounting, sets the device MAC as registered
and issues a CoA to boot the device after a brief delay to account for
hostapd delay in processing radius changes.  It is at this point where
the process fails for some devices.

Samsung Galaxy S9 / S10 devices (Android 12) move through this step and are
handed the default redirection page per their connection profile.

Newer S22 and iPhone 14 devices are shown a Packetfence error occurred
page, which shows the IP address of the gateway for the Default VLAN and
MAC:0.  So they made it to the Normal/Default VLAN and in packetfence they
are registered.

It seems that right after the CoA disconnect, when the device reconnects to
the WiFi on the correct VLAN, it detects a sign-in requirement (or simply
retries the login page) and heads to the portal on the server, which feeds
it an error message.  But the device is on the Default VLAN, so a smart user
can cancel the login and choose to "stay connected without Internet" and
they are fine.

Clearly the Internet detection is failing for the device, or it believes
this due to cancelling a login process.

Just reading this makes me think perhaps we are missing a setting defined
for newer devices, as the config is pretty simple (from the command line
anyway).   I think it is time to try a default connection profile
configured from scratch and see if it adds something we are missing.

If anyone has experience with this issue, feel free to post back so we can
shortcut our triage, and finally move on to upgrading our production
systems from 11.0 and fixing these newer mobile devices.

cheers,
Ian

Our profile in testing:
[Lab]
description=Captive Portal
filter=ssid:PFTest_WiFi
locale=
redirecturl=https://someplace.com/
logo=/common/logo.png
sources=email
preregistration=disabled

Our switch config has one setting I am not sure about, but otherwise has
our VLANs, radius creds in it.
always_trigger=1

Our pf.conf [Captive Portal] includes:
wispr_redirection=enabled
network_detection_ip=
network_redirect_delay=25s
___
PacketFence-users mailing list
PacketFence-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/packetfence-users


[PacketFence-users] Strange Timeout Errors during image download for PF 11 and 12 Upgrades

2023-01-10 Thread Ian MacDonald via PacketFence-users
Hey PF Users,

For recent versions, I believe 11.1, 12.0 and now 12.1, and possibly 11.0
(fairly certain, since the images below were downloaded from Inverse repos
all at once during the installation or upgrade process), we have been having
to restart the upgrade process due to timeout-related errors.  Often 1-5
re-executions are required to complete the upgrade process.

- proxysql
- haproxy-portal
- pfsso
- radiusd-eduroam
- httpd.aaa
- radiusd-cli
- pfconfig
- fingerbank-db
- pfcmd
- radiusd-load-balancer
- httpd.admin_dispatcher
- radiusd-acct
- pfpki
- httpd.portal
- httpd.dispatcher
- pfcron
- pfconnector
- httpd.webservices
- radiusd-auth
- haproxy-admin
- pfqueue
- api-frontend
- pfperl-api

We do not really understand why, as there does not appear to be any
connectivity or DNS lookup issues that would cause this type of behavior.
 Below are some of the output lines captured during our installation
process during a recent upgrade from 11.1 to 12.0 and then again from 12.0
to 12.1.

In a minor 12.0 upgrade we saw this one referencing an IPv6 github address,
yet the system is IPv4, so no idea why it is attempting IPv6
error pulling image configuration: Get "
https://pkg-containers.githubusercontent.com/ghcr1/blobs/sha256:c31d236d97e3beb137f8c2b02bfbe88d0093b5592d9f181935c9c03a0132a142?se=2023-01-10T14%3A40%3A00Z=%2B
HBahj6l0521Bm%2FB40v51MhZmNHztLYxzxBgJlsefEE%3D=r=https=b=2019-12-12":
dial tcp [2606:50c0:8001::154]:443: connect: network is unreachable

In another 12.0 upgrade attempt, we saw this one, which looks like a
timeout to our DNS recursor, although we can see no I/O-bound or otherwise
restrictive conditions on it.

error pulling image configuration: Get "https://ghcr.io/v2/inverse-inc/packetfence/pfcmd/blobs/sha256:5631317df2b6910aa8da1f20a382c04ecc0ffb572aeb7fd3201a18b0bee18633":
dial tcp: lookup ghcr.io on 105.244.196.155:53: read udp 10.2.1.2:35975->105.244.196.155:53: i/o timeout

In 12.1 using the do-upgrade script we saw these similar messages

Error response from daemon: Get "https://ghcr.io/v2/": dial tcp: lookup
ghcr.io on 105.244.196.155:53: read udp 10.2.1.2:60065->105.244.196.155:53:
i/o timeout

Error response from daemon: Head "
https://ghcr.io/v2/inverse-inc/packetfence/radiusd-eduroam/manifests/maintenance-12-1":
dial tcp: lookup ghcr.io on 104.244.196.155:53: read udp 10.2.1.2:55582->
104.244.196.155:53: i/o timeout
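
To compare attempts, the captured output can be tallied per resolver.  The
sketch below assumes the messages keep the shape shown above ("lookup NAME
on IP:53: read udp ...: i/o timeout"); count_dns_timeouts is a hypothetical
helper name, and it only counts what it matches, nothing more.

```python
import re
from collections import Counter

# Matches the resolver failure shape seen in the messages above, e.g.
# "lookup ghcr.io on 105.244.196.155:53: read udp A:P->B:53: i/o timeout"
DNS_TIMEOUT = re.compile(
    r"lookup (?P<name>\S+) on (?P<resolver>[\d.]+):53: "
    r"read udp [\d.]+:\d+\s*->\s*[\d.]+:\d+: i/o timeout"
)


def count_dns_timeouts(log_text: str) -> Counter:
    """Count DNS i/o timeouts per (hostname, resolver) pair in captured output."""
    # Terminals and mail clients wrap long lines, so collapse all whitespace
    # into single spaces before matching.
    flat = " ".join(log_text.split())
    return Counter((m["name"], m["resolver"]) for m in DNS_TIMEOUT.finditer(flat))
```

Feeding it the saved console output of each upgrade attempt gives a count
per resolver address, which makes it easier to see whether one resolver
accounts for all of the timeouts.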

We just repeated, which seemed like a good time to send this email, and it
worked (3rd time just now on 12.1).

Tue Jan 10 15:22:56 EST 2023 - Pull of images finished
Tue Jan 10 15:22:58 EST 2023 - Tag of images finished
Tue Jan 10 15:23:45 EST 2023 - Previous images cleaned

It seems very odd that we get these timeouts when doing the image
downloading.   Maybe somebody else has seen this or knows why it may be
occurring during this stage of the installation/upgrade process.

cheers,
Ian


[PacketFence-users] PF11.0 Switch Configuration Ignoring CoA/Disconnect Port

2021-10-20 Thread Ian MacDonald via PacketFence-users
We are using an OpenWRT 21.02.0 Hostapd client as a switch.

Our PF is Debian 11 with PF 11 on the most recent stable production build:
11.0.0+20211019052209+390787654+0011+maintenance~11.0+bullseye1

Our PF server is configured with management IP 10.2.1.2 and our switch
10.2.1.60 in the information shared below.

Both our switch group and our switch have the disconnectPort and
coaPort specified as 3798, as shown below from our switches.conf

[group Office_APS]
guestVlan=73
VoIPDHCPDetect=N
VoIPLLDPDetect=N
guestRole=default
type=Hostapd
VoIPCDPDetect=N
description=office test devices
deauthMethod=RADIUS
gamingRole=default
defaultVlan=73
isolationVlan=82
gamingVlan=73
RoleMap=Y
registrationVlan=81
radiusSecret=
always_trigger=1

*disconnectPort=3798*
*coaPort=3798*

[10.2.1.60]
guestVlan=73

*disconnectPort=3798*
*coaPort=3798*
defaultVlan=73
group=Office_APS
description=TEST-AP
gamingVlan=73

We can see our listener on 3798 on our hostapd client / switch.

~# netstat -tunap | grep 3798
udp        0      0 0.0.0.0:3798            0.0.0.0:*                           1617/hostapd

Watching the packetflow, PF11 is still sending on 3799, despite being
configured to use 3798.   It seems to be ignoring the configuration for
some reason.  The result of not receiving the disconnect on the correct
port is that clients are not being disconnected from registration vlan.

11:51:48.021157 IP 10.2.1.60.58398 > 10.2.1.2.1812: RADIUS, Access-Request (1), id: 0x21 length: 165
11:51:48.199840 IP 10.2.1.2.1812 > 10.2.1.60.58398: RADIUS, Access-Accept (2), id: 0x21 length: 36
11:51:48.636694 IP 10.2.1.60.32955 > 10.2.1.2.1813: RADIUS, Accounting-Request (4), id: 0x22 length: 183
11:51:48.772328 IP 10.2.1.2.1813 > 10.2.1.60.32955: RADIUS, Accounting-Response (5), id: 0x22 length: 35
11:51:48.853940 IP 10.2.1.60.32955 > 10.2.1.2.1813: RADIUS, Accounting-Request (4), id: 0x23 length: 225
11:51:48.856610 IP 10.2.1.60.32955 > 10.2.1.2.1813: RADIUS, Accounting-Request (4), id: 0x24 length: 183
11:51:48.891889 IP 10.2.1.2.1813 > 10.2.1.60.32955: RADIUS, Accounting-Response (5), id: 0x23 length: 35
11:51:48.894938 IP 10.2.1.2.22 > 10.2.1.5.32778: Flags [P.], seq 100:200, ack 1, win 501, options [nop,nop,TS val 1817400362 ecr 1256487595], length 100
11:51:48.902378 IP 10.2.1.2.1813 > 10.2.1.60.32955: RADIUS, Accounting-Response (5), id: 0x24 length: 35
11:51:53.204975 ARP, Request who-has 10.2.1.2 tell 10.2.1.60, length 28
11:51:53.205396 ARP, Reply 10.2.1.2 is-at 00:16:3e:dc:9d:fd (oui Unknown), length 42
11:51:53.361125 ARP, Request who-has 10.2.1.60 tell 10.2.1.2, length 42
11:51:53.361213 ARP, Reply 10.2.1.60 is-at 50:d4:f7:6b:87:0e (oui Unknown), length 28
*11:52:18.919296 IP 10.2.1.2.54653 > 10.2.1.60.3799: RADIUS, Disconnect-Request (40), id: 0x31 length: 39*

Is it possible that the disconnect port is learned from something other
than the switch configuration, or is this possibly a bug in the radius
configuration and port enumeration?
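
For reference, this is the port we would expect to be used, worked out from
the file itself.  It is only a sketch of our expectation: the switch ->
group -> default lookup order is an assumption, not a description of how PF
resolves it internally, effective_port is a hypothetical helper name, and
3799 is the standard Disconnect/CoA port from RFC 5176.

```python
import configparser

DEFAULT_DAE_PORT = 3799  # RFC 5176 default port for Disconnect/CoA requests


def effective_port(conf_text: str, switch: str, key: str) -> int:
    """Resolve `key` for a switch, falling back to its group, then the default.

    The switch -> group -> default order is an assumption made for this
    sketch, not a description of PacketFence's internals.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep camelCase keys such as disconnectPort
    parser.read_string(conf_text)

    section = parser[switch]
    if key in section:
        return int(section[key])
    group = section.get("group")
    group_section = f"group {group}"
    if group and group_section in parser and key in parser[group_section]:
        return int(parser[group_section][key])
    return DEFAULT_DAE_PORT
```

With the switches.conf excerpt above, effective_port(text, "10.2.1.60",
"disconnectPort") gives 3798, which is not what we see on the wire.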

Any insights here appreciated.

cheers,
ian


[PacketFence-users] Hostapd Captive Portal OpenWRT 19.07.8 and PF11.0

2021-09-26 Thread Ian MacDonald via PacketFence-users
Attached is a modified /lib/netifd/hostapd.sh that will allow you to have a
captive portal on a single radio with OpenWRT authenticating by email.
Seems to work with just about any OpenWRT device allowing a lot of
flexibility across hardware and wireless radios.  Maybe this will be useful
for others.

If you diff the current hostapd of 19.07.8 against the attached file, it is
pretty straightforward to cut and paste the auth "none)" section with the
radius bits to work with any OpenWRT hostapd.sh version.

We have not checked whether you can add another wan IP to allow hostapd
to handle CoA on two radios, and set up the single AP as two switches in PF
with different NASIDs, but last we checked at 17.x, the one-radio
configuration was all we could muster to make CoA work.

The configuration example below applies to the 5GHz radio on a ZyXEL NBG6817
(Armor Z2), where its logical wan is kernel interface eth0 connected via
switchport 0, and the physical WAN port is connected via switchport 5 with
tagged PF VLANs (70-72) and a tagged local access network VLAN (73) passing
through the switchchip ('0t 5t').

A complementary /etc/config/wireless:

config wifi-iface 'default_radio0'
option device radio0
option mode 'ap'
option encryption 'none'
option vlan_file '/etc/config/hostapd.vlan'
option vlan_tagged_interface 'eth0'
option vlan_bridge 'br-vlan'
option vlan_naming '0'
option dynamic_vlan '2'
option auth_port '1812'
option auth_server '<>'
option auth_secret '<>'
option dae_port '3799'
option dae_client '<>'
option dae_secret '<>'
option acct_port '1813'
option acct_server '<>'
option acct_secret '<>'
option nasid '<>'
option ssid '<>'

Additionally, you will want to ensure your OpenWRT firewall allows
connections from your PF Server for CoA/Accounting sessions in
/etc/config/firewall.  The blanket rules below allow for ssh, http, etc. as
well.

config rule
   option name 'PF_1'
   option src 'wan'
   option src_ip '<>'
   option family 'ipv4'
   option proto 'icmp'
   option target 'ACCEPT'

config rule
   option name 'PF_2'
   option src 'wan'
   option src_ip '<>'
   option family 'ipv4'
   option proto 'tcp'
   option target 'ACCEPT'

config rule
   option name 'PF_3'
   option src 'wan'
   option src_ip '<>'
   option family 'ipv4'
   option proto 'udp'
   option target 'ACCEPT'

You will need to define this configuration for vlan changes; wlan0 = radio0
from wireless config.

PF_AP# cat /etc/config/hostapd.vlan
*   wlan0.#
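
The "*" entry is hostapd's wildcard for all VLAN IDs, and the first "#" in
the interface name is replaced with the VLAN ID when the interface is
created.  A tiny sketch of that substitution, only to show the resulting
names (vlan_interface is a hypothetical helper, not part of hostapd):

```python
def vlan_interface(template: str, vlan_id: int) -> str:
    """Expand a hostapd vlan_file wildcard template for one VLAN ID."""
    # Only the first '#' in the interface name is substituted.
    return template.replace("#", str(vlan_id), 1)


# vlan_interface("wlan0.#", 71) gives "wlan0.71"
```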

You will need to drop in hostapd in place of wpad-basic; drop in your
modified hostapd.sh after this step.

opkg update
opkg remove wpad-basic wpad-basic-wolfssl
opkg install wpad hostapd-common

You might want some extra tools; these are optional but may be helpful.

opkg install luci ipset screen iftop tcpdump curl mtr wget ca-certificates \
  diffutils iperf3 iwinfo snmpd ethtool sqm-scripts luci-app-sqm

Optionally add the hostapd.sh path to sysupgrade.conf if you want it to
hang around across upgrades.

In this instance, the AP network configuration uses a trunk on wan;
Understanding the nuances of the switchchip and kernel network interfaces
makes this unique to the device.  In this case, the PF Management network
is VLAN 70,  Registration is VLAN 71,  Isolation is VLAN 72 and  VLAN 73
provides Internet access.  It is not inline, so VLAN 73 is the unprotected,
normal network that has DNS/DHCP, etc. provided to clients independent of
the Packetfence configuration.  Excerpts from /etc/config/network are below

config interface 'wan'
  option force_link '1'
  option proto 'static'
  option ipaddr '<>'
  option netmask '255.255.255.0'
  option gateway '<>'
  option dns '<>'
  option ifname 'eth0.70'

config interface 'vlan71'
  option type 'bridge'
  option force_link '1'
  option ifname 'eth0.71'

config interface 'vlan72'
  option force_link '1'
  option type 'bridge'
  option ifname 'eth0.72'

config interface 'vlan73'
  option type 'bridge'
  option force_link '1'
  option ifname 'eth0.73'

config switch_vlan
  option device 'switch0'
  option vlan '70'
  option ports '0t 5t'
  option vid '70'

config switch_vlan
  option device 'switch0'
  option vlan '71'
  option ports '0t 5t'
  option vid '71'

config switch_vlan
  option device 'switch0'
  option vlan '72'
  option ports '0t 5t'
  option vid '72'

config switch_vlan
  option device 'switch0'
  option vlan '73'
  option ports '0t 5t'
  option vid '73'
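
Since the four switch_vlan stanzas differ only in the VLAN ID, they can be
generated rather than typed.  A small sketch, assuming the same device
('switch0') and port list ('0t 5t') as above; switch_vlan_stanzas is a
hypothetical helper name and other hardware will need different values.

```python
def switch_vlan_stanzas(vlans, device="switch0", ports="0t 5t"):
    """Render one UCI `config switch_vlan` stanza per VLAN ID."""
    stanzas = []
    for vid in vlans:
        stanzas.append(
            "config switch_vlan\n"
            f"  option device '{device}'\n"
            f"  option vlan '{vid}'\n"
            f"  option ports '{ports}'\n"
            f"  option vid '{vid}'\n"
        )
    # A blank line between stanzas, as in /etc/config/network.
    return "\n".join(stanzas)
```

Printing switch_vlan_stanzas(range(70, 74)) produces the four stanzas for
VLANs 70 through 73.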

You can use logread and tcpdump to debug and/or watch radius/CoA
interactions.

The packetfence side requires appropriate switch, connection profiles,
auth, smtp, certs, interfaces, etc.  We use an email address to authorize
access for 1D beyond the default 10min.  Although we configure by GUI, some
hints at the resulting configuration are below, which may be helpful when
looking at your own resulting configuration.

profiles.conf

[<>]
filter=ssid:<>

[PacketFence-users] Some Debian Upgrade Notes

2021-09-24 Thread Ian MacDonald via PacketFence-users
We just completed a long overdue upgrade from 9.3 to 11.0.  We are happy to
share that at least one of our instances has been evergreening since PF 6
on Debian 8.

A few things we noted during the upgrade, consistently across our
environments that might be noteworthy for Debian users

1) The 10.1 to 10.2 SQL schema upgrade script is broken in the Debian
packages.  Specifically, we had to replace the section below from
https://fossies.org/linux/packetfence/db/upgrade-10.1.0-10.2.0.sql to avoid
an error about invalid SQL.  Not sure why it is different in the current
10.3 packages, possibly a minor regression during packaging for Debian.
Notably it is corrected in 11.0, but not in the 10.3 packages that would be
present for anyone doing an upgrade from before 10.2

\! echo "Adding default radreply row";
INSERT INTO `radreply` (`tenant_id`, `username`, `attribute`, `value`, `op`)
SELECT * FROM (SELECT '1', '00:00:00:00:00:00','User-Name', '*', '=*') as x
 WHERE NOT EXISTS ( SELECT 1 FROM `radreply` WHERE `tenant_id`='1'
   AND `username`='00:00:00:00:00:00' AND `attribute`='User-Name'
   AND `value`='*' AND `op`='=*');
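
The point of that statement is that it can be run more than once without
creating a duplicate row.  The sketch below shows the same
insert-only-if-missing pattern against an in-memory SQLite table, not
against the PF database: the simplified table, the column types and the
helper names are assumptions for illustration, and the SQL is written in
SQLite's dialect rather than copied from the upgrade script.

```python
import sqlite3

ROW = ("1", "00:00:00:00:00:00", "User-Name", "*", "=*")

INSERT_IF_MISSING = """
INSERT INTO radreply (tenant_id, username, attribute, value, op)
SELECT ?, ?, ?, ?, ?
 WHERE NOT EXISTS (
   SELECT 1 FROM radreply
    WHERE tenant_id = ? AND username = ? AND attribute = ?
      AND value = ? AND op = ?)
"""


def add_default_radreply(conn: sqlite3.Connection) -> int:
    """Insert the default row only if it is absent; return the row count."""
    # The row values are bound twice: once for the SELECT list and once for
    # the NOT EXISTS check.
    conn.execute(INSERT_IF_MISSING, ROW + ROW)
    return conn.execute("SELECT COUNT(*) FROM radreply").fetchone()[0]


def demo() -> int:
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the real table; column types are assumptions.
    conn.execute(
        "CREATE TABLE radreply (tenant_id TEXT, username TEXT, "
        "attribute TEXT, value TEXT, op TEXT)"
    )
    add_default_radreply(conn)
    return add_default_radreply(conn)  # the second call inserts nothing
```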

2) Running the export on 10.3, we received this error preventing the export
ERROR 1141 (42000) at line 8: There is no such grant defined for user
'root' on host ''

To work around it, we removed the "-o errexit" to allow the script to proceed;
we didn't investigate what was actually missing, but it was missing on all
our instances.
# diff export.sh /usr/local/pf/addons/full-import/export.sh
16,17c16
< #set -o nounset -o pipefail -o errexit
< set -o nounset -o pipefail

3) Also running the export on 10.3, we received this error preventing the
export
cp: cannot stat '/usr/local/fingerbank/conf/fingerbank.conf\n': No such
file or directory

To work around it, we simply touched the file it was looking for, with the
trailing "\n" in the name, rather than correct the script
touch '/usr/local/fingerbank/conf/fingerbank.conf\n'

The new export/import process looks like a welcome update, reducing human
error, and the amount of time to review the upgrade steps/process -
something that may have contributed to us holding off on upgrades in the
past.

Thank you, Inverse team.


Re: [PacketFence-users] Upgrading from 11.0 to 11.1

2021-09-24 Thread Ian MacDonald via PacketFence-users
We just rolled back to a fresh image of 11.0 and re-imported from prior to
the 11.1 as we had just come from 10.3 and we really need the parity
between the environments for testing future upgrades.

I don't think you need to worry about anyone else doing this; without
release notes it was obvious we were beyond stable, and 'devel' in the
package names; we were just in upgrade mode, on a big roll (9.3->11.0)
after upgrading everything to 11.0 and kept going in our dev environment
without paying much attention :)

On Fri, Sep 24, 2021 at 1:28 AM Quiniou-Briand, Nicolas 
wrote:

> Hello Ian,
>
>
>
> > We ran the export / import anyways in the 11.1 and noted there was no
> schema migration script.
>
>
>
> There is a schema migration script but the export/import certainly doesn’t
> run it due to its name.
>
>
>
> > Since we are 11.0 in our other environments we are okay to wait for 11.1
> release and then upgrade this environment again. Hopefully we can just
> leave as is and then upgrade into the release when it is ready.
>
>
>
> Your best solution is to downgrade to latest 11.0 packages.
>
>
>
> *Nicolas Quiniou-Briand*
> *Product Support Engineer*
>
> *Office:* +33156696210
>
> Akamai Technologies
> 145 Broadway
> Cambridge, MA 02142
>


Re: [PacketFence-users] Upgrading from 11.0 to 11.1

2021-09-23 Thread Ian MacDonald via PacketFence-users
Okay,

I suspected that was the case, but possibly we were just hours ahead of the
release notes.  We seem to have forgotten dev was in the same repository.

The intent was just to test the latest stable.

We ran the export / import anyways in the 11.1 and noted there was no
schema migration script.

Since we are 11.0 in our other environments we are okay to wait for 11.1
release and then upgrade this environment again. Hopefully we can just
leave as is and then upgrade into the release when it is ready.

Cheers,
Ian


On Thu., Sep. 23, 2021, 1:45 a.m. Quiniou-Briand, Nicolas, <
nquin...@akamai.com> wrote:

> Hello Ian,
>
>
>
> Thanks for your message.
>
> 11.0.0 is our latest stable release, 11.1.0 is our current development
> release.
>
> It seems there are some issues with the version number in the
> documentation, I will take a look.
>
>
>
> Is your purpose to upgrade from 11.0.0 (stable) to 11.1.0 (dev) on the
> same machine ?
>
>
>
>
>
> *Nicolas Quiniou-Briand*
> *Product Support Engineer*
>
> *Office:* +33156696210
>
> Akamai Technologies
> 145 Broadway
> Cambridge, MA 02142
>


[PacketFence-users] Upgrading from 11.0 to 11.1

2021-09-22 Thread Ian MacDonald via PacketFence-users
I noticed 11.1 packages are available, and v11.1 appeared in the coverpage
for some docs

https://www.inverse.ca/downloads/PacketFence/doc/11.0.0/PacketFence_Template_Guide.pdf


so we decided to see how the new upgrade process >11.0 went on our dev
packetfence environment after recently bringing others up to v11.0

I was reading the upgrade instructions, which would apply for 11.0 -> 11.1
(even uses these specific versions in the example)

https://www.packetfence.org/doc/PacketFence_Upgrade_Guide.html#_upgrade_to_a_major_or_minor_version_major_minor_z

It says to complete the export / import process; however, reading those
instructions, they seem very clearly directed at fresh installations, and
coming from 10.3.0 to 11.0.

The package upgrade from 11.0 to 11.1 went smoothly, executed some upgrade
scripts and seemed to perform the upgrade similar to the recent
11.0.0+20210917123720 to 11.0.0+20210922054120

At the end it just says to log in to the admin GUI; we did, and it just
sort of seems fine so far.

Maybe we are jumping the gun here, but should we stop services, and do the
export and import moving from 11.0 to 11.1 on Debian 11?

cheers,
Ian


Re: [PacketFence-users] Samsung Galaxy S10 PF 9.3 Captive Portal Detection

2020-03-20 Thread Ian MacDonald via PacketFence-users
Thanks,

I was actually just reviewing the commits; There are a bunch.   I am hoping
to just grab the one for the popup.

Our registration network is on the same layer 2 network as the server's
captive portal, so we either wait for PF10 on Debian or cherry-pick the
pop-up stuff from the rest.

It would help if you could point us to the critical commits, as we don't
really know what Samsung changed, and it looks like a bunch of stuff was
fixed and added in that change.

cheers,
Ian



On Fri, Mar 20, 2020 at 2:49 PM Fabrice Durand via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> Hello Ian,
>
> It's a known issue with Samsung devices: the device won't pop the portal
> if it is on the same layer 2 network.
>
> It has been fixed in https://github.com/inverse-inc/packetfence/pull/5086
> and will be part of the upcoming PacketFence v10.
>
> Btw if the registration network is a layer 3 network then it should work.
>
> I don't know why Samsung did that ...
>
> Regards
>
> Fabrice


[PacketFence-users] Samsung Galaxy S10 PF 9.3 Captive Portal Detection

2020-03-20 Thread Ian MacDonald via PacketFence-users
Hi,

We noticed Samsung Devices on Android 10 are no longer being redirected to
our Packetfence portal on the registration network.

Up until now we have our portal configured with,
a)  Secure redirect ON
b)  WISPr redirection capabilities ON

We do not use the detection mechanism bypass.

When the devices connect to the registration VLAN, they simply report
"Connected without Internet" and never detect the portal or redirect to
the registration page.

Is anyone successfully capturing new Galaxy devices / Android 10, and do
they have any insights as to what mechanism we can use, or DNS filters we
can apply?

We are hoping to shortcut the next step of looking at the traffic and
trying to determine what/how to intercept.

We also seem to have good compatibility with many other existing devices
and platforms, so we are hesitant to start bypassing the Captive Portal
detection, which would likely stop a lot of other platforms from working
effectively.

We are using PF9.3 (9.3.0+20200113144930+108928498+0009+v9.3.0+stretch1)
with hostapd + CoA/Disconnect, Out of Band.
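
For anyone comparing notes, the first thing to check in the traffic is usually the connectivity probe itself. The sketch below assumes the device uses the stock Android check, an HTTP GET to http://connectivitycheck.gstatic.com/generate_204 that expects an empty 204 reply. Samsung builds may probe other hosts as well, so treat that URL as an assumption to confirm in a capture. interpret_probe is a hypothetical helper, not part of PacketFence.

```shell
#!/bin/sh
# Hypothetical helper: classify the HTTP status a test client receives for
# the Android connectivity probe while on the registration VLAN.
interpret_probe() {
    case "$1" in
        204)                 echo "open" ;;             # probe reached the real server
        301|302|303|307|308) echo "portal-redirect" ;;  # what a captive portal should return
        200)                 echo "portal-content" ;;   # a page body instead of 204 (usually a portal)
        000|"")              echo "no-response" ;;      # curl reports 000 when nothing answered
        *)                   echo "unexpected" ;;
    esac
}

# Example use from a Linux laptop on the registration VLAN (needs curl):
#   code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' \
#          http://connectivitycheck.gstatic.com/generate_204)
#   interpret_probe "$code"
```

If the laptop sees a redirect but the phone still reports no portal, the phone is probably probing a different host or protocol, which is where a capture on the registration interface would help.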

cheers,
Ian


Re: [PacketFence-users] Using Hostapd 17.01.4 with WISPr and a Captive Portal on PF 7.4.0

2018-03-12 Thread Ian MacDonald via PacketFence-users
Completed the pull request,

https://github.com/inverse-inc/packetfence/pull/3008

cheers,
Ian

On Wed, Mar 7, 2018 at 5:15 PM, Fabrice Durand via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> If you can open a pull request on GitHub with this change, it will be
> integrated into PacketFence for the next release.
>
> And also thanks for the support :-)
>
> On 2018-03-07 at 17:08, Ian MacDonald via PacketFence-users wrote:
>
> Below is a quick addendum to the current Hostapd Quick Install Guide.
>
> Hopefully it will help new users looking to leverage the flexibility of
> OpenWRT (aka LEDE) with the powerful captive portal functionality of
> Packetfence.
>
> There is a great guide from Inverse, and this email just adds a few
> missing bits that will help keep hostapd related posts out of the list.
> Instead invest that savings in some professional services from Inverse.
>
> The Guide
> https://packetfence.org/doc/PacketFence_OpenWrt-Hostapd-15-05_Quick_Install_Guide.html
>
> Hostapd 15.05 is dead and old.  At this time, 17.01.4 is current.
>
> With hostapd, you have to pick a band (i.e 5G or 2.4G) as CoA only works
> with one radio.
>
> Below are some additions to the guide that should be helpful based on a
> configuration with a packetfence server with an IP of 192.168.10.10 on the
> management VLAN70, registration VLAN71, isolation VLAN72 and an out-of-band
> user/normal VLAN76 (PF server does nothing on VLAN76 and does not see it).
> The hostapd AP running LEDE/OpenWRT 17.01.4 has an IP of 192.168.10.19 on
> the management VLAN70.
>
> Step 4.1:  Have your Internet connected to the WAN port of a default
> OpenWRT configuration for staging and connect to one of the LAN ports on
> 192.168.1.1.  If your network is 192.168.1.X, you might consider making a
> change to your local LAN to make this type of access easier.
>
> The actual steps to update the packages:
>
> #opkg update
> #opkg remove wpad-mini hostapd
> #opkg install hostapd-common wpad
>
> We typically dump a bunch of useful tools and stuff too; none of this is
> required
> # opkg install ipset screen iftop tcpdump curl mtr wget diffutils iperf3 iwinfo snmpd kmod-gpio-button-hotplug
>
> We like to bring all the packages up to the latest version as well (also
> not required)
> #opkg update
> #opkg list-upgradable | awk -F ' - ' '{print $1}' | xargs opkg upgrade
>
> Step 4.2:
> Most people choose to run on 5G for increased bandwidth and control of
> range.  Usually this is wlan0, but on some hardware (i.e. WD N600) it will
> be wlan1.  You can use iwinfo to see which device is your 5G radio.
>
> Step 4.3: Attached script is a modified version of the 17.01.4 hostapd.sh
>
> Step 4.4: The example is not for the scenario where you just have an open
> SSID where you are providing access via WISPr/Captive Portal. It also has
> some issues; it is missing "wifi-iface" on the  PF-Open SSID, uses the 2.4G
> radio (which is often not wlan0), and has vlan_naming set to '0', where we
> find '1' is the preferred option.
>
> Below is an updated example with a single open SSID 'Public WiFi'.
>
> Some notes on where yours might differ for the radio (wifi-device):
> 1) the 5G radio 'path' is specific to the device (in this case a TP-Link
> C2600)
> 2) The country 'CA' is Canada (that's where we are)
> 3) The channel is '149'; higher channels deliver more power on some
> devices. DFS channels like '100', when supported, are typically very clear.
> Check the 'iw list' output for power and channel support.
>
> Some notes on where yours might differ for the interface (wifi-iface):
> 1) The 192.168.10.10 address should be your PF server on the management
> subnet. We like to leave the default 192.168.1.1 on the lan interface for
> local configuration and access, and so it is not used for any PF vlans
> (management/registration/isolation/normal).
> 2) The vlan_tagged_interface 'eth0' should be the interface where your
> tagged/trunked vlans are connected for registration/isolation/normal that
> clients will be connecting to. Typically this is connected to the
> "Internet" port.  In some rare cases eth1 is the "Internet" port,
> depending on the hardware. You need to know your hardware.
> 3) option network 'lan' should be removed, it is the default where wifi
> interfaces normally are bridged to the lan
> 4) vlan_naming '1' seems to produce fewer bridge naming warnings in our
> experience, though both '1' and '0' work.
> 5) Note that radio1 (2.4G) is set with option disabled '1' - we are not
> using it
>
> config wifi-device 'radio0'
> option type 'mac80211'
> option hwmode '11a'
> option path 'soc/1b50

Re: [PacketFence-users] No roles assignment and no rules matching in the authentication source

2018-03-12 Thread Ian MacDonald via PacketFence-users
Eugene,

On the note of patch application;  Are you sure you applied the entire
patch? The output of your patching below indicates 3 hunks that still need
to be manually applied.

cheers,
Ian

 [root@PacketFence-ZEN pf]# patch -p1 < ./34405d44b203ce2fd4a4dac435ff62d69c4ed00f.diff
 patching file lib/pf/config.pm
 Hunk #1 succeeded at 326 (offset 5 lines).
 Hunk #2 FAILED at 947.
 1 out of 2 hunks FAILED -- saving rejects to file lib/pf/config.pm.rej


 [root@PacketFence-ZEN pf]# patch -p1 < 1eef967ad1ee589136a097166c440cb30107ddfb.diff
 patching file lib/pf/enforcement.pm
 Reversed (or previously applied) patch detected!  Assume -R? [n] n
 Apply anyway? [n] y
 Hunk #1 FAILED at 43.
 Hunk #2 FAILED at 169.
 2 out of 2 hunks FAILED -- saving rejects to file lib/pf/enforcement.pm.rej
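
A quick way to get that count from a saved patch log rather than reading it by eye. This is only a sketch that assumes GNU patch's usual "Hunk #N FAILED at ..." wording, as in the output above. count_failed_hunks is a made-up helper name.

```shell
#!/bin/sh
# Count the hunks GNU patch reported as failed. Reads a patch log on stdin.
# Summary lines such as "1 out of 2 hunks FAILED" are deliberately not counted.
count_failed_hunks() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c '^ *Hunk #[0-9][0-9]* FAILED' || true
}

# The two runs quoted above, fed back through the helper:
count_failed_hunks <<'EOF'
 patching file lib/pf/config.pm
 Hunk #1 succeeded at 326 (offset 5 lines).
 Hunk #2 FAILED at 947.
 1 out of 2 hunks FAILED -- saving rejects to file lib/pf/config.pm.rej
 patching file lib/pf/enforcement.pm
 Hunk #1 FAILED at 43.
 Hunk #2 FAILED at 169.
 2 out of 2 hunks FAILED -- saving rejects to file lib/pf/enforcement.pm.rej
EOF
# prints 3
```

GNU patch also accepts --dry-run, which reports the same hunks without modifying any files, so it is a safe first step before applying a diff to a live tree.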

On Sun, Mar 11, 2018 at 6:44 PM, E.P. via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> And also this issue still bothers me, Fabrice.
>
> I applied the patch but it is all about deauthentication
>
> What does it have to do with role assignment and not matching conditions
> in the authentication source?
>
> Are there any other logs or outputs to analyze to find the root cause?
>
>
>
> Eugene
>
>
>
> *From:* Fabrice Durand [mailto:fdur...@inverse.ca]
> *Sent:* Thursday, March 08, 2018 11:30 AM
>
> *To:* E.P. ; packetfence-users@lists.sourceforge.net
> *Subject:* Re: [PacketFence-users] No roles assignment and no rules
> matching in the authentication source
>
>
>
> cd /usr/local/pf
>
> patch -p1 
>
>
>
> On 2018-03-08 at 13:57, E.P. wrote:
>
> And what file are we patching ?
>
>
>
> *patch -p1 < 1eef967ad1ee589136a097166c440cb30107ddfb.diff*
>
> *can't find file to patch at input line 5*
>
> *Perhaps you used the wrong -p or --strip option?*
>
> *The text leading up to this was:*
>
> *--*
>
> *|diff --git a/lib/pf/enforcement.pm 
> b/lib/pf/enforcement.pm *
>
> *|index 8ff56b4252b..05589bba682 100644*
>
> *|--- a/lib/pf/enforcement.pm *
>
> *|+++ b/lib/pf/enforcement.pm *
>
> *--*
>
> *File to patch:*
>
>
>
>
>
>
>
> *From:* Fabrice Durand [mailto:fdur...@inverse.ca ]
> *Sent:* Thursday, March 08, 2018 5:28 AM
> *To:* E.P.  ; packetfence-users@lists.
> sourceforge.net
> *Subject:* Re: [PacketFence-users] No roles assignment and no rules
> matching in the authentication source
>
>
>
> https://github.com/inverse-inc/packetfence/pull/2735/commits/1eef967ad1ee589136a097166c440cb30107ddfb.diff
> is supposed to return this:
>
>
>
> diff --git a/lib/pf/enforcement.pm b/lib/pf/enforcement.pm
> index 8ff56b4252b..05589bba682 100644
> --- a/lib/pf/enforcement.pm
> +++ b/lib/pf/enforcement.pm
> @@ -43,6 +43,7 @@ use pf::config qw(
>  %connection_type_explained
>  $WIRED
>  $WIRELESS
> +$WEBAUTH
>  );
>  use pf::inline::custom $INLINE_API_LEVEL;
>  use pf::iptables;
> @@ -169,7 +170,7 @@ sub _vlan_reevaluation {
>  $client->notify( 'ReAssignVlan', %data );
>  }
>  }
> -elsif ( ( $conn_type & $WIRELESS ) == $WIRELESS ) {
> +elsif ( ( ( $conn_type & $WIRELESS ) == $WIRELESS ) || ( ( $conn_type & $WEBAUTH ) == $WEBAUTH ) ) {
>  $logger->debug("Calling API with desAssociate request on switch (".$switch_id.")");
>  if ($cluster_deauth) {
>  $client->notify( 'desAssociate_in_queue', %data );
>
>
>
> It works on my side, so download the file with wget instead and then run
> patch -p1 < 1eef967ad1ee589136a097166c440cb30107ddfb.diff
>
> Same for the other patch.
>
> Regards
>
> Fabrice
>
>
>
>
>
>
>
> On 2018-03-08 at 00:48, E.P. wrote:
>
> Am I applying this patch in the wrong way ?
>
>
>
> [root@PacketFence-ZEN conf]# curl https://github.com/inverse-inc/packetfence/pull/2735/commits/1eef967ad1ee589136a097166c440cb30107ddfb.diff | patch -p1
>
>
>   % Total% Received % Xferd  Average Speed   TimeTime Time
> Current
>
>  Dload  Upload   Total   SpentLeft
> Speed
>
> 100   1610   1610 0241  0 --:--:-- --:--:-- --:--:--
>  242
>
> patch unexpectedly ends in middle of line
>
> *patch:  Only garbage was found in the patch input.*
>
>
>
> [root@PacketFence-ZEN conf]# curl https://github.com/inverse-inc/packetfence/pull/2735/commits/34405d44b203ce2fd4a4dac435ff62d69c4ed00f.diff | patch -p1
>
>  % Total% Received % Xferd  Average Speed   TimeTime Time
> Current
>
>  Dload  Upload   Total   SpentLeft
> Speed
>
> 100   1610   1610 0218  0 --:--:-- --:--:-- --:--:--
> 218
>
> patch unexpectedly ends in middle of line
>
> *patch:  Only garbage was found in the patch input*
>
>
>
> wget seems to fetch this file
>
>
>
> 

Re: [PacketFence-users] Wrong language after upgrade to 7.0.4

2018-03-08 Thread Ian MacDonald via PacketFence-users
Unfortunately, I use en_US, but it did get me wondering what our French
pages look like, so at some point we may do some digging into better locale
application.  I am pretty confident it works, since I bet a good chunk of
the Inverse team are fr_CA, and you know they must be using their own tools
internally.

I did find some good hints with a quick grep:

pf4:/usr/local/pf/html# grep -i "en_US" * -R | less

It includes the following lines which are likely to lead you down the right
path:

captive-portal/lib/captiveportal/PacketFence/Controller/Root.pm:# Browser = fr_FR and portal is en_US and fr_CA then fr_CA will be used
captive-portal/lib/captiveportal/PacketFence/Controller/Root.pm:will return qw(en_US en fr fr_CA no es)
captive-portal/lib/captiveportal/PacketFence/Controller/Root.pm:my $s = $c->request->header('Accept-language') || 'en_US';
pfappserver/lib/pfappserver/I18N/i_default.po:msgid "en_US"
pfappserver/lib/pfappserver/I18N/fr.po:msgid "en_US"
pfappserver/lib/pfappserver/I18N/fr.po:msgstr "en_US"
pfappserver/lib/pfappserver/I18N/en.po:msgid "en_US"
pfappserver/root/static/doc/PacketFence_Developers_Guide.html:Same goes for a browser with locale en_US, where creating a template named aup_text.en.html will be used for English browsers.
pfappserver/root/static/doc/PacketFence_Developers_Guide.html:Note that you cannot use the full locale in the template name (i.e. aup_text.en_US.html) as only the two letter prefix should be used.
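
To make the last two hits concrete, here is a small sketch of the naming rule the Developers Guide describes: only the two-letter language prefix goes into the template file name. template_for_locale is a hypothetical helper for illustration, not something PacketFence ships, and it says nothing about how the portal itself picks the language.

```shell
#!/bin/sh
# Hypothetical helper: build the template file name for a locale, keeping
# only the language prefix (en_US -> en, fr-CA -> fr).
template_for_locale() {
    base=$1                      # template base name, e.g. aup_text
    locale=$2                    # e.g. en_US, fr_CA, de-DE, de
    lang=${locale%%[_-]*}        # drop everything from the first _ or -
    lang=$(printf '%s' "$lang" | tr 'A-Z' 'a-z')
    printf '%s.%s.html\n' "$base" "$lang"
}

template_for_locale aup_text en_US   # aup_text.en.html
template_for_locale aup_text de_DE   # aup_text.de.html
```

So a German override for the AUP text would be named aup_text.de.html, never aup_text.de_DE.html.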

cheers,
Ian


On Thu, Mar 8, 2018 at 5:07 AM, Virginie Girou via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> Hi,
>
> I am interested in your problem (/solution ;-) ), because I encounter the
> same with the French language.
>
> I first thought it was a translation message problem, but actually it acts
> as if the only active language was "en" while my profile parameters are set
> to "fr" and my browser requests the "fr" language ...
>
> I wrote to the list in January (subject: No translation on "Enabling network
> access" page) and I have the problem on a 7.3 version (installed directly as
> 7.3) but not on a 7.0.
>
> I tried to compare different files but I am not good enough in programming
> to understand the differences.
>
> Hope that packetfence team could help us !
>
> Regards,
>
> Virginie Girou
> Equipe systeme
> DSI - UT1 Capitole
> Tel : +33 (0)5.61.63.39.19 <+33%205%2061%2063%2039%2019>
>
> On 07/03/2018 at 14:23, Hubert Kupper via PacketFence-users wrote:
>
> Hi Ian,
>
> yes, I mean 7.3.0 to 7.4.0 sorry ;-)
>
> mybrowserinfo.com shows the correct language. When I set locale= I
> get the English version of the profile preview and the preview of the html
> files like layout.html. When I set locale=de_DE in the profiles.conf
> file, I get the profile preview in the German version but all other html
> files in the English version.
>
> cheers,
> Hubert
>
> On 06.03.2018 at 06:19, Ian MacDonald via PacketFence-users wrote:
>
> Hubert,
>
> I would assume you mean upgraded from 7.3.0 to 7.4.0;
>
> I think by default, connection profiles support whatever locale your
> browser requests.   Maybe confirm your local setup using mybrowserinfo.com
> and make sure you are requesting the language you want.
>
> Using locale= will let your browser select the locale.
>
> You should probably just start with a clone of the default connection
> profile with no languages specified and see if the problem occurs there
> with the correct browser locale, both in the Preview from the admin GUI as
> well as when you hit the captive portal directly.
>
> Compare different devices with your desired locale setting set to rule out
> the client.
>
> If the problem goes away with a clone of the default connection profile,
> you can focus on your profile configuration and files, opting to find the
> difference by direct comparison.
>
> I am not sure, but your locale setting may also require subtags;  I'm not
> sure the "de" language identifier will actually match one of the German
> locales. Check the notation is correct:
> https://www.w3.org/TR/ltli/#language-terminology
>
> I don't use the locale setting myself.
>
> cheers,
> Ian
>
>
>
> On Wed, Feb 28, 2018 at 3:26 AM, Hubert Kupper via PacketFence-users <
> packetfence-users@lists.sourceforge.net> wrote:
>
>> Hello,
>>
>> I did an upgrade from 7.0.3 to 7.0.4. After the upgrade, the profile
>> language is english while in pf.conf and profiles.conf the language is
>> already set to "de".  Any idea

[PacketFence-users] Using Hostapd 17.01.4 with WISPr and a Captive Portal on PF 7.4.0

2018-03-07 Thread Ian MacDonald via PacketFence-users
Below is a quick addendum to the current Hostapd Quick Install Guide.

Hopefully it will help new users looking to leverage the flexibility of
OpenWRT (aka LEDE) with the powerful captive portal functionality of
Packetfence.

There is a great guide from Inverse, and this email just adds a few missing
bits that will help keep hostapd related posts out of the list.  Instead
invest that savings in some professional services from Inverse.

The Guide
https://packetfence.org/doc/PacketFence_OpenWrt-Hostapd-15-05_Quick_Install_Guide.html

Hostapd 15.05 is dead and old.  At this time, 17.01.4 is current.

With hostapd, you have to pick a band (i.e 5G or 2.4G) as CoA only works
with one radio.

Below are some additions to the guide that should be helpful based on a
configuration with a packetfence server with an IP of 192.168.10.10 on the
management VLAN70, registration VLAN71, isolation VLAN72 and an out-of-band
user/normal VLAN76 (PF server does nothing on VLAN76 and does not see it).
The hostapd AP running LEDE/OpenWRT 17.01.4 has an IP of 192.168.10.19 on
the management VLAN70.

Step 4.1:  Have your Internet connected to the WAN port of a default
OpenWRT configuration for staging and connect to one of the LAN ports on
192.168.1.1.  If your network is 192.168.1.X, you might consider making a
change to your local LAN to make this type of access easier.

The actual steps to update the packages:

#opkg update
#opkg remove wpad-mini hostapd
#opkg install hostapd-common wpad

We typically dump a bunch of useful tools and stuff too; none of this is
required
# opkg install ipset screen iftop tcpdump curl mtr wget diffutils iperf3 iwinfo snmpd kmod-gpio-button-hotplug

We like to bring all the packages up to the latest version as well (also
not required)
#opkg update
#opkg list-upgradable | awk -F ' - ' '{print $1}' | xargs opkg upgrade

Step 4.2:
Most people choose to run on 5G for increased bandwidth and control of
range.  Usually this is wlan0, but on some hardware (i.e. WD N600) it will
be wlan1.  You can use iwinfo to see which device is your 5G radio.

Step 4.3: Attached script is a modified version of the 17.01.4 hostapd.sh

Step 4.4: The example is not for the scenario where you just have an open
SSID where you are providing access via WISPr/Captive Portal. It also has
some issues; it is missing "wifi-iface" on the  PF-Open SSID, uses the 2.4G
radio (which is often not wlan0), and has vlan_naming set to '0', where we
find '1' is the preferred option.

Below is an updated example with a single open SSID 'Public WiFi'.

Some notes on where yours might differ for the radio (wifi-device):
1) the 5G radio 'path' is specific to the device (in this case a TP-Link
C2600)
2) The country 'CA' is Canada (that's where we are)
3) The channel is '149'; higher channels deliver more power on some
devices. DFS channels like '100', when supported, are typically very
clear.  Check the 'iw list' output for power and channel support.
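
When adapting the radio section, one easy mistake is pairing a channel with the wrong hwmode. The sketch below is a hypothetical sanity check. It relies only on the general 802.11 channel numbering (1 to 14 in the 2.4 GHz band, 32 to 177 in the 5 GHz band) and does not tell you whether your country setting or hardware actually permits a given channel. That still has to come from 'iw list'.

```shell
#!/bin/sh
# Hypothetical helper: suggest the UCI hwmode that matches a channel number.
hwmode_for_channel() {
    ch=$1
    if [ "$ch" -ge 1 ] && [ "$ch" -le 14 ]; then
        echo "11g"        # 2.4 GHz band
    elif [ "$ch" -ge 32 ] && [ "$ch" -le 177 ]; then
        echo "11a"        # 5 GHz band
    else
        echo "unknown"    # not a channel number this sketch recognises
    fi
}

hwmode_for_channel 149   # 11a
hwmode_for_channel 11    # 11g
```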

Some notes on where yours might differ for the interface (wifi-iface):
1) The 192.168.10.10 address should be your PF server on the management
subnet. We like to leave the default 192.168.1.1 on the lan interface for
local configuration and access, and so it is not used for any PF vlans
(management/registration/isolation/normal).
2) The vlan_tagged_interface 'eth0' should be the interface where your
tagged/trunked vlans are connected for registration/isolation/normal that
clients will be connecting to. Typically this is connected to the
"Internet" port.  In some rare cases eth1 is the "Internet" port,
depending on the hardware. You need to know your hardware.
3) option network 'lan' should be removed, it is the default where wifi
interfaces normally are bridged to the lan
4) vlan_naming '1' seems to produce fewer bridge naming warnings in our
experience, though both '1' and '0' work.
5) Note that radio1 (2.4G) is set with option disabled '1' - we are not
using it

config wifi-device 'radio0'
option type 'mac80211'
option hwmode '11a'
option path 'soc/1b50.pci/pci:00/:00:00.0/:01:00.0'
option htmode 'VHT80'
option disabled '0'
option country 'CA'
option channel '149'

config wifi-iface 'default_radio0'
option device 'radio0'
option mode 'ap'
option encryption 'none'
option vlan_file '/etc/config/hostapd.vlan'
option vlan_bridge 'br-vlan'
option vlan_naming '1'
option dynamic_vlan '2'
option auth_port '1812'
option auth_server '192.168.10.10'
option auth_secret 's3cr3t'
option acct_port '1813'
option acct_server '192.168.10.10'
option acct_secret 's3cr3t'
option dae_port '3799'
option dae_client '192.168.10.10'
option dae_secret 's3cr3t'
option nasid 'Lobby AP'
option ssid 'Public WiFi'
option vlan_tagged_interface 'eth0'

config wifi-device 'radio1'
option type 'mac80211'
option channel '11'
option hwmode '11g'
option path 'soc/1b70.pci/pci0001:00/0001:00:00.0/0001:01:00.0'
option htmode 'HT20'
option disabled '1'

config wifi-iface 'default_radio1'
option device 'radio1'

[PacketFence-users] Solved: Hostapd causing duplicate Sign-Into-Network prompts intermittently

2018-03-07 Thread Ian MacDonald via PacketFence-users
Fabrice,

Thank you for the tip.  A nicely commented configuration option.  I'll make
a mental note to scroll through pf.conf.defaults before asking if a
configuration option exists.

wait_for_redirect=20 seems to be ideal for hostapd and the speediest of
fingers on the speediest of smartphones using WISPr.
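
For anyone wanting to confirm what their own server has set, a minimal sketch for reading the value back. It assumes the usual /usr/local/pf/conf/pf.conf location and plain "key = value" lines. get_pf_setting is a hypothetical helper, not a PacketFence tool.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY inside [SECTION] of an
# INI-style file. Comment lines and other sections are ignored.
get_pf_setting() {
    # $1 = file, $2 = section name, $3 = key
    awk -F '=' -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }
        in_section {
            k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
                exit
            }
        }
    ' "$1"
}

# Example use on the PacketFence server:
#   get_pf_setting /usr/local/pf/conf/pf.conf fencing wait_for_redirect
```

An empty result should mean the key is not overridden in pf.conf, in which case the value in pf.conf.defaults applies.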

The progress bar kicks in nicely and the transition was smooth.

I will note for the list that this impacts all aspects of the WebAPI,
including the GUI unregistration, so we have removed the likelihood of any
race condition there as well with recently authorized devices being
de-registered.

I am also going to open up a ticket with hostapd upstream to see if they
can change the CoA behavior to always initiate an Access-Request after CoA
so we can get back to the instant activation which gives the impression of
'faster/better connectivity' for the end user.

cheers,
Ian



On Tue, Mar 6, 2018 at 9:46 PM, Durand fabrice via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> Hello Ian,
>
> there is a configuration parameter that can be changed to increase the time
> between the deauth order and the CoA request.
>
> It's Config{'fencing'}{'wait_for_redirect'} , so in pf.conf:
>
> [fencing]
> #
> # fencing.wait_for_redirect
> #
> # How many seconds should the WebAPI sleep before actually triggering the VLAN change.
> # This is meant to give the device enough time to fetch the redirection page before
> # switching VLAN.
> wait_for_redirect = 31
>
> Regards
>
> Fabrice
>
>
>
> On 2018-03-06 at 13:54, Ian MacDonald via PacketFence-users wrote:
>
> Well,
>
> We easily narrowed this down to a timing issue related to the
> Access-Accept(2) messages and how long an authorization is considered valid
> by hostapd.
>
> In short, we need to make sure that the CoA from portal activation is NOT
> sent to the AP within 30 seconds of the initial connection to the
> registration VLAN.
>
> We don't think the redirect or network_detection delay is suitable for
> this as that happens after the CoA.
>
> *How do we bake in a delay between Access-Accept and CoA messages from the
> radius server for speedy WISPr clients? *
>
> Is there a portal setting that can help us here?
>
> Ideally, a "minimum_coa_delay" setting could allow packetfence to only
> issue a CoA after a minimum time period since the last Access-Request or
> Access-Accept was processed.
>
> Additionally, we are going to investigate configuration of hostapd to
> change the 30s to something smaller (which would solve most our use cases)
> and/or, having the 30s timer reset on receipt of a CoA.  Both these seem
> non-trivial vs adding this intelligence to packetfence which may solve
> similar race conditions in the wide variety of radius clients supported.
>
> Details follow,
>
> cheers,
> Ian
>
>
> Hostapd considers the last authorization valid for 30 seconds, and if a
> host disconnects and reconnects within that window, it does not check back
> with the radius server for a new authorization.   This holds true even if a
> CoA is received from the radius server;  The latter which I think is a
> behavior that could be enhanced by hostapd - after all it is called "Change
> of Authorization" request.
>
> How this results in the double prompt for some users is that if a user
> is able to complete the portal activation within 30 seconds of the device
> connecting to the registration VLAN, hostapd does not issue another
> Access-Request, but instead considers the current authorization to be
> valid, and the device remains on the registration VLAN even after the CoA
> is sent that should switch it to the Normal VLAN.
>
> This means you can actually re-Sign-In multiple times within that 30
> second window if you are quick (my email auto-populates on my phone making
> this easy). Each time the portal generates a CoA but returns the device to
> the registration VLAN.  At 31 seconds after the initial device connection
> to the registration VLAN, any subsequent CoA results in the correct VLAN
> being assigned.
>
> The 30s authorization sort of makes sense in order to deal with a client
> that is authorized and disconnects and reconnects quickly a bunch of times,
> you avoid some unnecessary radius requests back and forth, but the benefit
> is minimized since the Radius-Accounting packets are still processed with
> each disconnect.
>
> This 30 second delay also impacts de-registration.  We were able to
> confirm that if we try and de-register a client from the PF GUI within 30
> seconds of them being activated, they remain on the Normal VLAN.   So there
> is some impact here as well -> And we remember seeing this little bug a
> long tim

Re: [PacketFence-users] Hostapd causing duplicate Sign-Into-Network prompts intermittently

2018-03-06 Thread Ian MacDonald via PacketFence-users
Well,

We easily narrowed this down to a timing issue related to the
Access-Accept(2) messages and how long an authorization is considered valid
by hostapd.

In short, we need to make sure that the CoA from portal activation is NOT
sent to the AP within 30 seconds of the initial connection to the
registration VLAN.

We don't think the redirect or network_detection delay is suitable for this
as that happens after the CoA.

*How do we bake in a delay between Access-Accept and CoA messages from the
radius server for speedy WISPr clients? *

Is there a portal setting that can help us here?

Ideally, a "minimum_coa_delay" setting could allow packetfence to only
issue a CoA after a minimum time period since the last Access-Request or
Access-Accept was processed.

Additionally, we are going to investigate configuration of hostapd to
change the 30s to something smaller (which would solve most our use cases)
and/or, having the 30s timer reset on receipt of a CoA.  Both these seem
non-trivial vs adding this intelligence to packetfence which may solve
similar race conditions in the wide variety of radius clients supported.

Details follow,

cheers,
Ian


Hostapd considers the last authorization valid for 30 seconds, and if a
host disconnects and reconnects within that window, it does not check back
with the radius server for a new authorization.   This holds true even if a
CoA is received from the radius server;  The latter which I think is a
behavior that could be enhanced by hostapd - after all it is called "Change
of Authorization" request.

How this results in the double prompt for some users is that if a user
is able to complete the portal activation within 30 seconds of the device
connecting to the registration VLAN, hostapd does not issue another
Access-Request, but instead considers the current authorization to be
valid, and the device remains on the registration VLAN even after the CoA
is sent that should switch it to the Normal VLAN.

This means you can actually re-Sign-In multiple times within that 30 second
window if you are quick (my email auto-populates on my phone making this
easy). Each time the portal generates a CoA but returns the device to the
registration VLAN.  At 31 seconds after the initial device connection to
the registration VLAN, any subsequent CoA results in the correct VLAN being
assigned.
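
The race can be written down as a tiny model. This is only a sketch of the behaviour we observed, not of hostapd internals: coa_outcome is a made-up name, the 30 second window is the value seen in our testing, and "delay" stands for whatever pause is inserted before the CoA is sent (zero in the behaviour described here).

```shell
#!/bin/sh
# Hypothetical model of the timing described above; all values in seconds.
#   $1 = time from association on the registration VLAN to portal submit
#   $2 = extra delay before the CoA is sent (default 0)
#   $3 = how long hostapd keeps reusing the last authorization (default 30)
coa_outcome() {
    t_submit=$1
    delay=${2:-0}
    window=${3:-30}
    t_coa=$((t_submit + delay))
    if [ "$t_coa" -gt "$window" ]; then
        echo "$t_coa normal"          # new Access-Request, correct VLAN
    else
        echo "$t_coa registration"    # cached authorization is reused
    fi
}

coa_outcome 12       # 12 registration
coa_outcome 12 20    # 32 normal
```

Under this model a fast user who submits after 5 seconds still lands inside the window with a 20 second delay (25 seconds), while anyone who takes more than 10 seconds is clear of it.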

The 30s authorization sort of makes sense in order to deal with a client
that is authorized and disconnects and reconnects quickly a bunch of times,
you avoid some unnecessary radius requests back and forth, but the benefit
is minimized since the Radius-Accounting packets are still processed with
each disconnect.

This 30 second delay also impacts de-registration.  We were able to confirm
that if we try and de-register a client from the PF GUI within 30 seconds
of them being activated, they remain on the Normal VLAN.   So there is some
impact here as well -> And we remember seeing this little bug a long time
ago, so we think this has always been there.  Often testing the captive
portals we de-register devices quickly to "try them again" and we used to
have to hit "re-evaluate" after save .. but it turns out, only if we do it
under 30 seconds.




On Mon, Mar 5, 2018 at 9:21 PM, Ian MacDonald  wrote:

> We have packetfence 7.4 instances out-of-band running on Debian 8, and use
> the captive portal with hostapd for WiFi client access.
>
> Our clients register using an email source for activation.
>
> Our configuration uses all captive_portal defaults except for the network
> detection IP, shown below.
>
> network_detection_initial_delay = 5s
> network_detection_retry_delay = 2s
> network_redirect_delay = 20s
> secure_redirect=enabled
> detection_mecanism_bypass = disabled
> wispr_redirection = enabled
>
> At some point in the past I recall our mobile clients used to see the
> scrollbar for many seconds (10+?), which I believe was redirect_delay being
> interrupted by network_detection; this would occur immediately after
> submitting an email for activation.
>
> Now on PF 7.4.0, hostapd 17.01.4 and with Samsung S8 clients, the delay
> between submitting the email and activation onto the user/normal VLAN is
> instantaneous (about 2 seconds according to our logs).   I suspect this
> speedy interaction has something to do with WiSPr and CoA working together
> quickly, whereas previously we were maybe only doing captive portal.
>
> The problem is that intermittently, the Android clients, after activation,
> are returned to the Registration VLAN instead of the Normal/User VLAN,
> which causes those clients to pop up the "Sign Into Network" WiSPR prompt a
> second time.  Clicking on it again loads a quick PF Network Error page
> which disappears as quickly as it appears, and a second later the client is
> dumped onto the Normal/User VLAN as it should have been originally.
>
> This is confusing for end users, and most make it through. But not
> everyone is happy.
>
> Below is the 

Re: [PacketFence-users] Access to PF captive portal is blocked

2018-03-06 Thread Ian MacDonald via PacketFence-users
Eugene,

I think a good old fashioned network diagram could be of help here.  I am
not sure which Linux flavour you're running, but I can see one problem that
might be confusing your forwarding/arp/iptables.

You have two IP addresses on the same subnet configured both on the raw
device (eth0) and a tagged trunk interface (eth0.1).  I don't think your
intent was to have packets on one of these interfaces coming and going on
the raw interface without tags (eth0) and the other being tagged (eth0.1),
especially if the interfaces are on the same subnet.  It just so happens
the default PVID on eth0 would be 1, making this sort of setup a bit more
confusing and maybe possibly sorta work strangely.

An alias for eth0 would be eth0:1 (note the ":" instead of ".").  This is
what you would use to put multiple IPs on the same subnet. i.e. eth0:1,
eth0:2 and eth0:3 are all on the same vlan, the native vlan for eth0.

eth0.1 is a trunked interface for VLAN1 on eth0.   If you had an 802.1Q
trunk connected to your eth0 with tagged VLANS 10,11 and 12 they could have
interfaces on eth0.10, eth0.11 and eth0.12 respectively.  And if you wanted
multiple IPs on each subnet, you could have aliases even still, like
eth0.10:1 and eth0.10:2 on the VLAN10 subnet for example.
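
To make the naming convention concrete, here is a small sketch that splits an interface name into its parts. describe_iface is a hypothetical helper that only looks at the name. It does not query the kernel, and a name is just a convention, so treat the result as a hint rather than proof of how the interface is configured.

```shell
#!/bin/sh
# Hypothetical helper: split a Linux interface name into base device,
# VLAN ID (after ".") and alias label (after ":").
describe_iface() {
    name=$1
    alias_part=
    vlan_part=
    case $name in
        *:*) alias_part=${name##*:}; name=${name%:*} ;;
    esac
    case $name in
        *.*) vlan_part=${name##*.}; name=${name%%.*} ;;
    esac
    out="device=$name"
    if [ -n "$vlan_part" ]; then out="$out vlan=$vlan_part"; fi
    if [ -n "$alias_part" ]; then out="$out alias=$alias_part"; fi
    echo "$out"
}

describe_iface eth0.1       # device=eth0 vlan=1
describe_iface eth0:1       # device=eth0 alias=1
describe_iface eth0.10:2    # device=eth0 vlan=10 alias=2
```

So eth0.1 and eth0:1 look almost the same but mean very different things: the first carries tagged VLAN 1 traffic, the second is just another address on the untagged native VLAN.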

I only have experience with out-of-band setups with separate subnets for
management, registration and isolation.   My mind's eye doesn't quite
understand your goals for the configuration for two interfaces on the same
subnet with an out-of-band wifi authenticating via captive portal.
Describing the packet flow /subnets for your client before authorization
and afterwards might help with some of the next steps once you nail down
the network topology.

cheers,
Ian

On Mon, Feb 19, 2018 at 3:29 AM, E.P. via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> And my further attempts to put two and two together and look back in time
> into this mailing list showed that Fabrice already answered this question
> before 
>
> Yes, I’d create an alias, e.g. eth0.1
>
> So, under Configuration-Networks-Interfaces I click  “ADD VLAN”  and then
> add VLAN 1, add a new IP address to belong to the same subnet and then
> select type “portal”
>
> New interface eth0.1 gets created with IP address 172.16.0.223, I can
> reach it via IP and my interfaces and networks look like this:
>
>
>
>
>
> What else should I be doing to enable captive portal? I thought that it is
> enabled by default and I see httpd.portal is UP and running but I don’t see
> any ports open on 172.16.0.223
>
> And iptables allow all HTTP and HTTPS for input-portal-if chain
>
>
>
> Eugene
>
>
>
>
>
> *From:* E.P. [mailto:ype...@gmail.com]
> *Sent:* Sunday, February 18, 2018 11:14 PM
> *To:* 'packetfence-users@lists.sourceforge.net'
> *Cc:* 'Durand fabrice' 
> *Subject:* RE: [PacketFence-users] Access to PF captive portal is blocked
>
>
>
> I think it is slowly coming to me, Fabrice.
>
> My PF is purely for RADIUS enforcement and PF has only one IP address of
> management type.
>
> Now if I want WebAuth enforcement I would need to create one more
> interface of portal type
>
> The question is can I create this portal type interface in the same subnet
> as the management interface ?
>
> I would want to have them both in the same VLAN
>
>
>
> Eugene
>
>
>
> *From:* E.P. [mailto:ype...@gmail.com ]
> *Sent:* Sunday, February 18, 2018 7:20 PM
> *To:* 'packetfence-users@lists.sourceforge.net'
> *Cc:* 'Durand fabrice' 
> *Subject:* RE: [PacketFence-users] Access to PF captive portal is blocked
>
>
>
> Here it is, Fabrice
>
> 10.0.254.3 is the WiFi client and 172.16.0.222 is PF.
>
> Tcpdump.pcap is attached and it is made right on PF
>
> The second capture is made on the laptop connected to guest WiFi.
>
> It contains pings to PF, but all TCP SYN requests are answered with RST.
>
>
>
> Eugene
>
>
>
> *From:* Durand fabrice via PacketFence-users [mailto:packetfence-users@
> lists.sourceforge.net ]
> *Sent:* Sunday, February 18, 2018 10:51 AM
> *To:* packetfence-users@lists.sourceforge.net
> *Cc:* Durand fabrice 
> *Subject:* Re: [PacketFence-users] Access to PF captive portal is blocked
>
>
>
> Hello Eugene,
>
> do you have the capture ?
>
> Regards
> Fabrice
>
> Le 2018-02-15 à 23:12, E.P. via PacketFence-users a écrit :
>
> Hi Fabrice,
>
> I dare to send it again, believing my previous email fell through the cracks.
>
> Can you please advise what could be wrong (see below)
>
>
>
> Eugene
>
>
>
>
>
> *From:* E.P. [mailto:ype...@gmail.com ]
> *Sent:* Wednesday, February 14, 2018 1:08 AM
> *To:* packetfence-users@lists.sourceforge.net
> *Subject:* Access to PF captive portal is blocked
>
>
>
> Hello folks,
>
> I really hope someone who ran into a similar problem will shed some light.
>
> 

[PacketFence-users] Hostapd causing duplicate Sign-Into-Network prompts intermittently

2018-03-06 Thread Ian MacDonald via PacketFence-users
We have packetfence 7.4 instances out-of-band running on Debian 8, and use
the captive portal with hostapd for WiFi client access.

Our clients register using an email source for activation.

Our configuration uses all captive_portal defaults except for the network
detection IP, shown below.

network_detection_initial_delay = 5s
network_detection_retry_delay = 2s
network_redirect_delay = 20s
secure_redirect=enabled
detection_mecanism_bypass = disabled
wispr_redirection = enabled

At some point in the past I recall our mobile clients used to see the
scrollbar for many seconds (10+?), which I believe was redirect_delay being
interrupted by network_detection; this would occur immediately after
submitting an email for activation.

Now on PF 7.4.0, hostapd 17.01.4 and with Samsung S8 clients, the delay
between submitting the email and activation onto the user/normal VLAN is
instantaneous (about 2 seconds according to our logs).   I suspect this
speedy interaction has something to do with WISPr and CoA working together
quickly, whereas previously we were maybe only doing captive portal.

The problem is that intermittently, the Android clients, after activation,
are returned to the Registration VLAN instead of the Normal/User VLAN,
which causes those clients to pop up the "Sign Into Network" WISPr prompt a
second time.  Clicking on it again loads a quick PF Network Error page
which disappears as quickly as it appears, and a second later the client is
dumped onto the Normal/User VLAN as it should have been originally.

This is confusing for end users, and most make it through. But not everyone
is happy.

Below is the working scenario, as seen in terms of tcpdump of radius
packets on hostapd.

PF server is 10.2.1.2 and the hostapd daemon is 10.2.1.11

Client Connects to WiFi Network and Presented with Email Activation in
Portal
12:03:58.157726 10.2.1.11.49281 > 10.2.1.2.1812: Access-Request (1)
12:03:58.211386 10.2.1.2.1812 > 10.2.1.11.49281: Access-Accept (2)
12:03:58.279846 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
12:03:58.325525 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
~30s later Client hits Activation button after entering email
12:04:28.735459 10.2.1.2.60410 > 10.2.1.11.3799: Disconnect-Request (40)
12:04:28.736847 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
12:04:28.737166 10.2.1.11.3799 > 10.2.1.2.60410: Disconnect-ACK (41)
12:04:28.762991 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
12:04:30.717764 10.2.1.11.49281 > 10.2.1.2.1812: Access-Request (1)
12:04:30.782158 10.2.1.2.1812 > 10.2.1.11.49281: Access-Accept (2)
12:04:30.906401 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
12:04:30.943903 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
~2s later Client is on normal/user VLAN with "Connected" message.

Corresponding log entries from the pf radius log showing Registration (81)
and Normal (76) vlans in concert with the Access-Request packets:

Mar  5 12:03:58 pf4 auth[1284]: [mac:a0:cc:2b:15:20:97] Accepted user:  and
returned VLAN 81
Mar  5 12:03:58 pf4 auth[1284]: (89) Login OK: [a0cc2b152097] (from client
10.2.1.11 port 0 cli a0:cc:2b:15:20:97)
Mar  5 12:04:30 pf4 auth[1284]: [mac:a0:cc:2b:15:20:97] Accepted user:  and
returned VLAN 76
Mar  5 12:04:30 pf4 auth[1284]: (90) Login OK: [a0cc2b152097] (from client
10.2.1.11 port 0 cli a0:cc:2b:15:20:97)

The hostapd radius logs align with PF and show the disconnect is followed
by the new vlan received in the Access-Accept (2) message

Mon Mar  5 12:03:58 2018 daemon.info hostapd: wlan0: STA a0:cc:2b:15:20:97
RADIUS: VLAN ID 81
Mon Mar  5 12:03:58 2018 daemon.info hostapd: wlan0: STA a0:cc:2b:15:20:97
RADIUS: starting accounting session 628C7DBAF341039F
Mon Mar  5 12:04:28 2018 daemon.info hostapd: wlan0: STA a0:cc:2b:15:20:97
RADIUS: stopped accounting session 628C7DBAF341039F
Mon Mar  5 12:04:30 2018 daemon.info hostapd: wlan0: STA a0:cc:2b:15:20:97
RADIUS: VLAN ID 76
Mon Mar  5 12:04:30 2018 daemon.info hostapd: wlan0: STA a0:cc:2b:15:20:97
RADIUS: starting accounting session 628C7DBAF341039F
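
The working sequence above (Disconnect-Request, then a fresh Access-Request)
can be checked mechanically against a capture. The following is a minimal
sketch, assuming lines formatted like the tcpdump excerpts in this post;
reauth_after_disconnect is a hypothetical helper, and it does not tie
packets to a particular client, so it is only meaningful on a capture of a
single station.

```python
import re

# Line format taken from the tcpdump excerpts in this post:
#   time  src > dst:  Packet-Name (code)
# Anything that does not match (commentary, wrapped text) is skipped.
_LINE = re.compile(
    r"^(\d\d:\d\d:\d\d\.\d+) (\S+) > (\S+): ([A-Za-z-]+) \((\d+)\)$"
)


def reauth_after_disconnect(lines):
    """Return True if every Disconnect-Request is later followed by an
    Access-Request, False if a disconnect is left without re-authentication."""
    pending = False
    for line in lines:
        m = _LINE.match(line.strip())
        if not m:
            continue
        name = m.group(4)
        if name == "Disconnect-Request":
            pending = True
        elif name == "Access-Request":
            pending = False
    return not pending


capture = [
    "12:04:28.735459 10.2.1.2.60410 > 10.2.1.11.3799: Disconnect-Request (40)",
    "12:04:28.737166 10.2.1.11.3799 > 10.2.1.2.60410: Disconnect-ACK (41)",
    "12:04:30.717764 10.2.1.11.49281 > 10.2.1.2.1812: Access-Request (1)",
]
print(reauth_after_disconnect(capture))
```

A capture that ends with a Disconnect-Request and only accounting traffic
afterwards would make the function return False.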


Here is what it looks like in the buggy scenario where the user is prompted
twice to "Sign into Network".  It looks like after the Disconnect-Request
(40) to change to the normal VLAN, the hostapd does not send the
Access-Request (1) but just uses the VLAN from the previous
Access-Accept(2).

Client Connects to WiFi Network and Presented with Email Activation in
Portal
15:11:35.269960 10.2.1.11.49281 > 10.2.1.2.1812: Access-Request (1)
15:11:35.326104 10.2.1.2.1812 > 10.2.1.11.49281: Access-Accept (2)
15:11:35.386984 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
15:11:35.427652 10.2.1.2.1813 > 10.2.1.11.45520: Accounting-Response (5)
~22s later Client hits Activation button after entering email
15:11:57.417798 10.2.1.2.58842 > 10.2.1.11.3799: Disconnect-Request (40)
15:11:57.420730 10.2.1.11.45520 > 10.2.1.2.1813: Accounting-Request (4)
15:11:57.424759 10.2.1.11.3799 > 

Re: [PacketFence-users] Wrong language after upgrade to 7.0.4

2018-03-06 Thread Ian MacDonald via PacketFence-users
Hubert,

I would assume you mean upgraded from 7.3.0 to 7.4.0;

I think by default, connection profiles support whatever locale your
browser requests.   Maybe confirm your local setup using mybrowserinfo.com
and make sure you are requesting the language you want.

Leaving the setting empty (locale=) will let your browser select the locale.

You should probably just start with a clone of the default connection
profile with no languages specified and see if the problem occurs there
with the correct browser locale, both in the Preview from the admin GUI as
well as when you hit the captive portal directly.

Compare different devices with your desired locale setting set to rule out
the client.

If the problem goes away with a clone of the default connection profile,
you can focus on your profile configuration and files, opting to find the
difference by direct comparison.

Your locale setting may also require subtags;  I'm not sure the bare "de"
language identifier will actually match one of the German locales. Check
that the notation is correct:
https://www.w3.org/TR/ltli/#language-terminology
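
To illustrate the concern, here is a minimal Python sketch of the difference
between an exact match and a language-only match. Whether PacketFence
compares locales exactly or by language prefix has not been verified here;
match_locale and the list of available locales are both hypothetical.

```python
def match_locale(configured, available):
    """Return (exact, by_language): locales equal to the configured string,
    and locales that merely share its primary language subtag."""
    exact = [loc for loc in available if loc == configured]
    lang = configured.replace("-", "_").split("_")[0].lower()
    by_language = [
        loc for loc in available
        if loc.replace("-", "_").split("_")[0].lower() == lang
    ]
    return exact, by_language


# Hypothetical list of installed locales, for illustration only.
available = ["en_US", "de_DE", "fr_FR"]
exact, by_language = match_locale("de", available)
print("exact:", exact)
print("by language:", by_language)
```

If the portal only accepts exact matches, a bare "de" would find nothing in
a list like this one, while "de_DE" would.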

I don't use the locale setting myself.

cheers,
Ian



On Wed, Feb 28, 2018 at 3:26 AM, Hubert Kupper via PacketFence-users <
packetfence-users@lists.sourceforge.net> wrote:

> Hello,
>
> I did an upgrade from 7.0.3 to 7.0.4. After the upgrade, the profile
> language is english while in pf.conf and profiles.conf the language is
> already set to "de".  Any ideas?
>
> Best regards,
> Hubert
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> PacketFence-users mailing list
> PacketFence-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/packetfence-users
>


[PacketFence-users] Fixed: Fingerbank DB will not initialize

2018-03-05 Thread Ian MacDonald via PacketFence-users
It seems patience was the only culprit here;  About 5 minutes after I sent
this, likely a good 10 minutes after I tried to re-initialize the DB and
re-import the Fingerbank data, I received an email from PF indicating
success.

Our last action to stop pf services, re-install fingerbank and then retry
the DB initialize and fingerbank import was successful.

To recap, we recently updated to 7.4.  On one instance, it seems our mysql
db connection info was left out of the fingerbank.conf.   This led to
errors like:

Mar  4 06:46:00 pf2 pfqueue: pfqueue(3135) ERROR: [mac:c0:d3:c0:59:b7:7b]
DBIx::Class::Storage::DBI::_prepare_sth(): DBI Exception: DBD::SQLite::db
prepare_cached failed: database disk image is malformed [for Statement
"SELECT COUNT( * ) FROM dhcp_fingerprint me WHERE ( value = ? )"] at
/usr/local/pf/lib/fingerbank/Base/CRUD.pm line 416

We added back the [mysql] section in fingerbank.conf and tested the
credentials with a db ping, but could not successfully re-import the
Fingerbank db into mysql using the GUI actions.

[mysql]
state=enabled
username=pf
password=
database=pf_fingerbank

pf2:/usr/local/pf/logs# mysqladmin ping -upf -p
Enter password:
mysqld is alive

Stopping the services and reinstalling fingerbank seemed to resolve the DB
import failing.


pf2:/usr/local/pf/logs# /usr/local/pf/bin/pfcmd service pf stop
...
pf2:/usr/local/pf/logs# service packetfence-config stop

pf2:/usr/local/pf/logs# apt-get install fingerbank --reinstall
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 1 reinstalled, 0 to remove and 0 not
upgraded.
Need to get 0 B/236 MB of archives.
After this operation, 0 B of additional disk space will be used.
(Reading database ... 86339 files and directories currently installed.)
Preparing to unpack .../fingerbank_3.1.1-1_all.deb ...
Unpacking fingerbank (3.1.1-1) over (3.1.1-1) ...
Setting up fingerbank (3.1.1-1) ...
DBIx::Class::Schema::Versioned::upgrade(): Upgrade not necessary at db/
upgrade.pl line 59
pf2:/usr/local/pf/logs#

Prior to this last step, we received the following email after trying to
re-initialize and import the fingerbank DB from the compliance settings GUI.

Subject: PF Alert: Failed to import Fingerbank data inside MySQL

Output of the command was : perl -I/usr/local/fingerbank/lib
-Mfingerbank::DB_Factory -e 'my () =
fingerbank::DB_Factory->instantiate(type => "MySQL", schema =>
"Upstream")->initialize_from_sqlite("/usr/local/fingerbank/db/fingerbank_Upstream.db")
; exit 1 if( != 200)'
Log4perl: Seems like no initialization happened. Forgot to call init()?
ERROR 1008 (HY000) at line 1: Can't drop database 'pf_fingerbank'; database
doesn't exist
Makefile:15: recipe for target 'init-mysql' failed
make: *** [init-mysql] Error 1

After the re-install we tried again, and received

Subject: PF Alert: Successfully imported the Fingerbank data inside MySQL

At no point did we receive any db initialization emails, so it is not clear
whether that step was ever executed or was part of the solution, but it was
dispatched from the GUI prior to both the failed and the successful imports.

cheers,
Ian

On Sun, Mar 4, 2018 at 3:00 PM, Ian MacDonald  wrote:

> Hello,
>
> We are running current v7.4 on Debian 8.
>
> On one of our servers, it seems the fingerbank db will not initialize
> properly.
>
> In our logs, we see similar to below:
>
> Mar  4 14:51:08 pf2 packetfence_httpd.aaa: httpd.aaa(1142) ERROR:
> [mac:fc:c2:de:9e:7e:3b] Unable to compute Fingerbank device information for
> fc:c2:de:9e:7e:3b. Device profiling rules relying on it will not work.
> (DBIx::Class::Storage::DBI::_dbh_execute(): Table 'pf_fingerbank.device'
> doesn't exist at /usr/local/pf/lib/fingerbank/Base/CRUD.pm line 416
> Mar  4 14:51:22 pf2 pfqueue: pfqueue(1268) ERROR: [mac:74:f0:6d:32:56:27]
> DBIx::Class::Storage::DBI::_dbh_execute(): Table 
> 'pf_fingerbank.dhcp_fingerprint'
> doesn't exist at /usr/local/pf/lib/fingerbank/Base/CRUD.pm line 416
>
> We tried to stop services and re-install fingerbank (apt --reinstall) with
> no luck.
>
> Inside the PF GUI, it suggests that it can successfully dispatch the
> initialize DB function, but it does not seem to result in creation of the
> tables.
>
> The mysql user can ping the DB from the command line as configured in
> fingerbank.conf.
>
> We can't seem to find any logs related to the initialization command from
> the GUI
>
> Any help appreciated.
>
> cheers,
> Ian
>
> pf2:/usr/local/pf/logs# ls -al ../../fingerbank/db
> total 5463520
> drwxrwxr-x 3 fingerbank fingerbank   4096 Mar  4 14:49 .
> drwxrwxr-x 8 fingerbank fingerbank   4096 Mar  4 14:49 ..
> -rw-rw-r-- 1 root   root   98 Dec  1 10:04 .gitignore
> -rw-rw-r-- 1 root   root504208005 Dec  1 10:04
> fingerbank_Combination_Map.json
> -rw-rw-r-- 1 fingerbank fingerbank  33792 May 26  2017
> fingerbank_Local.db
> -rw-rw-r-- 1 fingerbank fingerbank 1049072740 Dec  1 10:04
> fingerbank_Upstream.db
> 

[PacketFence-users] Fingerbank DB will not initialize

2018-03-05 Thread Ian MacDonald via PacketFence-users
Hello,

We are running current v7.4 on Debian 8.

On one of our servers, it seems the fingerbank db will not initialize
properly.

In our logs, we see similar to below:

Mar  4 14:51:08 pf2 packetfence_httpd.aaa: httpd.aaa(1142) ERROR:
[mac:fc:c2:de:9e:7e:3b] Unable to compute Fingerbank device information for
fc:c2:de:9e:7e:3b. Device profiling rules relying on it will not work.
(DBIx::Class::Storage::DBI::_dbh_execute(): Table 'pf_fingerbank.device'
doesn't exist at /usr/local/pf/lib/fingerbank/Base/CRUD.pm line 416
Mar  4 14:51:22 pf2 pfqueue: pfqueue(1268) ERROR: [mac:74:f0:6d:32:56:27]
DBIx::Class::Storage::DBI::_dbh_execute(): Table
'pf_fingerbank.dhcp_fingerprint' doesn't exist at
/usr/local/pf/lib/fingerbank/Base/CRUD.pm line 416

We tried to stop services and re-install fingerbank (apt --reinstall) with
no luck.

Inside the PF GUI, it suggests that it can successfully dispatch the
initialize DB function, but it does not seem to result in creation of the
tables.

The mysql user can ping the DB from the command line as configured in
fingerbank.conf.

We can't seem to find any logs related to the initialization command from
the GUI

Any help appreciated.

cheers,
Ian

pf2:/usr/local/pf/logs# ls -al ../../fingerbank/db
total 5463520
drwxrwxr-x 3 fingerbank fingerbank       4096 Mar  4 14:49 .
drwxrwxr-x 8 fingerbank fingerbank       4096 Mar  4 14:49 ..
-rw-rw-r-- 1 root       root               98 Dec  1 10:04 .gitignore
-rw-rw-r-- 1 root       root        504208005 Dec  1 10:04 fingerbank_Combination_Map.json
-rw-rw-r-- 1 fingerbank fingerbank      33792 May 26  2017 fingerbank_Local.db
-rw-rw-r-- 1 fingerbank fingerbank 1049072740 Dec  1 10:04 fingerbank_Upstream.db
-rw-r--r-- 1 pf         pf         2020635648 Nov 22 21:43 fingerbank_Upstream.db_20171122_214310
-rw-r--r-- 1 pf         pf         2020635648 Nov 23 21:43 fingerbank_Upstream.db_20171123_214310
-rw-rw-r-- 1 root       root             1072 Dec  1 10:04 sqlite3-to-mysql.py
drwxrwsr-x 2 root       root             4096 Mar  4 14:49 upgrade
-rwxrwxr-x 1 fingerbank fingerbank       2253 Dec  1 10:04 upgrade.pl


[PacketFence-users] Fix for PacketFence 7.4.0 Android WISPr

2018-02-22 Thread Ian MacDonald via PacketFence-users
Further to my last post,

A big thanks to William McLeod;  His response instantly fixed our issue
with Android 7.x clients.

The fix is simple to implement directly through the GUI.

It sounds like a small tweak to packetfence might be prudent, as I am sure
Android 7 is well within the scope of a large number of end users.

cheers,
Ian

On Thu, Feb 22, 2018 at 7:36 AM, William McLeod  wrote:

> Hi Guys,
>
>
>
> Sorry for sending this direct. I am not on the PacketFence mailing list
> yet and my colleague sent your post through to me.
>
>
>
> We had the same issue and I spent some time doing diagnostics on this.
>
>
>
> All Android 7.x phones I tested including my own Samsung S8 got the
> ‘Internet may not be available’ message. It seems to happen during initial
> connection when they carry out a DNS query for ‘www.google.com’,
> PacketFence sends back a CNAME=*captiveportal.fqdn* A=
> *registration.ipaddress*. For whatever reason Android 7.x doesn’t like
> this.
>
>
>
> To fix it I added ‘www.google.com’ to the DNS filter with an A record of
> 127.0.0.1 and the Android 7.x phones automatically load the captive portal,
> for reference here’s how:
>
>
>
> From Packetfence administration console, browse to 'Configuration' >
> 'Advanced Access Configuration' > 'Filter Engines' > 'DNS Filters', add the
> following lines of config at the bottom:
>
>
>
> [www_google_com]
> filter = qname
> operator = regex
> value = www.google.com
>
> [1:www_google_com]
> scope = registration
> answer = $qname 30 IN A 127.0.0.1
> rcode = NOERROR
>
>
>
> Then click 'SAVE'
>
>
>
> Regards,
>
>
>
>
>
> William McLeod
> Future Technology Group Director
>


Re: [PacketFence-users] packetfence-dhcpd.service restarting endlessly on PF 7.4.0 / Debian 9

2018-02-21 Thread Ian MacDonald via PacketFence-users
In addition to below,

It looks like this might be the issue linked below, previously resolved for
mariadb but possibly not for packetfence-dhcpd.

https://github.com/inverse-inc/packetfence/issues/2792

Our short term workaround was to just turn off the broken systemd process
notification by changing the service type to "simple", executing a systemd
daemon-reload and restarting the packetfence-dhcpd.service.

Seems to be good for now.

cheers,
Ian

On Wed, Feb 21, 2018 at 10:28 AM, Ian MacDonald  wrote:

> Hello,
>
> Our PF instance(s) run in an out-of-band configuration, providing a
> captive portal to hostapd/CoA enabled switches using connection profiles
> with SSID filters and an email source for activation.
>
> We are not clustering, but instead have a qa instance where we stage and
> test upgrades and configuration changes.
>
> Since at least pf 7.3.0 we have noticed that the dhcpd service is
> constantly restarting every couple of minutes.  We just completed our
> upgrade to 7.4.0 and we still continue to see this dhcpd service restarting
> every couple of minutes.  This service provides connectivity to the
> registration and isolation VLANs by default.
>
> In our logs we see the lines captured below; I have included a capture
> from both our instances which behave the same way.
>
> Our best guess is that for some reason the dhcpd service monitoring is not
> quite functioning correctly, and is restarting the service on a periodic
> basis.  The "Failed to start" message, even though it has started, seems to
> be an indication of some issue with the startup/systemd integration,
> where systemd is not properly detecting the service start and simply
> restarts it after the holdoff timer has expired.
>
> The net impact is that we believe this is causing some inconsistencies in
> portal registration / authorization timing that we may have previously
> worked around by increasing our redirect delay on activation.
>
> Now that we are on the latest version, we are posting here to see if there
> is a known quick fix while we continue to poke around with the dhcpd
> service startup configuration.
>
> cheers,
> Ian
>
> Instance 1:
> Feb 21 10:05:18 pf2 systemd[1]: packetfence-dhcpd.service start operation
> timed out. Terminating.
> Feb 21 10:05:18 pf2 systemd[1]: Failed to start PacketFence DHCPv4 Server
> Daemon.
> Feb 21 10:05:18 pf2 systemd[1]: Unit packetfence-dhcpd.service entered
> failed state.
> Feb 21 10:05:18 pf2 systemd[1]: packetfence-dhcpd.service holdoff time
> over, scheduling restart.
> Feb 21 10:05:18 pf2 systemd[1]: Stopping PacketFence DHCPv4 Server
> Daemon...
> Feb 21 10:05:18 pf2 systemd[1]: Starting PacketFence DHCPv4 Server
> Daemon...
> Feb 21 10:05:21 pf2 pfcmd[1952]: service|command
> Feb 21 10:05:21 pf2 pfcmd[1952]: dhcpd|config generated
>
> Instance 2:
> Feb 21 10:01:11 pf4 systemd[1]: packetfence-dhcpd.service start operation
> timed out. Terminating.
> Feb 21 10:01:11 pf4 systemd[1]: Failed to start PacketFence DHCPv4 Server
> Daemon.
> Feb 21 10:01:11 pf4 systemd[1]: Unit packetfence-dhcpd.service entered
> failed state.
> Feb 21 10:01:11 pf4 systemd[1]: packetfence-dhcpd.service holdoff time
> over, scheduling restart.
> Feb 21 10:01:11 pf4 systemd[1]: Stopping PacketFence DHCPv4 Server
> Daemon...
> Feb 21 10:01:11 pf4 systemd[1]: Starting PacketFence DHCPv4 Server
> Daemon...
> Feb 21 10:01:15 pf4 pfcmd[16671]: service|command
> Feb 21 10:01:15 pf4 pfcmd[16671]: dhcpd|config generated
>
>


[PacketFence-users] Checkup warning on legacy redirtimer option in PF 7.4.0

2018-02-21 Thread Ian MacDonald via PacketFence-users
Hello,

We have two pf instances running 7.4.0 on Debian 9, both recently upgraded
from 7.3.0.   They both have been running packetfence since 6.5.0.

In both cases, they have the following in pf.conf that has been retained
over the upgrades:

[fencing]
#
# trapping.redirtimer
#
# How long to display the progress bar during trap release. Default value is
# based on VLAN enforcement techniques. Inline enforcement only users could
# lower the value.
redirtimer=30s


Since upgrading to 7.4.0, we now see a warning when using the "Perform
check-up" function in the admin GUI that states  *WARNING: unknown
configuration parameter fencing.redirtimer*

A quick review of the upgrade doc did not reveal any steps related to this
change, but we do note that this setting seems to now exist with a
different name with the same description in pf.conf.defaults shown below.

# captive_portal.network_redirect_delay
#
# How long to display the progress bar during trap release. Default value is
# based on VLAN enforcement techniques. Inline enforcement only users could
# lower the value.
network_redirect_delay = 20s

It looks like it is a legacy option that possibly changed sections twice
(trapping->fencing->captive_portal) and names once
(redirtimer->network_redirect_delay) and was not modified in the upgrade
scripts cleanly along the way.

We are just removing the section from our pf.conf, but noting here in the
off chance that was in error and/or the intent was to have it cleaned up in
the upgrade scripts for other users.
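
For anyone wanting to check their own pf.conf for this leftover, here is a
minimal Python sketch. The rename mapping is taken from the observation
above and is an assumption, not an official PacketFence migration table;
find_legacy_options is a hypothetical helper, and it relies on pf.conf being
parseable as a plain INI file.

```python
import configparser

# Assumed mapping, based only on the observation in this post:
# fencing.redirtimer appears to have become captive_portal.network_redirect_delay.
RENAMED = {
    ("fencing", "redirtimer"): ("captive_portal", "network_redirect_delay"),
}


def find_legacy_options(conf_text):
    """Return (old_name, new_name, value) for each legacy option still set."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    hits = []
    for (section, key), (new_section, new_key) in RENAMED.items():
        if parser.has_option(section, key):
            hits.append((
                section + "." + key,
                new_section + "." + new_key,
                parser.get(section, key),
            ))
    return hits


sample = """[fencing]
#
# trapping.redirtimer
#
redirtimer=30s
"""
for old, new, value in find_legacy_options(sample):
    print("%s=%s is set; the current name appears to be %s" % (old, value, new))
```

The sketch only reports the leftover; whether to drop the value or carry it
over to the new name is a decision for the administrator.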

cheers,
Ian


[PacketFence-users] packetfence-dhcpd.service restarting endlessly on PF 7.4.0 / Debian 9

2018-02-21 Thread Ian MacDonald via PacketFence-users
Hello,

Our PF instance(s) run in an out-of-band configuration, providing a captive
portal to hostapd/CoA enabled switches using connection profiles with SSID
filters and an email source for activation.

We are not clustering, but instead have a qa instance where we stage and
test upgrades and configuration changes.

Since at least pf 7.3.0 we have noticed that the dhcpd service is
constantly restarting every couple of minutes.  We just completed our
upgrade to 7.4.0 and we still continue to see this dhcpd service restarting
every couple of minutes.  This service provides connectivity to the
registration and isolation VLANs by default.

In our logs we see the lines captured below; I have included a capture from
both our instances which behave the same way.

Our best guess is that for some reason the dhcpd service monitoring is not
quite functioning correctly, and is restarting the service on a periodic
basis.  The "Failed to start" message, even though it has started, seems to
be an indication of some issue with the startup/systemd integration,
where systemd is not properly detecting the service start and simply
restarts it after the holdoff timer has expired.

The net impact is that we believe this is causing some inconsistencies in
portal registration / authorization timing that we may have previously
worked around by increasing our redirect delay on activation.

Now that we are on the latest version, we are posting here to see if there
is a known quick fix while we continue to poke around with the dhcpd
service startup configuration.

cheers,
Ian

Instance 1:
Feb 21 10:05:18 pf2 systemd[1]: packetfence-dhcpd.service start operation
timed out. Terminating.
Feb 21 10:05:18 pf2 systemd[1]: Failed to start PacketFence DHCPv4 Server
Daemon.
Feb 21 10:05:18 pf2 systemd[1]: Unit packetfence-dhcpd.service entered
failed state.
Feb 21 10:05:18 pf2 systemd[1]: packetfence-dhcpd.service holdoff time
over, scheduling restart.
Feb 21 10:05:18 pf2 systemd[1]: Stopping PacketFence DHCPv4 Server Daemon...
Feb 21 10:05:18 pf2 systemd[1]: Starting PacketFence DHCPv4 Server Daemon...
Feb 21 10:05:21 pf2 pfcmd[1952]: service|command
Feb 21 10:05:21 pf2 pfcmd[1952]: dhcpd|config generated

Instance 2:
Feb 21 10:01:11 pf4 systemd[1]: packetfence-dhcpd.service start operation
timed out. Terminating.
Feb 21 10:01:11 pf4 systemd[1]: Failed to start PacketFence DHCPv4 Server
Daemon.
Feb 21 10:01:11 pf4 systemd[1]: Unit packetfence-dhcpd.service entered
failed state.
Feb 21 10:01:11 pf4 systemd[1]: packetfence-dhcpd.service holdoff time
over, scheduling restart.
Feb 21 10:01:11 pf4 systemd[1]: Stopping PacketFence DHCPv4 Server Daemon...
Feb 21 10:01:11 pf4 systemd[1]: Starting PacketFence DHCPv4 Server Daemon...
Feb 21 10:01:15 pf4 pfcmd[16671]: service|command
Feb 21 10:01:15 pf4 pfcmd[16671]: dhcpd|config generated
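
To gauge how often the loop fires, the "holdoff time over, scheduling
restart" lines can be counted per host and unit. This is a minimal Python
sketch, assuming syslog lines shaped like the excerpts above; count_restarts
is a hypothetical helper and other syslog layouts may need a different
pattern.

```python
import re
from collections import Counter

# Matches the first physical line of the "holdoff time over, scheduling
# restart" message as it appears in the excerpts above.
_HOLDOFF = re.compile(
    r"^\w{3}\s+\d+ [\d:]{8} (\S+) systemd\[1\]: (\S+\.service) holdoff time"
)


def count_restarts(log_lines):
    """Count scheduled restarts per (host, unit) pair."""
    counts = Counter()
    for line in log_lines:
        m = _HOLDOFF.match(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


lines = [
    "Feb 21 10:05:18 pf2 systemd[1]: packetfence-dhcpd.service holdoff time",
    "over, scheduling restart.",
    "Feb 21 10:01:11 pf4 systemd[1]: packetfence-dhcpd.service holdoff time",
    "over, scheduling restart.",
]
for (host, unit), n in sorted(count_restarts(lines).items()):
    print(host, unit, n)
```

Run over a full day of syslog, a count in the hundreds for a single unit
would match the "every couple of minutes" behaviour described above.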


[PacketFence-users] RESOLVED: Upgrading PF 6.5 to 7.0 haproxy not starting

2017-06-02 Thread Ian MacDonald via PacketFence-users
Louis,

Thanks for your input on the issue,  some responses to your request for
info below,


>> The main problem seems to be that the haproxy service is not starting.

>> In the syslog we just get a generic service failure with no details
>>
>> May 29 16:51:08 pf2 systemd[1]: Started PacketFence HAProxy Load Balancer.
>> May 29 16:51:08 pf2 systemd[1]: packetfence-haproxy.service: main process
>> exited, code=exited, status=1/FAILURE
>> May 29 16:51:08 pf2 systemd[1]: Unit packetfence-haproxy.service entered
>> failed state.
>> May 29 16:51:08 pf2 systemd[1]: packetfence-haproxy.service holdoff time
>> over, scheduling restart.
>> May 29 16:51:08 pf2 systemd[1]: Stopping PacketFence HAProxy Load Balancer...
>>

>Let's try a few things.
>
>First, can you please post the output to these commands:
>
># systemctl status packetfence-haproxy

pf2:~# systemctl status packetfence-haproxy -l
* packetfence-haproxy.service - PacketFence HAProxy Load Balancer
   Loaded: loaded (/lib/systemd/system/packetfence-haproxy.service; enabled)
   Active: failed (Result: start-limit) since Mon 2017-05-29 16:51:15 EDT;
3 days ago
  Process: 1031 ExecStart=/usr/sbin/haproxy-systemd-wrapper -f
/usr/local/pf/var/conf/haproxy.conf -p /usr/local/pf/var/run/haproxy.pid
(code=exited, status=1/FAILURE)
  Process: 977 ExecStartPre=/usr/local/pf/bin/pfcmd service haproxy
generateconfig (code=exited, status=0/SUCCESS)
 Main PID: 1031 (code=exited, status=1/FAILURE)

May 29 16:51:15 pf2 systemd[1]: packetfence-haproxy.service: main process
exited, code=exited, status=1/FAILURE
May 29 16:51:15 pf2 systemd[1]: Unit packetfence-haproxy.service entered
failed state.
May 29 16:51:15 pf2 haproxy-systemd-wrapper[1031]: haproxy-systemd-wrapper:
exit, haproxy RC=1
May 29 16:51:15 pf2 systemd[1]: packetfence-haproxy.service holdoff time
over, scheduling restart.
May 29 16:51:15 pf2 systemd[1]: Stopping PacketFence HAProxy Load
Balancer...
May 29 16:51:15 pf2 systemd[1]: Starting PacketFence HAProxy Load
Balancer...
May 29 16:51:15 pf2 systemd[1]: packetfence-haproxy.service start request
repeated too quickly, refusing to start.
May 29 16:51:15 pf2 systemd[1]: Failed to start PacketFence HAProxy Load
Balancer.
May 29 16:51:15 pf2 systemd[1]: Unit packetfence-haproxy.service entered
failed state.

># systemctl cat packetfence-haproxy

pf2:~# systemctl cat packetfence-haproxy
# /lib/systemd/system/packetfence-haproxy.service
[Unit]
Description=PacketFence HAProxy Load Balancer
Before=packetfence-httpd.portal.service packetfence-httpd.admin.service
Wants=packetfence-config.service

[Service]
StartLimitBurst=3
StartLimitInterval=60
PIDFile=/usr/local/pf/var/run/haproxy.pid
ExecStartPre=/usr/local/pf/bin/pfcmd service haproxy generateconfig
ExecStart=/usr/sbin/haproxy-systemd-wrapper -f
/usr/local/pf/var/conf/haproxy.conf -p /usr/local/pf/var/run/haproxy.pid
ExecReload=/bin/kill -USR2 $MAINPID
Restart=on-failure

[Install]
WantedBy=packetfence-base.target


> # ps -ef | grep haproxy

pf2:~# ps -ef | grep haproxy
root     11820 11782  0 10:16 pts/0    00:00:00 grep haproxy


>As to the configuration itself, look in /usr/local/pf/var/conf/haproxy.conf
>to see the configuration that is actually generated by the
>conf/haproxy.conf template.

We did peek in here and nothing jumped out at us.

>You can try running haproxy in debug mode to see what error messages may be
>lurking there:
>
># /usr/sbin/haproxy -f /usr/local/pf/var/conf/haproxy.conf -p /usr/local/pf/var/run/haproxy.pid -d

This was very helpful and immediately led us to conclude that the root
cause of our issue was a change to our certs that we had opportunistically
pushed out.  Is there a place in the docs that describes how to get these
debug outputs, to better help us help ourselves in the future?
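
For anyone else who hits the same "unable to load SSL private key from PEM
file" alert, here is a rough sketch of a sanity check on the bundle before
restarting the service.  It assumes, as the alert suggests, that haproxy
expects the certificate and the private key together in the one server.pem
file.  The function name pem_has_cert_and_key is made up for illustration
and is not part of PacketFence or HAProxy; it only looks for the PEM block
markers and does not verify that the key matches the certificate.

```shell
#!/bin/sh
# Hypothetical helper: report whether a PEM bundle contains both a
# certificate block and a private key block.
# Returns 0 if both are present, 1 if one is missing, 2 if unreadable.
pem_has_cert_and_key() {
    pem="$1"
    if [ ! -r "$pem" ]; then
        echo "cannot read $pem"
        return 2
    fi
    if ! grep -q -e '-----BEGIN CERTIFICATE-----' "$pem"; then
        echo "no certificate block in $pem"
        return 1
    fi
    # Matches "BEGIN PRIVATE KEY", "BEGIN RSA PRIVATE KEY",
    # "BEGIN EC PRIVATE KEY" and similar headers.
    if ! grep -q -e '-----BEGIN .*PRIVATE KEY-----' "$pem"; then
        echo "no private key block in $pem"
        return 1
    fi
    echo "ok: $pem has a certificate and a private key"
    return 0
}
```

Something like "pem_has_cert_and_key /usr/local/pf/conf/ssl/server.pem"
would then flag a bundle that lost its key.  After fixing the file, running
haproxy with -c in place of -d against the same generated haproxy.conf
should check the configuration without starting the proxy.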

pf2:~# /usr/sbin/haproxy -f /usr/local/pf/var/conf/haproxy.conf -p /usr/local/pf/var/run/haproxy.pid -d
[ALERT] 152/125205 (13132) : parsing
[/usr/local/pf/var/conf/haproxy.conf:110] : 'bind 10.4.2.2:443' : unable to
load SSL private key from PEM file '/usr/local/pf/conf/ssl/server.pem'.
[ALERT] 152/125205 (13132) : parsing
[/usr/local/pf/var/conf/haproxy.conf:156] : 'bind 10.4.3.2:443' :
'/usr/local/pf/conf/ssl/server.pem'.
[ALERT] 152/125205 (13132) : parsing
[/usr/local/pf/var/conf/haproxy.conf:202] : 'bind 10.4.1.2:443' : unable to
load SSL private key from PEM file '/usr/local/pf/conf/ssl/server.pem'.
[ALERT] 152/125205 (13132) : Error(s) found in configuration file :
/usr/local/pf/var/conf/haproxy.conf
[WARNING] 152/125205 (13132) : Proxy 'stats': in multi-process mode, stats
will be limited to process assigned to the current request.
[ALERT] 152/125205 (13132) : Proxy 'portal-https-10.4.2.2': no SSL
certificate specified for bind '10.4.2.2:443' at
[/usr/local/pf/var/conf/haproxy.conf:110] (use 'crt').
[ALERT] 152/125205 (13132) : Proxy 'portal-https-10.4.3.2': no SSL
certificate specified for bind '10.4.3.2:443' at
[/usr/local/pf/var/conf/haproxy.conf:156] (use 'crt').