To close this thread off:

The package issues are being handled in
https://github.com/inverse-inc/packetfence/issues/7473

On the 1813 bindings: it looks like I simply had not noticed that there
are now two accounting options, pfacct and radius-acct. Although both are
present, radius-acct is only used when needed for heavy lifting in special
use cases with EAP or non-standard RADIUS commands; otherwise it has been
replaced by the newer, lightweight pfacct, which suits my use case. I must
have turned radius-acct on in my test environment, thinking I was using it
(at some point in the past, pre-11, I was). Anyway, it doesn't start up a
listener right away other than on localhost. Just user error.
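
To make the three bindings seen in this thread easier to tell apart, here is a minimal sketch that labels each UDP 1813 listener in netstat-style output by owning process and scope. The function name classify_acct_listeners and the scope labels are my own inventions for illustration, not anything shipped with PacketFence, and it assumes the column layout shown in the quoted netstat output.

```shell
#!/bin/sh
# classify_acct_listeners (hypothetical helper, not part of PacketFence):
# reads `netstat -tunap` style lines on stdin and prints, for every UDP
# socket bound to port 1813, "<program> <address> <scope>".
classify_acct_listeners() {
    awk '
        $1 == "udp" && $4 ~ /:1813$/ {
            addr = $4
            sub(/:1813$/, "", addr)        # strip the port from the local address
            split($NF, parts, "/")         # last field looks like PID/program
            prog = parts[2]
            if (addr == "127.0.0.1")       scope = "loopback-only"
            else if (addr == "0.0.0.0")    scope = "all-interfaces"
            else                           scope = "specific-address"
            printf "%s %s %s\n", prog, addr, scope
        }
    '
}

# Usage (as root, so netstat can show program names):
#   netstat -tunap | classify_acct_listeners
```

On the instance where the binding looked wrong, this should report pfacct as loopback-only; on the instance staged directly to 12.1, it should report pfacct on the management address.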


On Sun, Jan 22, 2023 at 4:41 PM Ian MacDonald <i...@netstatz.com> wrote:

> In an upgrade of a PF environment from 11.0 to 12.1, following the upgrade
> instructions, we stumbled on a bug due to some package dependencies.
>
> These redundant packages are being tracked in a bug here
> https://github.com/inverse-inc/packetfence/issues/7246; however, they
> cause an upgrade from 11.0 to 12.1 to fail.
>
> From an up-to-date 11.0 instance running on Debian 11.6, we ran the
> upgrade script, following the prompts.
>
> # /usr/local/pf/addons/full-upgrade/run-upgrade.sh
>
> The process ended due to the package conflict, but we were able to resolve
> the conflicts, getting apt sorted out, and then resume the upgrade script,
> which completed the DB upgrades and appeared to exit cleanly.
>
> # dpkg --force-depends -r packetfence-captive-portal-javascript
> # apt --fix-broken install
> # /usr/local/pf/addons/full-upgrade/run-upgrade.sh
>
> However, it did not leave us with a working configuration. The
> radius-acct binding is not working properly, and we are not sure what
> else might be a problem, so we will likely have to restage to 11.0 and
> try again once we figure out an upgrade strategy.
>
> Previously we had the same issue in another environment going from
> 11.0 -> 11.1 -> 11.2 -> 12.1, where the upgrade process was interrupted
> by a problem with our firewall causing timeouts during the docker pull.
> In that case there was no package issue, but we did restart the upgrade
> script. The result was the same binding issue, and we decided to restage
> to 12.1 rather than figure it out, as our goal was to complete testing of
> the captive portal on 12.1.
>
> Our hypothesis is that interruption of the upgrade wrapper script, in one
> case due to a timeout on docker image pulls, and in this latest case from
> package dependencies, led to a 12.1 environment with this service binding
> issue.
>
> Our next step will be to restage 11.0 -> 12.1 without interruption by
> starting with
>
> # dpkg --force-depends -r packetfence-captive-portal-javascript
>
> to see if we can get to 12.1 with radius-acct/pfacct bindings in proper
> working order.
>
> Below are some details, in case there is an obvious quick fix that might
> be useful for anyone going through the upgrade process with similar
> results, or that could save us the time of restaging later this month.
>
> cheers,
> Ian
>
> We did some quick comparisons to try to find the difference in the
> packetfence systemctl units and configuration between the working and
> non-working 12.1 systems.  We grepped the config and systemctl units for
> "1813" and did not find any differences, but we know it's there somewhere.
>
> On the working PF 12.1 instance, staged directly to 12.1, the binding
> looks like the following, where 10.2.1.2 is the management network where
> switches connect to radius-acct.
>
> pf4:/usr/local/pf/conf# netstat -tunap | grep ":1813"
> udp        0      0 10.2.1.2:1813           0.0.0.0:*                           42375/pfacct
>
> On the non-working instance the binding seems to be to localhost:
>
> pf3:/usr/local/pf/conf# netstat -tunap | grep ":1813"
> udp        0      0 127.0.0.1:1813          0.0.0.0:*                           1942/pfacct
>
>
> We can see radius-acct complaining in the logs as well, working hard to
> restart what we believe is a docker container with radius running inside,
> which is where the actual binding should point.
>
> radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]: Sun Jan 22 15:48:16 2023 : Error: Failed binding to acct address * port 1813 bound to server packetfence: Address already in use
> radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]: Sun Jan 22 15:48:16 2023 : Error: /usr/local/pf/raddb/acct.conf[8]: Error binding to port for 0.0.0.0 port 1813
> radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]: Error: No such container: radiusd-acct
> radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]: Error: No such container: 1
>
> If we stop the pfacct service only
>
> systemctl stop packetfence-pfacct.service
>
> this allows the docker binding to work, but now the pfacct process cannot
> be restarted, and the binding then appears to be directly with freeradius
> rather than pfacct, and on all interfaces rather than just the
> registration VLAN.
>
> pf3:/usr/local/pf/logs# netstat -tunap | grep ":1813"
> udp        0      0 0.0.0.0:1813            0.0.0.0:*                           93957/freeradius
>
> Without more knowledge of the difference between the PF Go (pfacct) and
> FreeRADIUS processes and how the new docker container bindings work, it
> looks like restaging and retrying the upgrade is probably the next step
> for us.
>
>
>
>
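
For anyone who hits the same symptom as in the quoted message, the port conflict can be spotted from the log alone. Below is a rough sketch that pulls out the ports radius-acct failed to bind because they were already in use. The function name bind_conflicts is hypothetical, the match strings are taken from the log lines quoted above, and the log path in the usage comment is an assumption based on the prompt shown in the quoted netstat commands.

```shell
#!/bin/sh
# bind_conflicts (hypothetical helper, not part of PacketFence):
# reads radius-acct log lines on stdin and prints each distinct port that
# failed to bind with "Address already in use".
bind_conflicts() {
    grep 'Failed binding to' |
        grep 'Address already in use' |
        sed -n 's/.* port \([0-9][0-9]*\) bound to server.*/\1/p' |
        sort -u
}

# Usage (log location assumed, adjust to your install):
#   bind_conflicts < /usr/local/pf/logs/radius-acct.log
```

If this prints 1813 while netstat shows pfacct holding 127.0.0.1:1813, that matches the situation described in this thread, where both accounting services were enabled at once.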
_______________________________________________
PacketFence-users mailing list
PacketFence-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/packetfence-users
