While upgrading a PacketFence (PF) environment from 11.0 to 12.1 by following the upgrade instructions, we ran into a bug caused by package dependencies.
The redundant packages are tracked in https://github.com/inverse-inc/packetfence/issues/7246, but they cause an upgrade from 11.0 to 12.1 to fail.

From an up-to-date 11.0 instance running on Debian 11.6, we ran the upgrade script and followed the prompts:

    # /usr/local/pf/addons/full-upgrade/run-upgrade.sh

The process stopped on the package conflict. We resolved the conflict, got apt sorted out, and then resumed the upgrade script, which completed the DB upgrades and appeared to exit cleanly:

    # dpkg --force-depends -r packetfence-captive-portal-javascript
    # apt --fix-broken install
    # /usr/local/pf/addons/full-upgrade/run-upgrade.sh

However, it did not leave us with a working configuration. The radius-acct binding is not working properly, and we are not sure what else might be broken, so we will likely have to restage to 11.0 and try again once we have worked out an upgrade strategy.

Previously we had the same issue in another environment going 11.0 -> 11.1 -> 11.2 -> 12.1, where the upgrade was interrupted by a problem with our firewall causing timeouts during the docker pull. In that case there was no package issue, but we did restart the upgrade script. The result was the same binding issue, and we decided to restage to 12.1 rather than figure it out, since our goal was to complete testing of the captive portal on 12.1.

Our hypothesis is that interrupting the upgrade wrapper script, in one case because of a timeout on docker image pulls and in this latest case because of package dependencies, led to a 12.1 environment with this service binding issue. Our next step will be to restage 11.0 -> 12.1 without interruption, starting with

    # dpkg --force-depends -r packetfence-captive-portal-javascript

to see if we can get to 12.1 with the radius-acct/pfacct bindings in proper working order.
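For anyone wanting to rehearse that retry before touching a host, here is a minimal dry-run sketch of the order of operations. It only prints the commands it would run. The function names (is_installed_status, plan_upgrade) are made up for illustration, and the assumption that removing the package first is enough to avoid the interruption is exactly the hypothesis we still have to test.

```shell
#!/bin/sh
# Dry-run sketch only: prints the planned commands, executes nothing.
# PKG and UPGRADE are taken from the commands quoted in this thread.
PKG="packetfence-captive-portal-javascript"
UPGRADE="/usr/local/pf/addons/full-upgrade/run-upgrade.sh"

# Return 0 if a dpkg status string means the package is fully installed.
# "install ok installed" is what dpkg-query reports for ${Status}.
is_installed_status() {
    [ "$1" = "install ok installed" ]
}

# Print the commands in the order we intend to run them.
# $1 is the dpkg status string for PKG (empty if it is not installed).
plan_upgrade() {
    if is_installed_status "$1"; then
        echo "dpkg --force-depends -r $PKG"
    fi
    echo "$UPGRADE"
}

# On a real host the status could be fed in like this (left commented out
# so the sketch stays free of side effects):
# plan_upgrade "$(dpkg-query -W -f='${Status}' "$PKG" 2>/dev/null)"
```

If the package is already gone, the plan collapses to just the upgrade script, so the same sketch can be re-run safely after a partial attempt.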
Below are some details, in case there is an obvious quick fix that might help anyone going through the upgrade process with similar results, or save us the time to restage later this month.

cheers,
Ian

We did some quick comparisons to try to find the difference in the packetfence systemd units and configuration between the working and non-working 12.1 systems. We grepped the config and the units for "1813" and did not find any differences, but we know it is there somewhere.

On the working PF 12.1 instance, staged directly to 12.1, the binding looks like the following, where 10.2.1.2 is on the management network where switches connect to radius-acct:

    pf4:/usr/local/pf/conf# netstat -tunap | grep ":1813"
    udp 0 0 10.2.1.2:1813 0.0.0.0:* 42375/pfacct

On the non-working instance the binding seems to be to localhost:

    pf3:/usr/local/pf/conf# netstat -tunap | grep ":1813"
    udp 0 0 127.0.0.1:1813 0.0.0.0:* 1942/pfacct

We can also see radius-acct complaining in the logs and working hard to restart what we believe is a docker container with radius running inside, which is what the actual binding should point at.
    radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]: Sun Jan 22 15:48:16 2023 : Error: Failed binding to acct address * port 1813 bound to server packetfence: Address already in use
    radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]: Sun Jan 22 15:48:16 2023 : Error: /usr/local/pf/raddb/acct.conf[8]: Error binding to port for 0.0.0.0 port 1813
    radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]: Error: No such container: radiusd-acct
    radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]: Error: No such container: 1

If we stop only the pfacct service

    systemctl stop packetfence-pfacct.service

this allows the docker binding to work, but now the pfacct process cannot be restarted, and the binding then appears to be directly with freeradius rather than pfacct, and on all interfaces rather than just the registration VLAN.

    pf3:/usr/local/pf/logs# netstat -tunap | grep ":1813"
    udp 0 0 0.0.0.0:1813 0.0.0.0:* 93957/freeradius

Without more knowledge of the difference between the PF Go (pfacct) and FreeRADIUS processes and of how the new docker container bindings work, restaging and retrying the upgrade looks like the next step for us.
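The three bindings we saw (management address, localhost, all interfaces) can be told apart mechanically, which may help when comparing hosts. The following is a rough sketch, not anything shipped with PacketFence: classify_acct_binding is a hypothetical helper, and it assumes the UDP line layout shown in the netstat output above, where the local address is the fourth field and PID/program is the last.

```shell
#!/bin/sh
# Sketch: classify who is bound to UDP port 1813, from `netstat -tunap` lines
# read on stdin. Prints "<verdict> <local-address> <program>" per matching line.
# The verdict labels are our own naming, not PacketFence terminology.
classify_acct_binding() {
    awk '$1 ~ /^udp/ && $4 ~ /:1813$/ {
        addr = $4
        sub(/:1813$/, "", addr)          # strip the port, keep the address
        n = split($NF, parts, "/")       # last field looks like PID/program
        prog = parts[n]
        if (addr == "127.0.0.1")
            verdict = "loopback"
        else if (addr == "0.0.0.0" || addr == "::")
            verdict = "all-interfaces"
        else
            verdict = "specific-address"
        print verdict, addr, prog
    }'
}

# Example use on a live host (commented out; needs root to see program names):
# netstat -tunap | classify_acct_binding
```

On our hosts, the working instance would come out as "specific-address 10.2.1.2 pfacct" and the broken one as "loopback 127.0.0.1 pfacct". Whether a loopback binding is always wrong is an assumption on our part; it is simply what differs between our two systems.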
_______________________________________________
PacketFence-users mailing list
PacketFence-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/packetfence-users