To close this thread off: the package issues are being handled in https://github.com/inverse-inc/packetfence/issues/7473
On the 1813 bindings: it looks like I simply had not noticed that there are now two accounting options, pfacct and radius-acct. Although both are present, radius-acct is only used when needed for heavy lifting in special use cases with EAP or non-standard RADIUS commands; otherwise it has been replaced by the newer, lightweight pfacct, which suits my use case. I must have decided to turn radius-acct on in my test environment, thinking I was using it (at some point in the past, before 11, I was). Anyway, it doesn't start up a listener right away other than on localhost. Just user error.

On Sun, Jan 22, 2023 at 4:41 PM Ian MacDonald <i...@netstatz.com> wrote:

> In an upgrade of a PF environment from 11.0 to 12.1, following the upgrade
> instructions, we stumbled on a bug due to some package dependencies.
>
> These redundant packages are being tracked in a bug here
> https://github.com/inverse-inc/packetfence/issues/7246 however they cause
> an upgrade from 11.0 to 12.1 to fail.
>
> From an up-to-date 11.0 instance running on Debian 11.6, we ran the upgrade
> script, following prompts etc.
>
> # /usr/local/pf/addons/full-upgrade/run-upgrade.sh
>
> The process ended due to the package conflict, but we were able to resolve
> the conflicts, getting apt sorted out, and then resume the upgrade script,
> which completed the DB upgrades and appeared to exit cleanly.
>
> # dpkg --force-depends -r packetfence-captive-portal-javascript
> # apt --fix-broken install
> # /usr/local/pf/addons/full-upgrade/run-upgrade.sh
>
> However, it did not leave us with a working configuration. The
> radius-acct binding is not working properly; we are not really sure what
> else might be a problem, so we will likely have to restage to 11.0 and try
> again once we figure out an upgrade strategy.
>
> Previously we had the same issue in another environment going from 11.0
> -> 11.1 -> 11.2 -> 12.1, where the upgrade process was interrupted by a
> problem with our firewall causing timeouts during the docker pull. In that
> case, there was no package issue, but we were restarting the upgrade script.
> The result was the same binding issue, and in that case we just decided to
> restage to 12.1 rather than figure it out, as our goal was to complete
> testing of the captive portal on 12.1.
>
> Our hypothesis is that interruption of the upgrade wrapper script, in one
> case due to a timeout on docker image pulls, and in this latest case from
> package dependencies, led to a 12.1 environment with this service binding
> issue.
>
> Our next step will be to restage 11.0 -> 12.1 without interruption by
> starting with
>
> # dpkg --force-depends -r packetfence-captive-portal-javascript
>
> to see if we can get to 12.1 with radius-acct/pfacct bindings in proper
> working order.
>
> Below are some details, in case there is an obvious quick fix that might
> be useful for anyone going through the upgrade process with similar
> results, or that might save us the time to restage later this month.
>
> cheers,
> Ian
>
> We did some quick comparisons to try to find the difference in systemctl
> packetfence units and configuration between the working and non-working
> 12.1 systems. We grepped config and systemctl units for "1813" and did not
> find any differences, but we know it's there somewhere.
>
> On the working PF 12.1 instance, staged directly to 12.1, the binding
> looks like the following, where 10.2.1.2 is the management network where
> switches connect to radius-acct.
>
> pf4:/usr/local/pf/conf# netstat -tunap | grep ":1813"
> udp 0 0 10.2.1.2:1813 0.0.0.0:* 42375/pfacct
>
> On the non-working instance the binding seems to be to localhost:
>
> pf3:/usr/local/pf/conf# netstat -tunap | grep ":1813"
> udp 0 0 127.0.0.1:1813 0.0.0.0:* 1942/pfacct
>
> We can see radius-acct complaining in the logs as well, and working hard to
> restart what we believe is a docker container with radius running inside
> that the actual binding should be pointed at.
>
> radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]: Sun Jan 22 15:48:16 2023 : Error: Failed binding to acct address * port 1813 bound to server packetfence: Address already in use
> radius-acct.log:Jan 22 15:48:16 pf3 radiusd-acct-docker-wrapper[93781]: Sun Jan 22 15:48:16 2023 : Error: /usr/local/pf/raddb/acct.conf[8]: Error binding to port for 0.0.0.0 port 1813
> radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]: Error: No such container: radiusd-acct
> radius-acct.log:Jan 22 15:48:19 pf3 radiusd-acct-docker-wrapper[93890]: Error: No such container: 1
>
> If we stop the pfacct service only
>
> systemctl stop packetfence-pfacct.service
>
> this allows the docker binding to work, but now the pfacct process cannot
> be restarted, and the binding appears to then be directly with freeradius,
> rather than pfacct, and on all interfaces rather than just the registration
> VLAN.
>
> pf3:/usr/local/pf/logs# netstat -tunap | grep ":1813"
> udp 0 0 0.0.0.0:1813 0.0.0.0:* 93957/freeradius
>
> Without more knowledge of the difference between the PF Go (pfacct) and
> Freeradius processes and how the new docker container bindings work, it
> looks like restaging and retrying the upgrade is probably the next step
> for us.
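
For anyone comparing instances the same way, the check in the quoted message (which process holds port 1813, and on which address) can be scripted. Below is a minimal Python sketch, not part of PacketFence: the function names parse_listeners and classify are made up for illustration, and it assumes the `netstat -tunap` column layout shown above, with PID/Program as the last column and no spaces in the program name.

```python
def parse_listeners(netstat_output, port=1813):
    """Return (address, program) pairs for sockets bound to the given local port.

    Assumes `netstat -tunap` rows: proto, recv-q, send-q, local address,
    foreign address, optional state, PID/Program name (last column).
    """
    listeners = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a udp/tcp socket row.
        if len(fields) < 5 or not fields[0].startswith(("udp", "tcp")):
            continue
        # Split on the last colon so IPv6 forms such as ":::1813" also work.
        addr, _, local_port = fields[3].rpartition(":")
        if local_port != str(port):
            continue
        owner = fields[-1]  # e.g. "42375/pfacct", or "-" without root
        program = owner.split("/", 1)[1] if "/" in owner else owner
        listeners.append((addr, program))
    return listeners


def classify(addr):
    """Describe how widely a local bind address is reachable."""
    if addr.startswith("127.") or addr == "::1":
        return "loopback only"
    if addr in ("0.0.0.0", "::", "*"):
        return "all interfaces"
    return "specific interface"


if __name__ == "__main__":
    # Sample rows copied from the three netstat outputs in this thread.
    sample = "\n".join([
        "udp 0 0 10.2.1.2:1813 0.0.0.0:* 42375/pfacct",
        "udp 0 0 127.0.0.1:1813 0.0.0.0:* 1942/pfacct",
        "udp 0 0 0.0.0.0:1813 0.0.0.0:* 93957/freeradius",
    ])
    for addr, program in parse_listeners(sample):
        print(f"{program} on {addr}: {classify(addr)}")
```

On a live host the input would come from running `netstat -tunap` as root (so the PID/Program column is populated) and passing its output to parse_listeners. As noted at the top of this message, a loopback-only pfacct binding turned out to be user error rather than an upgrade fault, so the script only reports what is bound and does not judge whether it is correct.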