Zammit,

No proxy. The portal listener runs on the management VLAN behind an OpenWRT firewall, which forwards ports 80/443 to the PacketFence portal. iPerf shows no bandwidth issues whatsoever in our lab, and we have never seen any other connection issues, whether during apt upgrades or inbound portal connections for authentication/login. The caveat is that this is a lab environment, which only sees real use when upgrades and testing are underway.

Unfortunately, we have not upgraded our Prod yet; that is waiting until we get the other pesky mobile login / detection issue from my other thread sorted. Prod is a twin of the lab setup with all the same pieces, but on a geographically separate connection and location. If the timeouts repeat there, that would further rule out anything network related. If Prod is fine, we would be looking squarely at the specific DNS and NAT pieces whose versions differ between our lab and Prod, which amount to a Xen hypervisor, an OpenWRT firewall and a PowerDNS recursor beyond physical L2 connectivity.

The IPv6 lookup had me a bit baffled, and I am wondering if there is something unique about the DNS lookups for the image pulls. The timeout appears to happen on DNS frames (port 53) in most cases, which might point us at an actual issue with timeouts at the DNS resolver. The resolver is ours, running PowerDNS and doing about 40-50 queries per second for other IPv4-only hosts, none of which are reporting any DNS issues.

Since the retry attempts eventually work, and upgrades only happen once in a while, we have not worked very hard to figure it out. There is probably more debugging we could be doing to observe an actual DNS timeout by isolating the traffic during the image pull phase of the upgrade. We would probably be able to do that going from 11.1 to 12.0 to 12.1, if we see this during our 11.0 to 11.1 upgrade of Prod, after we fix the mobile device captive portal on this 12.1 instance.
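
As a rough idea of what that isolation could look like, here is a minimal Python sketch (standard library only) that sends A and AAAA queries for ghcr.io straight to a resolver over UDP and counts timeouts per record type. It queries both types because the IPv6 dial error suggests an AAAA answer was resolved at least once. The function names, the 2-second timeout, the round count and the interval are all assumptions made for illustration; this is not how the Docker daemon or PacketFence performs lookups, only a way to exercise the same resolver from the same host.

```python
import socket
import struct
import time

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record


def build_query(name: str, qtype: int, query_id: int) -> bytes:
    """Build a minimal DNS query packet: one question, recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id & 0xFFFF, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


def time_query(resolver, name, qtype, timeout=2.0, query_id=0x1234):
    """Send one UDP query; return elapsed seconds, or None on timeout."""
    packet = build_query(name, qtype, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(packet, (resolver, 53))
        try:
            while True:
                data, _ = sock.recvfrom(4096)
                # Accept only a reply whose ID matches the query we sent
                if data[:2] == packet[:2]:
                    return time.monotonic() - start
        except socket.timeout:
            return None


def probe(resolver, name="ghcr.io", rounds=100, interval=0.5):
    """Alternate A and AAAA queries and count timeouts per record type."""
    timeouts = {"A": 0, "AAAA": 0}
    for i in range(rounds):
        for label, qtype in (("A", QTYPE_A), ("AAAA", QTYPE_AAAA)):
            if time_query(resolver, name, qtype, query_id=i) is None:
                timeouts[label] += 1
                print(f"{time.strftime('%H:%M:%S')} {label} {name}: timeout")
        time.sleep(interval)
    return timeouts
```

Running `probe("105.244.196.155")` on the PF host while an image pull is in progress, and again while the host is idle, would show whether the recursor drops queries only under the pull's load and whether A and AAAA lookups behave differently.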

cheers,
Ian

pf4:~# iperf3 -c 105.244.196.67
Connecting to host 105.244.196.67, port 5201
[  5] local 10.2.1.2 port 46870 connected to 105.244.196.67 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   284 MBytes  2.38 Gbits/sec    0   1.56 MBytes
[  5]   1.00-2.00   sec   309 MBytes  2.59 Gbits/sec    0   1.56 MBytes
[  5]   2.00-3.00   sec   314 MBytes  2.63 Gbits/sec    0   1.56 MBytes

pf4:~# iperf3 -R -c 105.244.196.67
Connecting to host 105.244.196.67, port 5201
Reverse mode, remote host 105.244.196.67 is sending
[  5] local 10.2.1.2 port 51046 connected to 105.244.196.67 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   280 MBytes  2.35 Gbits/sec
[  5]   1.00-2.00   sec   333 MBytes  2.80 Gbits/sec
[  5]   2.00-3.00   sec   315 MBytes  2.64 Gbits/sec

On Tue, Jan 10, 2023 at 4:06 PM Zammit, Ludovic <luza...@akamai.com> wrote:

> Hello Ian,
>
> Is your PF server behind an HTTP proxy?
>
> Thanks,
>
> Ludovic Zammit
> Product Support Engineer Principal Lead
> Akamai Technologies - Inverse
>
> On Jan 10, 2023, at 3:27 PM, Ian MacDonald via PacketFence-users
> <packetfence-users@lists.sourceforge.net> wrote:
>
> Hey PF Users,
>
> For recent versions, I believe 11.1, 12.0 and now 12.1, and possibly 11.0
> (fairly certain, since the images below were downloaded from the Inverse
> repos all at once during the installation or upgrade process), we have had
> to restart the upgrade process due to timeout-related errors. Often 1-5
> re-executions are required to complete the upgrade process.
>
> - proxysql
> - haproxy-portal
> - pfsso
> - radiusd-eduroam
> - httpd.aaa
> - radiusd-cli
> - pfconfig
> - fingerbank-db
> - pfcmd
> - radiusd-load-balancer
> - httpd.admin_dispatcher
> - radiusd-acct
> - pfpki
> - httpd.portal
> - httpd.dispatcher
> - pfcron
> - pfconnector
> - httpd.webservices
> - radiusd-auth
> - haproxy-admin
> - pfqueue
> - api-frontend
> - pfperl-api
>
> We do not really understand why, as there do not appear to be any
> connectivity or DNS lookup issues that would cause this type of behavior.
> Below are some of the output lines captured during a recent upgrade from
> 11.1 to 12.0 and then again from 12.0 to 12.1.
>
> In a minor 12.0 upgrade we saw this one referencing an IPv6 GitHub
> address, yet the system is IPv4, so we have no idea why it is attempting
> IPv6:
>
> error pulling image configuration: Get "https://pkg-containers.githubusercontent.com/ghcr1/blobs/sha256:c31d236d97e3beb137f8c2b02bfbe88d0093b5592d9f181935c9c03a0132a142?se=2023-01-10T14%3A40%3A00Z&sig=%2BHBahj6l0521Bm%2FB40v51MhZmNHztLYxzxBgJlsefEE%3D&sp=r&spr=https&sr=b&sv=2019-12-12": dial tcp [2606:50c0:8001::154]:443: connect: network is unreachable
>
> In another 12.0 upgrade attempt, we saw this one, which looks like a
> timeout to our DNS recursor, although we can see no I/O-bound or
> restrictive conditions there.
>
> error pulling image configuration: Get "https://ghcr.io/v2/inverse-inc/packetfence/pfcmd/blobs/sha256:5631317df2b6910aa8da1f20a382c04ecc0ffb572aeb7fd3201a18b0bee18633": dial tcp: lookup ghcr.io on 105.244.196.155:53: read udp 10.2.1.2:35975->105.244.196.155:53: i/o timeout
>
> In 12.1 using the do-upgrade script we saw these similar messages
>
> Error response from daemon: Get "https://ghcr.io/v2/": dial tcp: lookup ghcr.io on 105.244.196.155:53: read udp 10.2.1.2:60065->105.244.196.155:53: i/o timeout
>
> Error response from daemon: Head "https://ghcr.io/v2/inverse-inc/packetfence/radiusd-eduroam/manifests/maintenance-12-1": dial tcp: lookup ghcr.io on 104.244.196.155:53: read udp 10.2.1.2:55582->104.244.196.155:53: i/o timeout
>
> We just repeated, which seemed like a good time to send this email, and it
> worked (3rd time just now on 12.1).
>
> Tue Jan 10 15:22:56 EST 2023 - Pull of images finished
> Tue Jan 10 15:22:58 EST 2023 - Tag of images finished
> Tue Jan 10 15:23:45 EST 2023 - Previous images cleaned
>
> It seems very odd that we get these timeouts when doing the image
> downloading. Maybe somebody else has seen this or knows why it may be
> occurring during this stage of the installation/upgrade process.
>
> cheers,
> Ian
> _______________________________________________
> PacketFence-users mailing list
> PacketFence-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/packetfence-users