Re: Strange network issues with -current

2023-08-15 Thread Alexander Leidinger

On 2023-08-15 14:24, Alexander Leidinger wrote:

On 2023-08-15 13:48, Alexander Leidinger wrote:

for a while now I have had some strange network issues in some parts of
a particular system.


I just stumbled upon the mail which discusses issues with commit 
e3ba0d6adde3, and when I look into this I see changes related to the 
use of SO_REUSEPORT flags, and all my nginx systems use the reuseport 
directive in their config. I'm compiling right now with his change 
reverted. Once tested I will report back.


Unfortunately it wasn't that.

Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF



Re: Strange network issues with -current

2023-08-15 Thread Alexander Leidinger

On 2023-08-15 13:48, Alexander Leidinger wrote:

for a while now I have had some strange network issues in some parts of
a particular system.


I just stumbled upon the mail which discusses issues with commit 
e3ba0d6adde3, and when I look into this I see changes related to the use 
of SO_REUSEPORT flags, and all my nginx systems use the reuseport 
directive in their config. I'm compiling right now with his change 
reverted. Once tested I will report back.


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF



Strange network issues with -current

2023-08-15 Thread Alexander Leidinger

Hi,

for a while now I have had some strange network issues in some parts of
a particular system.


A build with src from 2023-07-26 was still working ok. An update to
2023-08-07 broke some parts in a strange way. Trying again with src
from 2023-08-11 didn't fix things.


What I see is... strange and complex.

I have a jail host with about 23 jails. All the jails are sitting on a
bridge, and have IPv6 and IPv4 addresses. One jail is a DNS server for a
domain which contains all the DNS entries for all the jails on the
system (and more). Other jails have mysql (the FS socket for mysql is
nullfs-mounted into other jails, so they connect to mysql via the FS
socket instead of the network), a dovecot IMAP server, a postfix SMTP
server, an nginx-based reverse proxy and 2 different kinds of webmail
solutions (an old php74-based one on its way out in favour of a
php81-based one), a wiki and other things.


With the old working base system I can log in to the old webmail system
and read mails. With the newer non-working base system I can still log
in, but the auth credentials are not stored in the backend session and
as such no mail is listed at all, as this requires subsequent
connections from php to dovecot. This webmail system goes via the
reverse proxy to the webmail jail, which has another nginx configured to
connect to the php-fpm backend.
With the new webmail system I can log in, read mails, and am even
writing this email from it. The first login to it fails. The second
succeeds. It is not behind the reverse proxy (as it is not fully ready
yet for access from the outside (DSL with NAT on the DSL box to the
reverse proxy)), but uses a single nginx with a php-fpm backend (instead
of 2 nginx + php-fpm as in the old webmail).


The wiki behind the reverse proxy sometimes works and sometimes does
not. Sometimes it provides everything, sometimes parts of the site are
missing (e.g. pictures / icons). Sometimes there is simply a blank
page, sometimes it gives an error message from the wiki about an
unforeseen bug...


The error message in the nginx reverse proxy log for all the strange
failure cases is "accept4() failed (53: Software caused connection
abort)". Sometimes I get "upstream timed out". When it times out in the
reverse proxy instead of getting the accept4 errors, I see the same
accept4 error message in the nginx inside the wiki or webmail jail
instead.


I tried to recompile all the components of the wiki and reverse proxy 
and php81 based webmail, to no avail. The issue persists.


Does this ring a bell for anyone? Maybe some network, socket or VM
changes in this timeframe which smell like they could be related and
would be good candidates for a back-out test? Any ideas how to drill
down with debugging to get a simpler test case than the complex setup of
if_bridge, epair, jails, wiki, php, nginx, ...?
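
One possible starting point for such a simpler test case is a small TCP
probe that leaves out nginx, php and the wiki entirely. The sketch below
is only an assumption about what might help, not something confirmed to
reproduce the problem: it opens a listener (with SO_REUSEPORT where the
platform offers it), makes many short-lived connections to it, and counts
how often accept() fails with ECONNABORTED, which is errno 53 ("Software
caused connection abort") on FreeBSD. The function name run_probe, its
parameters and the stats keys are hypothetical.

```python
import errno
import socket
import threading


def run_probe(connections=200, host="127.0.0.1"):
    """Make `connections` short TCP connections to a local echo listener
    and count successes, ECONNABORTED on accept(), and client failures."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Not every platform defines SO_REUSEPORT, so set it only if present.
    if hasattr(socket, "SO_REUSEPORT"):
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    srv.bind((host, 0))          # port 0: let the kernel pick a free port
    srv.listen(128)
    srv.settimeout(0.2)          # lets the accept loop notice the stop flag
    port = srv.getsockname()[1]

    stats = {"accepted": 0, "aborted": 0, "server_io_errors": 0,
             "client_errors": 0}
    stop = threading.Event()

    def serve():
        while not stop.is_set():
            try:
                conn, _ = srv.accept()
            except socket.timeout:
                continue
            except OSError as exc:
                if exc.errno == errno.ECONNABORTED:
                    stats["aborted"] += 1
                    continue
                raise
            try:
                with conn:
                    conn.settimeout(2)
                    conn.sendall(conn.recv(16))   # echo the payload back
            except OSError:
                stats["server_io_errors"] += 1
                continue
            stats["accepted"] += 1

    worker = threading.Thread(target=serve)
    worker.start()
    try:
        for _ in range(connections):
            try:
                with socket.create_connection((host, port), timeout=2) as c:
                    c.sendall(b"ping")
                    if c.recv(16) != b"ping":
                        stats["client_errors"] += 1
            except OSError:
                stats["client_errors"] += 1
    finally:
        stop.set()
        worker.join()
        srv.close()
    return stats


if __name__ == "__main__":
    print(run_probe())
```

Running it first on the host and then inside a jail (with `host` set to
the jail's bridge address and the client started from another jail) might
show whether the aborts need the if_bridge/epair path at all, or already
appear on plain loopback.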


Bye,
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF