Re: Transparent proxy issue on FreeBSD

2023-03-07 Thread Rainer Duffner



> On 07.03.2023 at 18:26, Marc West wrote:
> 
> On 2023-03-07 08:09:04, Rainer Duffner wrote:
> >> I admit I only toyed with TP, so I really don’t know what I’m doing
>> there, but:
>> 
>> Have you tried to just use pfSense for this? The developer of the package 
> >> (https://github.com/PiBa-NL) seemed to be active here, but I haven’t seen
>> anything from him since 2020, so I wonder if he has moved on.
>> 
>> My co-workers use OPNSense for this purpose - and on VMWare, they insist 
>> that only em(4) NICs work.
>> 
>> 
> >> If you don’t find his email-address, I can mail it to you.
> 
> Thanks for the suggestion. I haven't tried HAProxy on pfSense but the
> working transparent config and related ipfw fwd rules we have did come
> from PiBa-NL [1].


Ah, ok.

Either ask on the FreeBSD forum or mailing list, or try OPNSense/pfSense;
if the problem persists there, you might get more response on their forums.

pf and ipfw are very specialized parts of the kernel, and very few developers
want to touch them, AFAIK.


> Everything does function perfectly until a brief
> period with production traffic and something happens to cause the tproxy
> bind errors and request failures to start. I'm just not sure what is
> going wrong or how to debug further.
> 
> [1] https://www.mail-archive.com/haproxy@formilux.org/msg09923.html
> 






Re: Transparent proxy issue on FreeBSD

2023-03-07 Thread Marc West
On 2023-03-07 08:09:04, Rainer Duffner wrote:
> I admit I only toyed with TP, so I really don’t know what I’m doing
> there, but:
> 
> Have you tried to just use pfSense for this? The developer of the package 
> (https://github.com/PiBa-NL) seemed to be active here, but I haven’t seen
> anything from him since 2020, so I wonder if he has moved on.
> 
> My co-workers use OPNSense for this purpose - and on VMWare, they insist that 
> only em(4) NICs work.
> 
> 
> If you don’t find his email-address, I can mail it to you.

Thanks for the suggestion. I haven't tried HAProxy on pfSense but the
working transparent config and related ipfw fwd rules we have did come
from PiBa-NL [1]. Everything does function perfectly until a brief
period with production traffic and something happens to cause the tproxy
bind errors and request failures to start. I'm just not sure what is
going wrong or how to debug further.

[1] https://www.mail-archive.com/haproxy@formilux.org/msg09923.html



Re: Transparent proxy issue on FreeBSD

2023-03-07 Thread Rainer Duffner



> On 07.03.2023 at 08:46, Marc West wrote:
> 
> 
> 
> Any other thoughts to look at or data that would be helpful to collect?
> 


I admit I only toyed with TP, so I really don’t know what I’m doing there, but:

Have you tried to just use pfSense for this? The developer of the package 
(https://github.com/PiBa-NL) seemed to be active here, but I haven’t seen 
anything from him since 2020, so I wonder if he has moved on.

My co-workers use OPNSense for this purpose - and on VMWare, they insist that 
only em(4) NICs work.


If you don’t find his email-address, I can mail it to you.





Re: Transparent proxy issue on FreeBSD

2023-03-06 Thread Marc West
Hi Stefan and thanks for your replies. 

(Sorry for the late reply and replying to my own mail, I don't seem to
be receiving messages from the list after confirming the subscription
twice and noticed your replies when checking the archives.)

> If I understand you correctly, you have forwarding enabled to those
> ports in pf.
> 
> I had a similar issue on pfsense. The solution was to disable the
> forwarding to that port.

PF isn't doing anything special with the public IPs/ports that HAProxy
binds to, only allowing that traffic. PF does do outbound NAT for
internal servers to reach the Internet like so:

table  const { 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 }
nat pass on $pub_vlan from 10.10.15.0/24 to ! -> 1.2.3.4

Would a firewall be able to cause the HAProxy tproxy bind errors for
some (but not all) transparent connections? I believe firewalls could
block connections but shouldn't prevent the actual haproxy bind from
succeeding (?). I read through the code and see where the tproxy bind
error is being hit, but I am unsure what is causing it to succeed
sometimes and fail at other times.

It doesn't seem like it would be an issue allocating or exhausting ports
since the original client IP+port is being reused with "usesrc client"
and there shouldn't be conflicts there. On FreeBSD there are no similar
sysctls to Linux's net.ipv4.ip_nonlocal_bind, and transparent does work
some of the time with my existing config until it starts failing.
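
As a rough way to test that assumption against real data, the sketch
below groups outgoing connections by their local (source) address and
port and reports any pair that is used more than once. It assumes the
connections have already been extracted, for example from netstat
output, into 4-tuples; the function name and the sample addresses are
hypothetical, and whether such reuse actually triggers the bind error on
FreeBSD is not established here.

```python
from collections import defaultdict


def find_reused_sources(conns):
    """Group connections by local (source) ip:port.

    conns: iterable of (src_ip, src_port, dst_ip, dst_port) tuples.
    Returns a dict mapping each (src_ip, src_port) that appears more
    than once to the list of destinations it is connected to.
    """
    by_source = defaultdict(list)
    for src_ip, src_port, dst_ip, dst_port in conns:
        by_source[(src_ip, src_port)].append((dst_ip, dst_port))
    # Keep only the sources that were used for more than one connection.
    return {src: dsts for src, dsts in by_source.items() if len(dsts) > 1}


# Hypothetical sample: the first two entries share a client ip:port.
SAMPLE_CONNS = [
    ("198.51.100.7", 50000, "10.10.15.20", 443),
    ("198.51.100.7", 50000, "10.10.15.21", 443),
    ("198.51.100.8", 50001, "10.10.15.20", 443),
]
```

Calling find_reused_sources(SAMPLE_CONNS) returns only the entry for
198.51.100.7 port 50000, since that is the one source used twice. An
empty result on real data would support the "no conflicts" assumption.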

> One more thing:
> 
> source ipv4@ usesrc clientip

I have separate backends/frontends for IPv4 and IPv6 with "source
0.0.0.0 usesrc client" in defaults (also tried "clientip"), which in my
understanding should do the right thing for both v4 and v6 respectively.
Would there be something different about using ipv4@ here?

Any other thoughts to look at or data that would be helpful to collect?



Re: Transparent proxy issue on FreeBSD

2023-02-23 Thread Stefan Fuhrmann

Hello Marc,


One more thing:

source ipv4@ usesrc clientip

hope that helps.
Stefan


Re: Transparent proxy issue on FreeBSD

2023-02-23 Thread Stefan Fuhrmann

Hello Marc,

If I understand you correctly, you have forwarding enabled to those
ports in pf.


I had a similar issue on pfsense. The solution was to disable the 
forwarding to that port.


Maybe it helps you...

Greetings,

Stefan



Transparent proxy issue on FreeBSD

2023-02-17 Thread Marc West
Hi,

After my other thread about performance issues on OpenBSD we decided to
switch OSes on our HAProxy boxes to FreeBSD 13.1. In the test
environment everything worked perfectly with transparent proxying but
when cutting production over to FreeBSD I ran into an issue and had to 
revert for now.

With "source 0.0.0.0 usesrc client" everything seemed OK for about a
minute but then many client connections across all backends started
getting 503 errors (or connection failures for TCP backends). The
haproxy logs have a lot of:

- Cannot bind to tproxy source address before connect() for backend my_backend
- Connect() failed for backend my_backend: local address already in use.
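
To see which backends are affected and how often, the two messages above
can be tallied from the logs. The following is a minimal sketch, assuming
the log lines contain the messages exactly as quoted; the function name
and the error labels are made up for illustration.

```python
import re
from collections import Counter

# The two message shapes quoted above; the backend name is captured.
PATTERNS = {
    "tproxy_bind": re.compile(
        r"Cannot bind to tproxy source address before connect\(\) "
        r"for backend (\S+)"
    ),
    "addr_in_use": re.compile(
        r"Connect\(\) failed for backend (\S+): local address already in use"
    ),
}


def count_errors(lines):
    """Return a Counter keyed by (error_label, backend_name)."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(label, match.group(1))] += 1
    return counts
```

For example, count_errors(open("haproxy.log")) would give per-backend
totals for each message (the file name here is only a placeholder).
Comparing the totals between the test and production logs may show
whether the bind failures are limited to particular backends.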

This was during a late night maintenance with less than 100 connections,
netstat had less than 500 total connections, and I didn't see any limits 
that were close to being hit (these are all auto-detected except 
nbthread=36 to leave some breathing room on this 40 core HW):

pid = 76927 (process #1, nbproc = 1, nbthread = 36)
uptime = 0d 0h14m15s
system limits: memmax = unlimited; ulimit-n = 940059
maxsock = 940059; maxconn = 469926; maxpipes = 0
current conns = 40; current pipes = 0/0; conn rate = 8/sec; bit rate = 1.086 Mbps
Running tasks: 0/248; idle = 100 %
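
The figures above can also be checked mechanically. This sketch, which
assumes the "key = value; key = value" layout shown in the output above,
parses the text into a dictionary and computes how much of maxconn is in
use; the function names are hypothetical.

```python
SAMPLE = """\
system limits: memmax = unlimited; ulimit-n = 940059
maxsock = 940059; maxconn = 469926; maxpipes = 0
current conns = 40; current pipes = 0/0; conn rate = 8/sec; bit rate = 1.086 Mbps
"""


def parse_info(text):
    """Parse 'key = value' pairs separated by ';' into a dict of strings."""
    info = {}
    for line in text.splitlines():
        for field in line.split(";"):
            if "=" not in field:
                continue
            key, _, value = field.partition("=")
            # Drop a leading label such as "system limits:" from the key.
            key = key.split(":")[-1].strip()
            info[key] = value.strip()
    return info


def conn_utilization(info):
    """Fraction of maxconn currently in use."""
    return int(info["current conns"]) / int(info["maxconn"])
```

With the numbers from the post, conn_utilization(parse_info(SAMPLE))
comes out well under 0.01 %, which matches the observation that no
connection limit was close to being hit.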

Health checks were always OK and the 503s/failures were intermittent
with a majority but not all requests failing. With "clientip" instead of
"client" it seemed to be worse with almost all requests failing.

Looking back at the test env there were some occurrences of the
"Connect() failed ... address already in use" in logs but there were no
noticeable request failures. There were zero "Cannot bind to tproxy
source address before connect()" in test even with much higher stress 
testing load. Configs between test and production are nearly identical
except there are more backends and VLANs in production and prod traffic 
comes from more IPs than the 2-3 IPs that hit test.

On OpenBSD transparent works out of the box with no PF divert rules
needed (only standard pass rules). Unfortunately with FreeBSD's PF
(which is behind OpenBSD on features but ahead on performance) I have
not found any combination of rules that allow transparent to work at
all (and divert-reply isn't implemented), so here ipfw is diverting and 
passing all traffic stateless like so:

$cmd 0001 fwd localhost tcp from 10.10.15.0/24 80,443 to any in recv vlan123
$cmd 65534 allow all from any to any
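
To make explicit what rule 0001 is meant to catch, here is a toy model of
its match condition in Python. It is only a sketch of the rule as written
above (TCP, source in 10.10.15.0/24, source port 80 or 443, inbound and
received on vlan123), not of ipfw's actual implementation; the function
and parameter names are invented for illustration.

```python
import ipaddress

SERVER_NET = ipaddress.ip_network("10.10.15.0/24")  # "from 10.10.15.0/24"
SERVER_PORTS = {80, 443}                            # source ports "80,443"
RECV_IFACE = "vlan123"                              # "recv vlan123"


def matches_fwd_rule(proto, src_ip, src_port, recv_iface, inbound=True):
    """Return True if a packet would match rule 0001 as written above.

    The destination is not checked because the rule says "to any".
    """
    return (
        proto == "tcp"
        and inbound
        and ipaddress.ip_address(src_ip) in SERVER_NET
        and src_port in SERVER_PORTS
        and recv_iface == RECV_IFACE
    )
```

In this model, a reply from a backend server (say 10.10.15.20, a
hypothetical address) with source port 443 arriving on vlan123 matches
and would be forwarded to localhost, while traffic from other ports,
networks, or interfaces falls through to rule 65534.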

Then PF handles the main stateful firewall rules and outbound NAT for
internal servers. All PF rules have logging and there was nothing being
blocked by PF during the issue. 

It isn't ideal to run both ipfw and PF like this but it would be a
major undertaking to rework the tooling we have built around PF to a
different firewall, and since transparent is required it seems ipfw is
needed to handle the divert part on FreeBSD.

Am I overlooking something in my setup or might this be an OS or haproxy
issue to debug further? Thanks in advance for any thoughts!

$ haproxy -vv
HAProxy version 2.6.9-3a3700a 2023/02/14 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2027.
Known bugs: http://www.haproxy.org/bugs/bugs-2.6.9.html
Running on: FreeBSD 13.1-RELEASE-p6 FreeBSD 13.1-RELEASE-p6 GENERIC amd64
Build options :
  TARGET  = freebsd
  CPU = generic
  CC  = cc
  CFLAGS  = -O2 -pipe -fstack-protector-strong -fno-strict-aliasing -Wall 
-Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits 
-Wshift-negative-value -Wnull-dereference -fwrapv -Wno-unknown-warning-option 
-Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare 
-Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers 
-Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment 
-DFREEBSD_PORTS
  OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_GETADDRINFO=1 USE_OPENSSL=1 
USE_LUA=1 USE_ACCEPT4=1 USE_ZLIB=1 USE_CPU_AFFINITY=1 USE_PROMEX=1
  DEBUG   = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS

Feature list : -51DEGREES +ACCEPT4 -BACKTRACE +CLOSEFROM +CPU_AFFINITY -CRYPT_H 
-DEVICEATLAS -DL -ENGINE -EPOLL -EVPORTS +GETADDRINFO +KQUEUE +LIBCRYPT 
-LINUX_SPLICE -LINUX_TPROXY +LUA -MEMORY_PROFILING -NETFILTER -NS 
-OBSOLETE_LINKER +OPENSSL -OT -PCRE +PCRE2 +PCRE2_JIT -PCRE_JIT +POLL -PRCTL 
+PROCCTL +PROMEX -QUIC -RT -SLZ -STATIC_PCRE -STATIC_PCRE2 -SYSTEMD -TFO 
+THREAD -THREAD_DUMP +TPROXY -WURFL +ZLIB

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=40).
Built with OpenSSL version : OpenSSL 1.1.1o-freebsd  3 May 2022
Running on OpenSSL version : OpenSSL 1.1.1o-freebsd  3 May 2022
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.6
Built with the Prometheus exporter as a service
Support for malloc_trim() is enabled.
Built with zlib version : 1.2.12
Running on zlib