https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263062
Bug ID: 263062
Summary: tcp_inpcb leaking in VM environment
Product: Base System
Version: 13.1-STABLE
Hardware: Any
OS: Any
Status: New
Severity: Affects Only Me
Priority: ---
Component: kern
Assignee: [email protected]
Reporter: [email protected]
I'm running 13.0-RELEASE or 13.1-RC1 in a virtual machine attached to the
outside world via the vtnet(4) driver. The VM is presumably run with the Q35
chipset, definitely under KVM/QEMU in a Hetzner cloud datacenter.
The VM is used as a web server, proxying a wss/gRPC application via nginx with
relatively long-lived connections. It has 16 GB of memory and runs the GENERIC
kernel. nginx services around 30-40K established connections.
After 2-3 hours of uptime the VM starts to show several signs of kernel
structure leakage:
I see multiple errors in dmesg:
sonewconn: pcb 0xfffff8001ac8bd90: pru_attach() failed
sonewconn: pcb 0xfffff8000ab625d0: pru_attach() failed
sonewconn: pcb 0xfffff8000af999b0: pru_attach() failed
sonewconn: pcb 0xfffff8000ab621f0: pru_attach() failed
sonewconn: pcb 0xfffff8000ab62000: pru_attach() failed
sonewconn: pcb 0xfffff8000ab625d0: pru_attach() failed
sonewconn: pcb 0xfffff8000af999b0: pru_attach() failed
sonewconn: pcb 0xfffff8000af999b0: pru_attach() failed
sonewconn: pcb 0xfffff8000af993e0: pru_attach() failed
sonewconn: pcb 0xfffff8000af999b0: pru_attach() failed
sonewconn: pcb 0xfffff8000ab627c0: pru_attach() failed
the console is spammed with errors:
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
[zone: tcp_inpcb] kern.ipc.maxsockets limit reached
and the network stack is basically unusable:
# telnet 127.0.0.1 4080
Trying 127.0.0.1...
telnet: socket: No buffer space available
This is definitely caused by leaking tcp_inpcb entries: their count keeps
growing over time and never lastingly decreases.
(These samples were taken at 10-second intervals.)
ITEM         SIZE   LIMIT     USED   FREE      REQ  FAIL  SLEEP  XDOMAIN
tcp_inpcb:    496, 4189440,  802462, 1194, 1050344,    0,     0,       0
tcp_inpcb:    496, 4189440,  803971, 1469, 1051853,    0,     0,       0
tcp_inpcb:    496, 4189440,  805375, 1081, 1053257,    0,     0,       0
tcp_inpcb:    496, 4189440,  806936, 1296, 1054818,    0,     0,       0
tcp_inpcb:    496, 4189440,  808609, 1143, 1056491,    0,     0,       0
tcp_inpcb:    496, 4189440,  810052, 1228, 1057934,    0,     0,       0
tcp_inpcb:    496, 4189440,  811487,  809, 1059369,    0,     0,       0
tcp_inpcb:    496, 4189440,  813068, 1260, 1060950,    0,     0,       0
tcp_inpcb:    496, 4189440,  814532, 1068, 1062414,    0,     0,       0
tcp_inpcb:    496, 4189440,  816036, 1084, 1063918,    0,     0,       0
tcp_inpcb:    496, 4189440,  817511, 1641, 1065393,    0,     0,       0
tcp_inpcb:    496, 4189440,  818988,  924, 1066870,    0,     0,       0
tcp_inpcb:    496, 4189440,  820412, 1532, 1068294,    0,     0,       0
tcp_inpcb:    496, 4189440,  821880,  832, 1069762,    0,     0,       0
tcp_inpcb:    496, 4189440,  823399, 1345, 1071281,    0,     0,       0
tcp_inpcb:    496, 4189440,  824865,  895, 1072747,    0,     0,       0
tcp_inpcb:    496, 4189440,  826309, 1227, 1074191,    0,     0,       0
tcp_inpcb:    496, 4189440,  827594,  958, 1075476,    0,     0,       0
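For reproduction, samples like the ones above can be collected by parsing the
USED column out of vmstat -z. The snippet below is only a sketch: it assumes
the 13.x column order shown above (SIZE, LIMIT, USED, FREE, REQ, FAIL, SLEEP,
XDOMAIN) and demonstrates the parsing on one captured line:

```shell
# Extract the USED column from a captured vmstat -z tcp_inpcb line.
# gsub() strips the commas so the value comes out as a plain number.
line='tcp_inpcb:   496, 4189440,  802462,  1194, 1050344,   0,   0,   0'
printf '%s\n' "$line" | awk '{ gsub(",", ""); print $4 }'   # prints 802462

# On the affected host the leak can be watched live with (untested sketch):
#   while :; do
#       vmstat -z | awk '/^tcp_inpcb/ { gsub(",", ""); print $4 }'
#       sleep 10
#   done
```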
At the same time, kern.ipc.numopensockets is relatively low:
kern.ipc.numopensockets: 34689
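The gap between those two numbers gives a rough size of the suspected leak. A
back-of-the-envelope check, using the figures reported above (a rough estimate
only, since TIME_WAIT entries legitimately hold an inpcb without an open
socket):

```shell
# tcp_inpcb USED minus kern.ipc.numopensockets: an approximate count of
# inpcbs not backed by an open socket (values taken from this report).
inpcb_used=802462
open_sockets=34689
echo $((inpcb_used - open_sockets))    # prints 767773
```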
I also have several 12.x and 13.x machines running the same stack on bare
metal, but with more RAM: 96-128 GB. This never happens on those. One could
argue that the difference is simply the amount of RAM, and that may seem
reasonable. However:
- the bare-metal servers handle far more connections; for instance, I have a
bare-metal 13.0 server serving around 300K connections:
TCP connection count by state:
4 connections in CLOSED state
65 connections in LISTEN state
31 connections in SYN_SENT state
446 connections in SYN_RCVD state
292378 connections in ESTABLISHED state
5 connections in CLOSE_WAIT state
27467 connections in FIN_WAIT_1 state
266 connections in CLOSING state
6714 connections in LAST_ACK state
5114 connections in FIN_WAIT_2 state
40976 connections in TIME_WAIT state
the number of open sockets is also far larger:
kern.ipc.numopensockets: 332907
but the tcp_inpcb usage is far lower:
ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP
tcp_inpcb: 488, 4189440, 374628, 181772, 27330286910, 0, 0
So I assume this leak is specific to FreeBSD running in a virtual environment,
and is probably caused by the virtio drivers.
As a workaround I have tried to raise some of the related sysctl OIDs:
kern.maxfiles=4189440
kern.ipc.maxsockets=4189440
net.inet.tcp.tcbhashsize=1048576
but this only delayed the tcp_inpcb exhaustion.
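For completeness, this is how those values might be applied persistently (a
sketch; to my understanding net.inet.tcp.tcbhashsize is a boot-time tunable
that must go into /boot/loader.conf, while kern.maxfiles and
kern.ipc.maxsockets can also be raised at runtime):

```
# /boot/loader.conf -- read-only tunable, takes effect after reboot
net.inet.tcp.tcbhashsize=1048576

# /etc/sysctl.conf -- these can also be set live with sysctl(8)
kern.maxfiles=4189440
kern.ipc.maxsockets=4189440
```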