JiangHai2011 created an issue (kamailio/kamailio#4437)
### Description
during relaying pressure test of tcp REGISTER, we found that sometimes tcp_main
will report "queue full" when dispatch the fd to child, but the
min_busy(tcp_children[i].busy) value can be extremely high(in the real
evironment, it can reach 176182), while actually there are not so many requests.
### Troubleshooting
in the real evironment, it can reach 176182
send2child(): tcp child 1, socket 25: queue full, 176181 requests queued (total
handled 324699543)
send2child(): tcp child 1, socket 25: queue full, 176182 requests queued (total
handled 324699544)
send2child(): tcp child 1, socket 25: queue full, 176183 requests queued (total
handled 324699545)
send2child(): tcp child 1, socket 25: queue full, 176184 requests queued (total
handled 324699546)
send2child(): tcp child 1, socket 25: queue full, 176185 requests queued (total
handled 324699547)
send2child(): tcp child 1, socket 25: queue full, 176186 requests queued (total
handled 324699548)
send2child(): tcp child 1, socket 25: queue full, 176187 requests queued (total
handled 324699549)
send2child(): tcp child 1, socket 25: queue full, 176188 requests queued (total
handled 324699550)
send2child(): tcp child 1, socket 25: queue full, 176189 requests queued (total
handled 324699551)
send2child(): tcp child 1, socket 25: queue full, 176190 requests queued (total
handled 324699552)
send2child(): tcp child 1, socket 25: queue full, 176191 requests queued (total
handled 324699553)
send2child(): tcp child 1, socket 25: queue full, 176192 requests queued (total
handled 324699554)
send2child(): tcp child 1, socket 25: queue full, 176193 requests queued (total
handled 324699555)
send2child(): tcp child 1, socket 25: queue full, 176194 requests queued (total
handled 324699556)
send2child(): tcp child 1, socket 25: queue full, 176195 requests queued (total
handled 324699557)
but in my local place, due to sipp performance, the queued request is not that
high, but by gdb-attaching tcp_main we can see the abnormal queued request
count, we can see tcp_children[2].busy is still 1267 in idle time and it will
never be back to 0 again
(gdb) p tcp_children[0].busy
$1 = 0
(gdb) p tcp_children[1].busy
$2 = 0
(gdb) p tcp_children[2].busy
$3 = 1267
(gdb) p tcp_children[3].busy
$4 = 0
(gdb)
#### Reproduction
use sipp-1 to send REGISTER and use sipp-2 to receive REGISTER and send 200OK
and disconnect
do this sipp test with 2000cps rate for a long time minutes
request_route is simple:
tcp_connection_lifetime=3605
tcp_max_connections=102400
tcp_rd_buf_size=8192
tcp_no_connect=no
tcp_async=yes
tcp_conn_wq_max=256000
tcp_wq_blk_size=4096
tcp_accept_iplimit=102400
open_files_limit=102400
request_route {
add_local_rport();
$du = "sip:192.168.130.118:1000;transport=tcp";
forward();
usleep("100"); // here to simulate the service normal processing
}
#### Debugging Data
None
#### Log Messages
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: CRITICAL: <core>
[core/tcp_main.c:4416]: send2child(): tcp child 1, socket 15: queue full, 3054
requests queued (total handled 151472)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 ,
queue entry 0, retries 5017, connection 0xffff2ddd8c00, tcp socket 5169,
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 ,
queue entry 1, retries 4968, connection 0xffff2dd13fb0, tcp socket 5193,
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 ,
queue entry 0, retries 5018, connection 0xffff2c8d6620, tcp socket 5369,
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 ,
queue entry 0, retries 5410, connection 0xffff2bddf970, tcp socket 5785,
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 ,
queue entry 1, retries 5402, connection 0xffff2e32ed40, tcp socket 5793,
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 ,
queue entry 2, retries 5394, connection 0xffff2d5f74c0, tcp socket 5801,
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 ,
queue entry 3, retries 5382, connection 0xffff2d253780, tcp socket 5813,
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 ,
queue entry 4, retries 5374, connection 0xffff2c34c490, tcp socket 5821,
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: CRITICAL: <core>
[core/tcp_main.c:4416]: send2child(): tcp child 1, socket 15: queue full, 3055
requests queued (total handled 151473)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: CRITICAL: <core>
[core/tcp_main.c:4416]: send2child(): tcp child 1, socket 15: queue full, 3056
requests queued (total handled 151474)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: CRITICAL: <core>
[core/tcp_main.c:4416]: send2child(): tcp child 0, socket 13: queue full, 3907
requests queued (total handled 246153)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: CRITICAL: <core>
[core/tcp_main.c:4416]: send2child(): tcp child 0, socket 13: queue full, 3908
requests queued (total handled 246154)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 ,
queue entry 0, retries 4706, connection 0xffff2b2ff9d0, tcp socket 5190,
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 ,
queue entry 1, retries 4702, connection 0xffff2b308a90, tcp socket 5194,
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 ,
queue entry 2, retries 4698, connection 0xffff2b311b50, tcp socket 5198,
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 ,
queue entry 3, retries 4694, connection 0xffff2b31ac10, tcp socket 5202,
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: CRITICAL: <core>
[core/tcp_main.c:4416]: send2child(): tcp child 0, socket 13: queue full, 3909
requests queued (total handled 246155)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 ,
queue entry 0, retries 5062, connection 0xffff2c14f990, tcp socket 5704,
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 ,
queue entry 1, retries 5058, connection 0xffff2c158a50, tcp socket 5708,
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core>
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 ,
queue entry 2, retries 4978, connection 0xffff2d0645a0, tcp socket 5788,
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: CRITICAL: <core>
[core/tcp_main.c:4416]: send2child(): tcp child 0, socket 13: queue full, 3910
requests queued (total handled 246156)
#### SIP Traffic
just normal REGISTER as follows:
REGISTER sip:[email protected]:5060;transport=TCP SIP/2.0
Via: SIP/2.0/TCP 192.168.130.118:5060;branch=z9hG4bK-22346-1-0;rport
From: <sip:[email protected]:5060>;tag=1
To: <sip:[email protected]:5060;transport=TCP>
Call-ID: [email protected]
CSeq: 1 REGISTER
Contact: <sip:[email protected]:5060;transport=TCP>;expires=3600
Expires: 3600
User-Agent: sipp register test
Max-Forwards: 70
Subject: Performance Test
Content-Length: 0
### Possible Solutions
in send2child of tcp_main.c, do tcp_children[idx].busy-- before every "return
-1;"
(as we know, tcp_children[idx].busy-- will only be done after child has
finished the task and returned the conn to tcp_main.)
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
version: kamailio 6.0.1 (aarch64/linux)
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE,
USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC,
F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT-NOSMP,
USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES,
TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_SEND_BUFFER_SIZE
262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled on 02:01:11 Oct 18 2025 with gcc 7.3.1
* **Operating System**:
Linux sipserver 4.14.320-243.544.amzn2.aarch64 #1 SMP Tue Aug 1 21:03:02 UTC
2023 aarch64 aarch64 aarch64 GNU/Linux
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/4437
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/issues/[email protected]>_______________________________________________
Kamailio - Development Mailing List -- [email protected]
To unsubscribe send an email to [email protected]
Important: keep the mailing list in the recipients, do not reply only to the
sender!