JiangHai2011 created an issue (kamailio/kamailio#4437)

### Description
during relaying pressure test of tcp REGISTER, we found that sometimes tcp_main 
will report "queue full" when dispatch the fd to child, but the 
min_busy(tcp_children[i].busy) value can be extremely high(in the real 
evironment, it can reach 176182), while actually there are not so many requests.

### Troubleshooting
in the real evironment, it can reach 176182
send2child(): tcp child 1, socket 25: queue full, 176181 requests queued (total 
handled 324699543)
send2child(): tcp child 1, socket 25: queue full, 176182 requests queued (total 
handled 324699544)
send2child(): tcp child 1, socket 25: queue full, 176183 requests queued (total 
handled 324699545)
send2child(): tcp child 1, socket 25: queue full, 176184 requests queued (total 
handled 324699546)
send2child(): tcp child 1, socket 25: queue full, 176185 requests queued (total 
handled 324699547)
send2child(): tcp child 1, socket 25: queue full, 176186 requests queued (total 
handled 324699548)
send2child(): tcp child 1, socket 25: queue full, 176187 requests queued (total 
handled 324699549)
send2child(): tcp child 1, socket 25: queue full, 176188 requests queued (total 
handled 324699550)
send2child(): tcp child 1, socket 25: queue full, 176189 requests queued (total 
handled 324699551)
send2child(): tcp child 1, socket 25: queue full, 176190 requests queued (total 
handled 324699552)
send2child(): tcp child 1, socket 25: queue full, 176191 requests queued (total 
handled 324699553)
send2child(): tcp child 1, socket 25: queue full, 176192 requests queued (total 
handled 324699554)
send2child(): tcp child 1, socket 25: queue full, 176193 requests queued (total 
handled 324699555)
send2child(): tcp child 1, socket 25: queue full, 176194 requests queued (total 
handled 324699556)
send2child(): tcp child 1, socket 25: queue full, 176195 requests queued (total 
handled 324699557)

but in my local place, due to sipp performance, the queued request is not that 
high, but by gdb-attaching tcp_main we can see the abnormal queued request 
count, we can see tcp_children[2].busy is still 1267 in idle time and it will 
never be back to 0 again
(gdb) p tcp_children[0].busy
$1 = 0
(gdb) p tcp_children[1].busy
$2 = 0
(gdb) p tcp_children[2].busy
$3 = 1267
(gdb) p tcp_children[3].busy
$4 = 0
(gdb) 

#### Reproduction

use sipp-1 to send REGISTER  and use sipp-2 to receive REGISTER and send 200OK 
and disconnect 
do this sipp test with 2000cps rate for a long time minutes

request_route is simple:

tcp_connection_lifetime=3605
tcp_max_connections=102400
tcp_rd_buf_size=8192
tcp_no_connect=no
tcp_async=yes
tcp_conn_wq_max=256000
tcp_wq_blk_size=4096
tcp_accept_iplimit=102400
open_files_limit=102400

request_route {
        add_local_rport();
        $du = "sip:192.168.130.118:1000;transport=tcp";
        forward();
        usleep("100");  // here to simulate the service normal processing
}

#### Debugging Data
None

#### Log Messages
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: CRITICAL: <core> 
[core/tcp_main.c:4416]: send2child(): tcp child 1, socket 15: queue full, 3054 
requests queued (total handled 151472)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 , 
queue entry 0, retries 5017, connection 0xffff2ddd8c00, tcp socket 5169, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 , 
queue entry 1, retries 4968, connection 0xffff2dd13fb0, tcp socket 5193, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 , 
queue entry 0, retries 5018, connection 0xffff2c8d6620, tcp socket 5369, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 , 
queue entry 0, retries 5410, connection 0xffff2bddf970, tcp socket 5785, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 , 
queue entry 1, retries 5402, connection 0xffff2e32ed40, tcp socket 5793, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 , 
queue entry 2, retries 5394, connection 0xffff2d5f74c0, tcp socket 5801, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 , 
queue entry 3, retries 5382, connection 0xffff2d253780, tcp socket 5813, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 15 , 
queue entry 4, retries 5374, connection 0xffff2c34c490, tcp socket 5821, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: CRITICAL: <core> 
[core/tcp_main.c:4416]: send2child(): tcp child 1, socket 15: queue full, 3055 
requests queued (total handled 151473)
Oct 18 02:20:07 gds-sip-150 sipproxy[3259]: CRITICAL: <core> 
[core/tcp_main.c:4416]: send2child(): tcp child 1, socket 15: queue full, 3056 
requests queued (total handled 151474)

Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: CRITICAL: <core> 
[core/tcp_main.c:4416]: send2child(): tcp child 0, socket 13: queue full, 3907 
requests queued (total handled 246153)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: CRITICAL: <core> 
[core/tcp_main.c:4416]: send2child(): tcp child 0, socket 13: queue full, 3908 
requests queued (total handled 246154)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 , 
queue entry 0, retries 4706, connection 0xffff2b2ff9d0, tcp socket 5190, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 , 
queue entry 1, retries 4702, connection 0xffff2b308a90, tcp socket 5194, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 , 
queue entry 2, retries 4698, connection 0xffff2b311b50, tcp socket 5198, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 , 
queue entry 3, retries 4694, connection 0xffff2b31ac10, tcp socket 5202, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: CRITICAL: <core> 
[core/tcp_main.c:4416]: send2child(): tcp child 0, socket 13: queue full, 3909 
requests queued (total handled 246155)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 , 
queue entry 0, retries 5062, connection 0xffff2c14f990, tcp socket 5704, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 , 
queue entry 1, retries 5058, connection 0xffff2c158a50, tcp socket 5708, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: ERROR: <core> 
[core/tcp_main.c:3643]: send_fd_queue_run(): send_fd failed on socket 13 , 
queue entry 2, retries 4978, connection 0xffff2d0645a0, tcp socket 5788, 
errno=11 (Resource temporarily unavailable)
Oct 18 02:39:35 gds-sip-150 sipproxy[3259]: CRITICAL: <core> 
[core/tcp_main.c:4416]: send2child(): tcp child 0, socket 13: queue full, 3910 
requests queued (total handled 246156)

#### SIP Traffic

just normal REGISTER as follows:

REGISTER sip:[email protected]:5060;transport=TCP SIP/2.0
Via: SIP/2.0/TCP 192.168.130.118:5060;branch=z9hG4bK-22346-1-0;rport
From: <sip:[email protected]:5060>;tag=1
To: <sip:[email protected]:5060;transport=TCP>
Call-ID: [email protected]
CSeq: 1 REGISTER
Contact: <sip:[email protected]:5060;transport=TCP>;expires=3600
Expires: 3600
User-Agent: sipp register test
Max-Forwards: 70
Subject: Performance Test
Content-Length: 0


### Possible Solutions
in send2child of tcp_main.c,  do tcp_children[idx].busy-- before every "return 
-1;"
(as we know, tcp_children[idx].busy-- will only be done after child has 
finished the task and returned the conn to tcp_main.)

### Additional Information

  * **Kamailio Version** - output of `kamailio -v`
version: kamailio 6.0.1 (aarch64/linux) 
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, 
USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC, 
F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT-NOSMP, 
USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, 
TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_SEND_BUFFER_SIZE 
262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown 
compiled on 02:01:11 Oct 18 2025 with gcc 7.3.1

* **Operating System**:
Linux sipserver 4.14.320-243.544.amzn2.aarch64 #1 SMP Tue Aug 1 21:03:02 UTC 
2023 aarch64 aarch64 aarch64 GNU/Linux

-- 
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/4437
You are receiving this because you are subscribed to this thread.

Message ID: <kamailio/kamailio/issues/[email protected]>
_______________________________________________
Kamailio - Development Mailing List -- [email protected]
To unsubscribe send an email to [email protected]
Important: keep the mailing list in the recipients, do not reply only to the 
sender!

Reply via email to