Greetings everyone!
It regularly happens that the service fails completely and stops serving
calls. For example, yesterday there was such an incident. I looked at the logs,
and at the start of the incident I noticed these strange entries:
kamailio[283008]: WARNING: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa}
<core> [core/tcp_main.c:1297]: find_listening_sock_info(): binding to source
address 109.68.232.40:15061 failed: Address already in use [98]
kamailio[283008]: WARNING: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa}
<core> [core/tcp_main.c:1441]: tcp_do_connect(): 10.241.60.46:15061: could not
find corresponding listening socket
kamailio[283008]: ERROR: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} <core>
[core/tcp_main.c:3078]: tcpconn_1st_send(): connect 10.241.60.46:15061 failed
(RST) Connection refused
kamailio[283008]: ERROR: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} <core>
[core/tcp_main.c:3087]: tcpconn_1st_send(): 10.241.60.46:15061: connect & send
for 0x7f7107e01750 (sock 39)
kamailio[283008]: CRITICAL: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa}
<core> [core/forward.h:290]: msg_send_buffer(): tcp_send failed
kamailio[283008]: ERROR: {1 14714 ACK clMGFB7OWsiT4fQp9TPvV-RlvGqv2zPa} sl
[sl_funcs.c:415]: sl_reply_error(): stateless error reply used: Unfortunately
error on sending to next hop occurred (477/SL)
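In case it helps, during the next incident I can capture Kamailio's and the
kernel's view of the TCP state. A rough sketch (it assumes the ctl/RPC
interface is enabled so that kamcmd works; 15061 is the local port that
appears in the "Address already in use" warning above):

# Kamailio's own counters for its TCP/TLS connection pool
kamcmd core.tcp_info

# kernel view: sockets currently using the local port 15061
ss -tan | grep ':15061' | wc -l

# ephemeral source-port range available for outbound connects
sysctl net.ipv4.ip_local_port_range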
After several such entries, the following begins:
/usr/local/sbin/kamailio[283069]: ERROR: <core> [core/tcp_main.c:4787]:
tcpconn_main_timeout(): connect 31.173.85.36:10588 failed (timeout)
/usr/local/sbin/kamailio[283069]: ERROR: <core> [core/tcp_main.c:4787]:
tcpconn_main_timeout(): connect 95.24.163.132:48075 failed (timeout)
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668011 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668304 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668309 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668316 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668192 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668228 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668236 (state: 0) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668427 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668428 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668429 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668430 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668431 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668432 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668433 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668434 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668435 (state: 3) - closing
/usr/local/sbin/kamailio[282828]: CRITICAL: <core> [core/tcp_main.c:5526]:
tcp_timer_check_connections(): message processing timeout on connection id:
22668436 (state: 3) - closing
Entries like these go on for a very long time (about 5 minutes), and they begin like this:
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 20, socket 273: queue full, 286 requests queued (total
handled 2421746)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 29, socket 291: queue full, 286 requests queued (total
handled 2419460)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 4, socket 241: queue full, 287 requests queued (total
handled 2451838)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 5, socket 243: queue full, 287 requests queued (total
handled 2445182)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 18, socket 269: queue full, 287 requests queued (total
handled 2422113)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 20, socket 273: queue full, 287 requests queued (total
handled 2421747)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 27, socket 287: queue full, 287 requests queued (total
handled 2419986)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 29, socket 291: queue full, 287 requests queued (total
handled 2419461)
/usr/local/sbin/kamailio[283069]: CRITICAL: <core> [core/tcp_main.c:4361]:
send2child(): tcp child 30, socket 293: queue full, 287 requests queued (total
handled 2419979)
In addition, the logs are cluttered with entries like the following, which
makes them hard to read:
1) kamailio[282798]: CRITICAL: {2 12704 INVITE
4cb232e6-9b5a-4a5b-aa0e-1ed73539039f} tm [../../core/forward.h:290]:
msg_send_buffer(): tcp_send failed
2) /usr/local/sbin/kamailio[282970]: ERROR: sl [sl.c:285]: send_reply(): failed
to reply stateful (tm)
3) kamailio[283053]: CRITICAL: <core> [core/tcp_read.c:267]: tcp_read_data():
error reading: Connection reset by peer (104) ([217.66.156.172]:11604 ->
kamailio[283053]: CRITICAL: <core> [core/tcp_read.c:270]: tcp_read_data():
-> [109.68.232.40]:15061)
kamailio[283053]: CRITICAL: <core> [core/tcp_read.c:1523]: tcp_read_req():
ERROR: tcp_read_req: error reading - c: 0x7f71215bb610 r: 0x7f71215bb738 (-1)
kamailio[283007]: CRITICAL: <core> [core/tcp_read.c:267]: tcp_read_data():
error reading: Connection timed out (110) ([188.170.204.16]:52148 ->
kamailio[283007]: CRITICAL: <core> [core/tcp_read.c:270]: tcp_read_data():
-> [109.68.232.40]:15061)
kamailio[283007]: CRITICAL: <core> [core/tcp_read.c:1523]: tcp_read_req():
ERROR: tcp_read_req: error reading - c: 0x7f711b638fa0 r: 0x7f711b6390c8 (-1)
kamailio[283027]: ERROR: {1 9746 REGISTER
8e6af58d-d4ca-4dfe-a8f6-d3c84685c845} tsilo [ts_append.c:121]: ts_append_to():
transaction [23428:1443020265] not found
4) kamailio[282820]: CRITICAL: {1 2 UPDATE
6da65a87-3224-46a1-961f-8f9ee50348cc} tm [../../core/forward.h:290]:
msg_send_buffer(): tcp_send failed
kamailio[282820]: WARNING: {1 2 UPDATE
6da65a87-3224-46a1-961f-8f9ee50348cc} tm [t_fwd.c:1581]: t_send_branch():
sending request on branch 0 failed
kamailio[282820]: ERROR: {1 2 UPDATE 6da65a87-3224-46a1-961f-8f9ee50348cc}
sl [sl_funcs.c:415]: sl_reply_error(): stateless error reply used:
Unfortunately error on sending to next hop occurred (477/SL)
Once the queue is full, the system stops servicing new calls and freezes.
kamailio -v
version: kamailio 5.7.2 (x86_64/linux) 95629c
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE,
USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC,
F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT,
USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES,
TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024,
BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 95629c
OS: Debian 11, 64 cores, 125GB of DDR4 RAM; DB: MariaDB 10.5.15;
udp_children: 30; tcp_children: 34; TLS: YES; async_workers: 64.
Incoming calls are delivered via push notifications (Federico Cabiddu's method:
https://www.voztovoice.org/sites/default/files/KamilioWorld2015%20-Federico…).
Network bridging (for SIP and RTPEngine).
ims_charging for billing (integration with our billing system using the Diameter
protocol).
As soon as the service starts responding with 503, it does not recover on its
own; it has to be restarted.
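Before the forced restart, I can also grab a quick snapshot to compare against
a healthy state, along these lines (a sketch; it assumes kamcmd and ss are
available):

# snapshot of the stuck state before restarting
kamcmd core.tcp_info > /tmp/tcp_info.txt    # Kamailio TCP connection counters
kamcmd core.ps       > /tmp/kam_procs.txt   # list of Kamailio processes
ss -s                > /tmp/ss_summary.txt  # kernel socket summary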
The server is powerful enough for the number of users of my service (that is,
resources are not the problem): 64 cores, 125GB of DDR4 RAM, 2TB of disk (of
which 1TB is SSD for the database), and a 10 Gigabit link. This server runs
only Kamailio; the RTPEngines are on separate servers.
Please help me figure out what is causing this service failure!