Have you done a test with tools such as sipp, or was this happening after a while, with usual phones registering?
Can you list the tcp connections and see if they are listed? kamctl tcp core.tcp_list Cheers, Daniel On 25.03.19 08:03, Kristijan Vrban wrote: >> The solution here is to use set_reply_no_connect() > implemented it. Now the issue has shifted to: > > ERROR: <core> [core/tcp_main.c:3959]: handle_new_connect(): maximum > number of connections exceeded: 2048/2048 > > But not a single TCP connection is active between Kamailio and any > device. Seems this counter for maximum number of connections > now has an issue? > > Kristijan > > Am Mi., 20. März 2019 um 15:07 Uhr schrieb Daniel-Constantin Mierla > <[email protected]>: >> Hello, >> >> based on the trap output I think I could figure out what happened there. >> >> You have tcp_children to very low value (1 or so), the problem is not >> actually that one, but the fact that the connection to upstream (the >> device/app sending the request) was closed after receiving the request >> and routing of the reply gets stuck in the way of: >> >> - a reply is received and has to be forwarded >> - connection was lost, so Kamailio tries to establish a new one, but >> takes time till fails because the upstream is behind nat or so based on >> the via header: >> >> Via: SIP/2.0/TLS >> 10.1.0.4:10002;rport=55229;received=13.94.188.218;branch=z9hG4bK-3336-7f2927bfd703ae907348edff3611bfc9 >> >> - the reply is retransmitted and gets to another worker, which tries >> to forward it again, but discovers a connection structure for that >> destination exists (created by previous reply worker) and now waits for >> the connection to be released (or better said, for the mutex on writing >> buffer to be unlocked) >> >> - as the second reply waits, there can be other retransmissions of the >> reply ending up in other workers stuck on waiting for the mutex of the >> connection write buffer >> >> The solution here is to use set_reply_no_connect() -- you can put it >> first in request_route block. I think this would be a good addition to >> the default configuration file as well, IMO, the sip server should not >> connect for sending replies and should do it also for requests that go >> behind nat. >> >> Cheers, >> Daniel >> >> On 19.03.19 10:53, Kristijan Vrban wrote: >>> So i had again the situation. But this time, incoming udp was >>> affected. Kamailio was sending out OPTIONS (via dispatcher module) to >>> a group of asterisk machines >>> but the 200 OK reply to the OPTIONS where not processed, so the >>> dispatcher module set all asterisk to inactive, even though they >>> replied 200 OK >>> >>> Attached the output of kamctl trap during the situation. Hope there is >>> any useful in it. Because after "kamctl trap" it was working again >>> without kamailio restart. >>> >>> Best >>> Kristijan >>> >>> Am Mo., 18. März 2019 um 12:27 Uhr schrieb Daniel-Constantin Mierla >>> <[email protected]>: >>>> Hello, >>>> >>>> setting tcp_children=1 is not a god option for scallability, practically >>>> you set kamailio to process a single tcp message at one time, on high >>>> traffic, that won't work well. >>>> >>>> Maybe try to set tcp_children to 2 or 4, that should make an eventual >>>> race appear faster. >>>> >>>> Regarding the pid, if it is an outgoing connection, then it can be >>>> created by any worker process, including a UDP worker, if that was the >>>> one receiving the sip message over udp and sends it out via tcp. >>>> >>>> Cheers, >>>> Daniel >>>> >>>> On 18.03.19 10:09, Kristijan Vrban wrote: >>>>> Hi Daniel, >>>>> >>>>> for testing, i now had set: "tcp_children=1" and so far this issue did >>>>> not occur >>>>> ever since. So now value to provide for "kamctl trap" yet. >>>>> >>>>> "kamctl ps" show this two process to handle tcp: >>>>> >>>>> ... >>>>> }, { >>>>> "IDX": 25, >>>>> "PID": 71929, >>>>> "DSC": "tcp receiver (generic) child=0" >>>>> }, { >>>>> "IDX": 26, >>>>> "PID": 71933, >>>>> "DSC": "tcp main process" >>>>> } >>>>> ... >>>>> >>>>> >>>>> Ok, but then is was wondering to see a TCP connection on a udp receiver >>>>> child: >>>>> >>>>> >>>>> netstat -ntp |grep 5061 >>>>> >>>>> ... >>>>> tcp 0 0 172.17.217.10:5061 195.70.114.125:18252 >>>>> ESTABLISHED 71895/kamailio >>>>> ... >>>>> >>>>> An pid 71895 is: >>>>> >>>>> }, { >>>>> "IDX": 3, >>>>> "PID": 71895, >>>>> "DSC": "udp receiver child=2 sock=127.0.0.1:5060" >>>>> }, { >>>>> >>>>> >>>>> >>>>> And if i look into it via "lsof -p 71895" (the udp receiver child) >>>>> >>>>> ... >>>>> kamailio 71895 kamailio 14u sock 0,9 0t0 >>>>> 8856085 protocol: TCP >>>>> kamailio 71895 kamailio 15u sock 0,9 0t0 >>>>> 8886886 protocol: TCP >>>>> kamailio 71895 kamailio 16u sock 0,9 0t0 >>>>> 8854886 protocol: TCP >>>>> kamailio 71895 kamailio 17u sock 0,9 0t0 >>>>> 8828915 protocol: TCP >>>>> kamailio 71895 kamailio 18u unix 0x000000005f73cb91 0t0 >>>>> 1680314 type=DGRAM >>>>> kamailio 71895 kamailio 19u IPv4 1846523 0t0 >>>>> TCP kamailio-preview:sip-tls->XXX:18252 (ESTABLISHED) >>>>> kamailio 71895 kamailio 20u sock 0,9 0t0 >>>>> 8887192 protocol: TCP >>>>> kamailio 71895 kamailio 21u sock 0,9 0t0 >>>>> 8813634 protocol: TCP >>>>> kamailio 71895 kamailio 22u unix 0x00000000c19bd102 0t0 >>>>> 1681407 type=STREAM >>>>> kamailio 71895 kamailio 23u sock 0,9 0t0 >>>>> 8850488 protocol: TCP >>>>> ... >>>>> >>>>> Not only the ESTABLISHED TCP session. But also this empty sockets >>>>> "protocol: TCP" >>>>> What are they doing there in the udp receiver? Is that how it's supposed >>>>> to be? >>>>> >>>>> Kristijan >>>>> >>>>> Am Do., 14. März 2019 um 14:48 Uhr schrieb Daniel-Constantin Mierla >>>>> <[email protected]>: >>>>>> Can you get file written by `kamctl trap`? It should have the backtrace >>>>>> for all kamailio processes. You need latest kamailio 5.2. >>>>>> >>>>>> Also, get the output for: kamctl ps >>>>>> >>>>>> Cheers, >>>>>> Daniel >>>>>> >>>>>> On 14.03.19 13:52, Kristijan Vrban wrote: >>>>>>> When i attach via gdb to one of the tcp worker, i see this: >>>>>>> >>>>>>> (gdb) bt >>>>>>> #0 0x00007fdaf4d14470 in futex_wait (private=<optimized out>, >>>>>>> expected=1, futex_word=0x7fdaeca92f8c) at >>>>>>> ../sysdeps/unix/sysv/linux/futex-internal.h:61 >>>>>>> #1 futex_wait_simple (private=<optimized out>, expected=1, >>>>>>> futex_word=0x7fdaeca92f8c) at ../sysdeps/nptl/futex-internal.h:135 >>>>>>> #2 __pthread_rwlock_wrlock_slow (rwlock=0x7fdaeca92f80) at >>>>>>> pthread_rwlock_wrlock.c:67 >>>>>>> #3 0x00007fdaf0912ee9 in CRYPTO_THREAD_write_lock () from >>>>>>> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 >>>>>>> #4 0x00007fdaf08e1c08 in ?? () from >>>>>>> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 >>>>>>> #5 0x00007fdaf08a6f69 in ?? () from >>>>>>> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 >>>>>>> #6 0x00007fdaf08b36c7 in EVP_CIPHER_CTX_ctrl () from >>>>>>> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 >>>>>>> #7 0x00007fdaf0c31144 in ?? () from >>>>>>> /usr/lib/x86_64-linux-gnu/libssl.so.1.1 >>>>>>> #8 0x00007fdaf0c2bddb in ?? () from >>>>>>> /usr/lib/x86_64-linux-gnu/libssl.so.1.1 >>>>>>> #9 0x00007fdaf0c22858 in ?? () from >>>>>>> /usr/lib/x86_64-linux-gnu/libssl.so.1.1 >>>>>>> #10 0x00007fdaf0c1af61 in SSL_do_handshake () from >>>>>>> /usr/lib/x86_64-linux-gnu/libssl.so.1.1 >>>>>>> #11 0x00007fdaf0e8d31b in tls_accept (c=0x7fdaed26fa98, >>>>>>> error=0x7ffffe2a2df0) at tls_server.c:422 >>>>>>> #12 0x00007fdaf0e96a1b in tls_read_f (c=0x7fdaed26fa98, >>>>>>> flags=0x7ffffe2c318c) at tls_server.c:1116 >>>>>>> #13 0x0000556ead5e7c46 in tcp_read_headers (c=0x7fdaed26fa98, >>>>>>> read_flags=0x7ffffe2c318c) at core/tcp_read.c:469 >>>>>>> #14 0x0000556ead5ef9cb in tcp_read_req (con=0x7fdaed26fa98, >>>>>>> bytes_read=0x7ffffe2c3184, read_flags=0x7ffffe2c318c) at >>>>>>> core/tcp_read.c:1496 >>>>>>> #15 0x0000556ead5f575f in handle_io (fm=0x7fdaf597aa98, events=1, >>>>>>> idx=-1) at core/tcp_read.c:1862 >>>>>>> #16 0x0000556ead5e2053 in io_wait_loop_epoll (h=0x556eadaaeec0 <io_w>, >>>>>>> t=2, repeat=0) at core/io_wait.h:1065 >>>>>>> #17 0x0000556ead5f6b35 in tcp_receive_loop (unix_sock=49) at >>>>>>> core/tcp_read.c:1974 >>>>>>> #18 0x0000556ead4c8e24 in tcp_init_children () at core/tcp_main.c:4853 >>>>>>> #19 0x0000556ead3c352a in main_loop () at main.c:1735 >>>>>>> #20 0x0000556ead3ca5f8 in main (argc=13, argv=0x7ffffe2c3828) at >>>>>>> main.c:2675 >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Am Do., 14. März 2019 um 13:41 Uhr schrieb Kristijan Vrban >>>>>>> <[email protected]>: >>>>>>>> Hi, with full debug is see this in log for every incoming TCP SIP >>>>>>>> request: >>>>>>>> >>>>>>>> Mar 14 12:10:15 kamailio-preview /usr/sbin/kamailio[17940]: DEBUG: >>>>>>>> <core> [core/tcp_main.c:3871]: send2child(): WARNING: no free tcp >>>>>>>> receiver, connection passed to the least busy one (105) >>>>>>>> Mar 14 12:10:15 kamailio-preview /usr/sbin/kamailio[17940]: DEBUG: >>>>>>>> <core> [core/tcp_main.c:3875]: send2child(): selected tcp worker 2 >>>>>>>> 27(17937) for activity on [tls:172.17.217.10:5061], 0x7fdaeda8f928 >>>>>>>> >>>>>>>> So the Kamailio TCP process is working, and received TCP traffic. But >>>>>>>> the tcp workers are somehow busy. >>>>>>>> >>>>>>>> When i attach via strace to the TCP worker, i do not see any activity. >>>>>>>> Just: >>>>>>>> >>>>>>>> futex(0x7fdaeca92f8c, FUTEX_WAIT_PRIVATE, 1, NULL >>>>>>>> >>>>>>>> and nothing, even when i see the main tcp process choose this worker >>>>>>>> process. >>>>>>>> >>>>>>>> Kristijan >>>>>>>> >>>>>>>> Am Mi., 27. Feb. 2019 um 15:14 Uhr schrieb Kristijan Vrban >>>>>>>> <[email protected]>: >>>>>>>>> first of all thanks for the feedback. i prepared our system now to run >>>>>>>>> with debug=3 >>>>>>>>> I hope to see more then then. >>>>>>>>> >>>>>>>>> Am Mi., 27. Feb. 2019 um 11:53 Uhr schrieb Kristijan Vrban >>>>>>>>> <[email protected]>: >>>>>>>>>> Hi kamailios, >>>>>>>>>> >>>>>>>>>> i have a creepy situation with v5.2.1 stable Kamilio. After a day or >>>>>>>>>> so, Kamailio stop to process incoming SIP traffic via TCP. The >>>>>>>>>> incoming TCP network packages get TCP-ACK from the OS (Debian 9, >>>>>>>>>> 4.18.0-15-generic-Linux) but Kamailio does not show any processing >>>>>>>>>> for >>>>>>>>>> the SIP-Traffic incoming via TCP. No logs, nothing. While traffic via >>>>>>>>>> UDP is working just totally fine. >>>>>>>>>> >>>>>>>>>> When i look via command "netstat -ntp" is see, that the Recv-Q get >>>>>>>>>> bigger and bigger. e.g.: >>>>>>>>>> >>>>>>>>>> Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program >>>>>>>>>> name tcp 4566 0 172.17.217.12:5060 xxx.xxx.xxx.xxx:57252 ESTABLISHED >>>>>>>>>> 31347/kamailio >>>>>>>>>> >>>>>>>>>> After Kamailio restart, all is working fine again for a day. We have >>>>>>>>>> maybe 10-20 devices online via TCP and low call volume (1-2 call per >>>>>>>>>> minute). The only settings for tcp we have is "tcp_delayed_ack=no" >>>>>>>>>> >>>>>>>>>> How to could we debug this situation? Again, no error, no warings in >>>>>>>>>> the log. Just nothing. >>>>>>>>>> >>>>>>>>>> Kristijan >>>>>>> _______________________________________________ >>>>>>> Kamailio (SER) - Users Mailing List >>>>>>> [email protected] >>>>>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users >>>>>> -- >>>>>> Daniel-Constantin Mierla -- www.asipto.com >>>>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda >>>>>> Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com >>>>>> Kamailio Advanced Training - Mar 25-27, 2019, in Washington, DC, USA -- >>>>>> www.asipto.com >>>>>> >>>>> _______________________________________________ >>>>> Kamailio (SER) - Users Mailing List >>>>> [email protected] >>>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users >>>> -- >>>> Daniel-Constantin Mierla -- www.asipto.com >>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda >>>> Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com >>>> Kamailio Advanced Training - Mar 25-27, 2019, in Washington, DC, USA -- >>>> www.asipto.com >>>> >> -- >> Daniel-Constantin Mierla -- www.asipto.com >> www.twitter.com/miconda -- www.linkedin.com/in/miconda >> Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com >> Kamailio Advanced Training - Mar 25-27, 2019, in Washington, DC, USA -- >> www.asipto.com >> -- Daniel-Constantin Mierla -- www.asipto.com www.twitter.com/miconda -- www.linkedin.com/in/miconda Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com Kamailio Advanced Training - Mar 25-27, 2019, in Washington, DC, USA -- www.asipto.com _______________________________________________ Kamailio (SER) - Users Mailing List [email protected] https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
