Just curious, did you get to compile with OpenSSL 1.0 and test? On Tue, Mar 26, 2019 at 06:12 Kristijan Vrban <[email protected]> wrote:
> And again one more kamctl trap file where > > set_reply_no_connect was set. > > Am Di., 26. März 2019 um 08:53 Uhr schrieb Kristijan Vrban > <[email protected]>: > > > > Attached also the output of kamctl trap > > > > Am Di., 26. März 2019 um 08:42 Uhr schrieb Kristijan Vrban > > <[email protected]>: > > > > > > > Have you done a test with tools such as sipp, or was this happening > > > > after a while, with usual phones registering? > > > > > > Usual variety of devices registering via TLS. But i can not exclude > > > that some devices displaying behavioural problems. > > > > > > > Can you list the tcp connections and see if they are listed? > > > > kamctl tcp core.tcp_list > > > > > > Need Kex module for that? So i can deliver next time. But when i do > > > "lsof -u kamailio |grep TCP" > > > i get a long list of more then 2000 lines with: > > > > > > ... > > > kamailio 37561 kamailio 2105u sock 0,9 0t0 > > > 27856287 protocol: TCP > > > kamailio 37561 kamailio 2106u sock 0,9 0t0 > > > 27856305 protocol: TCP > > > kamailio 37561 kamailio 2107u sock 0,9 0t0 > > > 27856306 protocol: TCP > > > kamailio 37561 kamailio 2108u sock 0,9 0t0 > > > 27856914 protocol: TCP > > > ... > > > > > > So about the time Kamailio created a lot of socket in the TCP domain, > > > but which are not bound to any port (eg via connect(2) or listen(2) or > > > bind(2)) > > > Until we get to the maximum number of 2048 connections. > > > > > > Best > > > Kristijan > > > > > > Am Mo., 25. März 2019 um 14:27 Uhr schrieb Daniel-Constantin Mierla > > > <[email protected]>: > > > > > > > > Have you done a test with tools such as sipp, or was this happening > > > > after a while, with usual phones registering? > > > > > > > > Can you list the tcp connections and see if they are listed? > > > > > > > > kamctl tcp core.tcp_list > > > > > > > > Cheers, > > > > Daniel > > > > > > > > On 25.03.19 08:03, Kristijan Vrban wrote: > > > > >> The solution here is to use set_reply_no_connect() > > > > > implemented it. Now the issue has shifted to: > > > > > > > > > > ERROR: <core> [core/tcp_main.c:3959]: handle_new_connect(): maximum > > > > > number of connections exceeded: 2048/2048 > > > > > > > > > > But not a single TCP connection is active between Kamailio and any > > > > > device. Seems this counter for maximum number of connections > > > > > now has an issue? > > > > > > > > > > Kristijan > > > > > > > > > > Am Mi., 20. März 2019 um 15:07 Uhr schrieb Daniel-Constantin Mierla > > > > > <[email protected]>: > > > > >> Hello, > > > > >> > > > > >> based on the trap output I think I could figure out what happened > there. > > > > >> > > > > >> You have tcp_children to very low value (1 or so), the problem is > not > > > > >> actually that one, but the fact that the connection to upstream > (the > > > > >> device/app sending the request) was closed after receiving the > request > > > > >> and routing of the reply gets stuck in the way of: > > > > >> > > > > >> - a reply is received and has to be forwarded > > > > >> - connection was lost, so Kamailio tries to establish a new > one, but > > > > >> takes time till fails because the upstream is behind nat or so > based on > > > > >> the via header: > > > > >> > > > > >> Via: SIP/2.0/TLS > > > > >> 10.1.0.4:10002 > ;rport=55229;received=13.94.188.218;branch=z9hG4bK-3336-7f2927bfd703ae907348edff3611bfc9 > > > > >> > > > > >> - the reply is retransmitted and gets to another worker, which > tries > > > > >> to forward it again, but discovers a connection structure for that > > > > >> destination exists (created by previous reply worker) and now > waits for > > > > >> the connection to be released (or better said, for the mutex on > writing > > > > >> buffer to be unlocked) > > > > >> > > > > >> - as the second reply waits, there can be other retransmissions > of the > > > > >> reply ending up in other workers stuck on waiting for the mutex > of the > > > > >> connection write buffer > > > > >> > > > > >> The solution here is to use set_reply_no_connect() -- you can put > it > > > > >> first in request_route block. I think this would be a good > addition to > > > > >> the default configuration file as well, IMO, the sip server > should not > > > > >> connect for sending replies and should do it also for requests > that go > > > > >> behind nat. > > > > >> > > > > >> Cheers, > > > > >> Daniel > > > > >> > > > > >> On 19.03.19 10:53, Kristijan Vrban wrote: > > > > >>> So i had again the situation. But this time, incoming udp was > > > > >>> affected. Kamailio was sending out OPTIONS (via dispatcher > module) to > > > > >>> a group of asterisk machines > > > > >>> but the 200 OK reply to the OPTIONS where not processed, so the > > > > >>> dispatcher module set all asterisk to inactive, even though they > > > > >>> replied 200 OK > > > > >>> > > > > >>> Attached the output of kamctl trap during the situation. Hope > there is > > > > >>> any useful in it. Because after "kamctl trap" it was working > again > > > > >>> without kamailio restart. > > > > >>> > > > > >>> Best > > > > >>> Kristijan > > > > >>> > > > > >>> Am Mo., 18. März 2019 um 12:27 Uhr schrieb Daniel-Constantin > Mierla > > > > >>> <[email protected]>: > > > > >>>> Hello, > > > > >>>> > > > > >>>> setting tcp_children=1 is not a god option for scallability, > practically > > > > >>>> you set kamailio to process a single tcp message at one time, > on high > > > > >>>> traffic, that won't work well. > > > > >>>> > > > > >>>> Maybe try to set tcp_children to 2 or 4, that should make an > eventual > > > > >>>> race appear faster. > > > > >>>> > > > > >>>> Regarding the pid, if it is an outgoing connection, then it can > be > > > > >>>> created by any worker process, including a UDP worker, if that > was the > > > > >>>> one receiving the sip message over udp and sends it out via tcp. > > > > >>>> > > > > >>>> Cheers, > > > > >>>> Daniel > > > > >>>> > > > > >>>> On 18.03.19 10:09, Kristijan Vrban wrote: > > > > >>>>> Hi Daniel, > > > > >>>>> > > > > >>>>> for testing, i now had set: "tcp_children=1" and so far this > issue did not occur > > > > >>>>> ever since. So now value to provide for "kamctl trap" yet. > > > > >>>>> > > > > >>>>> "kamctl ps" show this two process to handle tcp: > > > > >>>>> > > > > >>>>> ... > > > > >>>>> }, { > > > > >>>>> "IDX": 25, > > > > >>>>> "PID": 71929, > > > > >>>>> "DSC": "tcp receiver (generic) child=0" > > > > >>>>> }, { > > > > >>>>> "IDX": 26, > > > > >>>>> "PID": 71933, > > > > >>>>> "DSC": "tcp main process" > > > > >>>>> } > > > > >>>>> ... > > > > >>>>> > > > > >>>>> > > > > >>>>> Ok, but then is was wondering to see a TCP connection on a udp > receiver child: > > > > >>>>> > > > > >>>>> > > > > >>>>> netstat -ntp |grep 5061 > > > > >>>>> > > > > >>>>> ... > > > > >>>>> tcp 0 0 172.17.217.10:5061 > 195.70.114.125:18252 > > > > >>>>> ESTABLISHED 71895/kamailio > > > > >>>>> ... > > > > >>>>> > > > > >>>>> An pid 71895 is: > > > > >>>>> > > > > >>>>> }, { > > > > >>>>> "IDX": 3, > > > > >>>>> "PID": 71895, > > > > >>>>> "DSC": "udp receiver child=2 sock=127.0.0.1:5060" > > > > >>>>> }, { > > > > >>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> And if i look into it via "lsof -p 71895" (the udp receiver > child) > > > > >>>>> > > > > >>>>> ... > > > > >>>>> kamailio 71895 kamailio 14u sock 0,9 0t0 > > > > >>>>> 8856085 protocol: TCP > > > > >>>>> kamailio 71895 kamailio 15u sock 0,9 0t0 > > > > >>>>> 8886886 protocol: TCP > > > > >>>>> kamailio 71895 kamailio 16u sock 0,9 0t0 > > > > >>>>> 8854886 protocol: TCP > > > > >>>>> kamailio 71895 kamailio 17u sock 0,9 0t0 > > > > >>>>> 8828915 protocol: TCP > > > > >>>>> kamailio 71895 kamailio 18u unix 0x000000005f73cb91 0t0 > > > > >>>>> 1680314 type=DGRAM > > > > >>>>> kamailio 71895 kamailio 19u IPv4 1846523 0t0 > > > > >>>>> TCP kamailio-preview:sip-tls->XXX:18252 (ESTABLISHED) > > > > >>>>> kamailio 71895 kamailio 20u sock 0,9 0t0 > > > > >>>>> 8887192 protocol: TCP > > > > >>>>> kamailio 71895 kamailio 21u sock 0,9 0t0 > > > > >>>>> 8813634 protocol: TCP > > > > >>>>> kamailio 71895 kamailio 22u unix 0x00000000c19bd102 0t0 > > > > >>>>> 1681407 type=STREAM > > > > >>>>> kamailio 71895 kamailio 23u sock 0,9 0t0 > > > > >>>>> 8850488 protocol: TCP > > > > >>>>> ... > > > > >>>>> > > > > >>>>> Not only the ESTABLISHED TCP session. But also this empty > sockets > > > > >>>>> "protocol: TCP" > > > > >>>>> What are they doing there in the udp receiver? Is that how > it's supposed to be? > > > > >>>>> > > > > >>>>> Kristijan > > > > >>>>> > > > > >>>>> Am Do., 14. März 2019 um 14:48 Uhr schrieb Daniel-Constantin > Mierla > > > > >>>>> <[email protected]>: > > > > >>>>>> Can you get file written by `kamctl trap`? It should have the > backtrace > > > > >>>>>> for all kamailio processes. You need latest kamailio 5.2. > > > > >>>>>> > > > > >>>>>> Also, get the output for: kamctl ps > > > > >>>>>> > > > > >>>>>> Cheers, > > > > >>>>>> Daniel > > > > >>>>>> > > > > >>>>>> On 14.03.19 13:52, Kristijan Vrban wrote: > > > > >>>>>>> When i attach via gdb to one of the tcp worker, i see this: > > > > >>>>>>> > > > > >>>>>>> (gdb) bt > > > > >>>>>>> #0 0x00007fdaf4d14470 in futex_wait (private=<optimized > out>, > > > > >>>>>>> expected=1, futex_word=0x7fdaeca92f8c) at > > > > >>>>>>> ../sysdeps/unix/sysv/linux/futex-internal.h:61 > > > > >>>>>>> #1 futex_wait_simple (private=<optimized out>, expected=1, > > > > >>>>>>> futex_word=0x7fdaeca92f8c) at > ../sysdeps/nptl/futex-internal.h:135 > > > > >>>>>>> #2 __pthread_rwlock_wrlock_slow (rwlock=0x7fdaeca92f80) at > > > > >>>>>>> pthread_rwlock_wrlock.c:67 > > > > >>>>>>> #3 0x00007fdaf0912ee9 in CRYPTO_THREAD_write_lock () from > > > > >>>>>>> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 > > > > >>>>>>> #4 0x00007fdaf08e1c08 in ?? () from > /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 > > > > >>>>>>> #5 0x00007fdaf08a6f69 in ?? () from > /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 > > > > >>>>>>> #6 0x00007fdaf08b36c7 in EVP_CIPHER_CTX_ctrl () from > > > > >>>>>>> /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 > > > > >>>>>>> #7 0x00007fdaf0c31144 in ?? () from > /usr/lib/x86_64-linux-gnu/libssl.so.1.1 > > > > >>>>>>> #8 0x00007fdaf0c2bddb in ?? () from > /usr/lib/x86_64-linux-gnu/libssl.so.1.1 > > > > >>>>>>> #9 0x00007fdaf0c22858 in ?? () from > /usr/lib/x86_64-linux-gnu/libssl.so.1.1 > > > > >>>>>>> #10 0x00007fdaf0c1af61 in SSL_do_handshake () from > > > > >>>>>>> /usr/lib/x86_64-linux-gnu/libssl.so.1.1 > > > > >>>>>>> #11 0x00007fdaf0e8d31b in tls_accept (c=0x7fdaed26fa98, > > > > >>>>>>> error=0x7ffffe2a2df0) at tls_server.c:422 > > > > >>>>>>> #12 0x00007fdaf0e96a1b in tls_read_f (c=0x7fdaed26fa98, > > > > >>>>>>> flags=0x7ffffe2c318c) at tls_server.c:1116 > > > > >>>>>>> #13 0x0000556ead5e7c46 in tcp_read_headers (c=0x7fdaed26fa98, > > > > >>>>>>> read_flags=0x7ffffe2c318c) at core/tcp_read.c:469 > > > > >>>>>>> #14 0x0000556ead5ef9cb in tcp_read_req (con=0x7fdaed26fa98, > > > > >>>>>>> bytes_read=0x7ffffe2c3184, read_flags=0x7ffffe2c318c) at > > > > >>>>>>> core/tcp_read.c:1496 > > > > >>>>>>> #15 0x0000556ead5f575f in handle_io (fm=0x7fdaf597aa98, > events=1, > > > > >>>>>>> idx=-1) at core/tcp_read.c:1862 > > > > >>>>>>> #16 0x0000556ead5e2053 in io_wait_loop_epoll > (h=0x556eadaaeec0 <io_w>, > > > > >>>>>>> t=2, repeat=0) at core/io_wait.h:1065 > > > > >>>>>>> #17 0x0000556ead5f6b35 in tcp_receive_loop (unix_sock=49) at > > > > >>>>>>> core/tcp_read.c:1974 > > > > >>>>>>> #18 0x0000556ead4c8e24 in tcp_init_children () at > core/tcp_main.c:4853 > > > > >>>>>>> #19 0x0000556ead3c352a in main_loop () at main.c:1735 > > > > >>>>>>> #20 0x0000556ead3ca5f8 in main (argc=13, > argv=0x7ffffe2c3828) at main.c:2675 > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> Am Do., 14. März 2019 um 13:41 Uhr schrieb Kristijan Vrban > > > > >>>>>>> <[email protected]>: > > > > >>>>>>>> Hi, with full debug is see this in log for every incoming > TCP SIP request: > > > > >>>>>>>> > > > > >>>>>>>> Mar 14 12:10:15 kamailio-preview /usr/sbin/kamailio[17940]: > DEBUG: > > > > >>>>>>>> <core> [core/tcp_main.c:3871]: send2child(): WARNING: no > free tcp > > > > >>>>>>>> receiver, connection passed to the least busy one (105) > > > > >>>>>>>> Mar 14 12:10:15 kamailio-preview /usr/sbin/kamailio[17940]: > DEBUG: > > > > >>>>>>>> <core> [core/tcp_main.c:3875]: send2child(): selected tcp > worker 2 > > > > >>>>>>>> 27(17937) for activity on [tls:172.17.217.10:5061], > 0x7fdaeda8f928 > > > > >>>>>>>> > > > > >>>>>>>> So the Kamailio TCP process is working, and received TCP > traffic. But > > > > >>>>>>>> the tcp workers are somehow busy. > > > > >>>>>>>> > > > > >>>>>>>> When i attach via strace to the TCP worker, i do not see > any activity. Just: > > > > >>>>>>>> > > > > >>>>>>>> futex(0x7fdaeca92f8c, FUTEX_WAIT_PRIVATE, 1, NULL > > > > >>>>>>>> > > > > >>>>>>>> and nothing, even when i see the main tcp process choose > this worker process. > > > > >>>>>>>> > > > > >>>>>>>> Kristijan > > > > >>>>>>>> > > > > >>>>>>>> Am Mi., 27. Feb. 2019 um 15:14 Uhr schrieb Kristijan Vrban > > > > >>>>>>>> <[email protected]>: > > > > >>>>>>>>> first of all thanks for the feedback. i prepared our > system now to run > > > > >>>>>>>>> with debug=3 > > > > >>>>>>>>> I hope to see more then then. > > > > >>>>>>>>> > > > > >>>>>>>>> Am Mi., 27. Feb. 2019 um 11:53 Uhr schrieb Kristijan Vrban > > > > >>>>>>>>> <[email protected]>: > > > > >>>>>>>>>> Hi kamailios, > > > > >>>>>>>>>> > > > > >>>>>>>>>> i have a creepy situation with v5.2.1 stable Kamilio. > After a day or > > > > >>>>>>>>>> so, Kamailio stop to process incoming SIP traffic via > TCP. The > > > > >>>>>>>>>> incoming TCP network packages get TCP-ACK from the OS > (Debian 9, > > > > >>>>>>>>>> 4.18.0-15-generic-Linux) but Kamailio does not show any > processing for > > > > >>>>>>>>>> the SIP-Traffic incoming via TCP. No logs, nothing. While > traffic via > > > > >>>>>>>>>> UDP is working just totally fine. > > > > >>>>>>>>>> > > > > >>>>>>>>>> When i look via command "netstat -ntp" is see, that the > Recv-Q get > > > > >>>>>>>>>> bigger and bigger. e.g.: > > > > >>>>>>>>>> > > > > >>>>>>>>>> Proto Recv-Q Send-Q Local Address Foreign Address State > PID/Program > > > > >>>>>>>>>> name tcp 4566 0 172.17.217.12:5060 xxx.xxx.xxx.xxx:57252 > ESTABLISHED > > > > >>>>>>>>>> 31347/kamailio > > > > >>>>>>>>>> > > > > >>>>>>>>>> After Kamailio restart, all is working fine again for a > day. We have > > > > >>>>>>>>>> maybe 10-20 devices online via TCP and low call volume > (1-2 call per > > > > >>>>>>>>>> minute). The only settings for tcp we have is > "tcp_delayed_ack=no" > > > > >>>>>>>>>> > > > > >>>>>>>>>> How to could we debug this situation? Again, no error, no > warings in > > > > >>>>>>>>>> the log. Just nothing. > > > > >>>>>>>>>> > > > > >>>>>>>>>> Kristijan > > > > >>>>>>> _______________________________________________ > > > > >>>>>>> Kamailio (SER) - Users Mailing List > > > > >>>>>>> [email protected] > > > > >>>>>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users > > > > >>>>>> -- > > > > >>>>>> Daniel-Constantin Mierla -- www.asipto.com > > > > >>>>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda > > > > >>>>>> Kamailio World Conference - May 6-8, 2019 -- > www.kamailioworld.com > > > > >>>>>> Kamailio Advanced Training - Mar 25-27, 2019, in Washington, > DC, USA -- www.asipto.com > > > > >>>>>> > > > > >>>>> _______________________________________________ > > > > >>>>> Kamailio (SER) - Users Mailing List > > > > >>>>> [email protected] > > > > >>>>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users > > > > >>>> -- > > > > >>>> Daniel-Constantin Mierla -- www.asipto.com > > > > >>>> www.twitter.com/miconda -- www.linkedin.com/in/miconda > > > > >>>> Kamailio World Conference - May 6-8, 2019 -- > www.kamailioworld.com > > > > >>>> Kamailio Advanced Training - Mar 25-27, 2019, in Washington, > DC, USA -- www.asipto.com > > > > >>>> > > > > >> -- > > > > >> Daniel-Constantin Mierla -- www.asipto.com > > > > >> www.twitter.com/miconda -- www.linkedin.com/in/miconda > > > > >> Kamailio World Conference - May 6-8, 2019 -- > www.kamailioworld.com > > > > >> Kamailio Advanced Training - Mar 25-27, 2019, in Washington, DC, > USA -- www.asipto.com > > > > >> > > > > -- > > > > Daniel-Constantin Mierla -- www.asipto.com > > > > www.twitter.com/miconda -- www.linkedin.com/in/miconda > > > > Kamailio World Conference - May 6-8, 2019 -- www.kamailioworld.com > > > > Kamailio Advanced Training - Mar 25-27, 2019, in Washington, DC, USA > -- www.asipto.com > > > > > _______________________________________________ > Kamailio (SER) - Users Mailing List > [email protected] > https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users >
_______________________________________________ Kamailio (SER) - Users Mailing List [email protected] https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
