Hi William,
I suspect you may be hitting a deadlock (more of a wild guess, as there is
not much data to check). The only advice I can give you is to
upgrade to 2.4 (the simple option) or 3.0 (a bit more complex).
Regards,
Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
https://www.opensips-solutions.com
OpenSIPS Summit, Amsterdam, May 2020
https://www.opensips.org/events/Summit-2020Amsterdam/
On 3/9/20 8:22 PM, William Simon wrote:
This is opensips 2.2.7 (I understand it is no longer supported). The
last time this happened, netstat showed a very large TCP Send-Q from
opensips to a remote SIP TCP endpoint. It seems TCP gets
blocked and does not recover. We do not need to reboot the server to
restore the service, only restart opensips.
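For anyone wanting to catch this condition before the proxy stops answering, here is a minimal sketch that scans `netstat -tn` style output for connections with a large Send-Q. The function name, the 64 KiB threshold, and the sample rows (documentation-range IP addresses) are all made up for illustration; they are not taken from the affected server.

```python
def large_send_queues(netstat_output, threshold=65536):
    """Return (send_q, local, remote, state) tuples for TCP rows whose
    Send-Q exceeds the threshold, largest first.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: tcp <Recv-Q> <Send-Q> <local> <remote> <STATE>
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        send_q = int(fields[2])
        if send_q > threshold:
            hits.append((send_q, fields[3], fields[4], fields[5]))
    return sorted(hits, reverse=True)


# Invented sample in the usual `netstat -tn` column layout.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:5060         198.51.100.7:50212      ESTABLISHED
tcp        0 254880 192.0.2.10:5060         203.0.113.44:5060       ESTABLISHED
"""

if __name__ == "__main__":
    for send_q, local, remote, state in large_send_queues(SAMPLE):
        print(send_q, local, remote, state)
```

In practice the text would come from running `netstat -tn` periodically; a Send-Q that stays high toward the same remote endpoint would match what is described above.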
*From: *Bogdan-Andrei Iancu <[email protected]>
*Date: *Monday, March 9, 2020 at 5:02 AM
*To: *William Simon <[email protected]>, OpenSIPS users mailling
list <[email protected]>
*Subject: *Re: [OpenSIPS-Users] opensips udp workers lock up with
sched_yield
Hi William,
That is interesting. Most of the processes are idle (waiting in the
I/O reactor) and a bunch of them are blocked on a lock (same
pattern). Nevertheless, the weird thing is that there is no active process
(one actually doing something) that may hold the lock. All procs are either
blocked or idle.
What opensips version do you have?
Also, is opensips recovering from this state, or do you need to do a reboot?
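The idle/blocked split described above can be tabulated from a trap file with a small script. This is only a sketch under assumptions: it presumes each process section ends with a `---end <pid>` line (as in the excerpt quoted further down) and that the innermost frame is the first `#0` line of the section. The second section of the sample, including the `epoll_wait` frame, its address, and PID 82754, is invented to stand in for an idle process.

```python
import re
from collections import defaultdict

# Innermost frame, e.g. "#0  0x00007f... in sched_yield () at"
FRAME0 = re.compile(r"^#0\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")
# Assumed section terminator, e.g. "---end 82753"
END = re.compile(r"^---end\s+(\d+)")


def group_by_top_frame(trap_text):
    """Map innermost function name -> list of PIDs stopped in it."""
    groups = defaultdict(list)
    top = None
    for raw in trap_text.splitlines():
        line = raw.strip()
        frame = FRAME0.match(line)
        if frame and top is None:
            # Keep only the first #0 seen in the current section.
            top = frame.group(1)
            continue
        end = END.match(line)
        if end:
            groups[top or "unknown"].append(int(end.group(1)))
            top = None
    return dict(groups)


# First section mirrors the quoted backtrace; the second one is invented.
SAMPLE = """\
#0  0x00007f5b14028bb7 in sched_yield () at
../sysdeps/unix/syscall-template.S:81
#1  0x00000000005323a5 in ?? ()
---end 82753
#0  0x00007f5b14035d13 in epoll_wait () at
../sysdeps/unix/syscall-template.S:81
---end 82754
"""

if __name__ == "__main__":
    for func, pids in group_by_top_frame(SAMPLE).items():
        print(func, pids)
```

Any PID whose innermost frame is neither the lock wait nor the idle wait would be the first candidate for the lock holder.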
Regards,
Bogdan-Andrei Iancu
On 2/28/20 4:20 PM, William Simon wrote:
Bogdan-Andrei, thank you for your insight. Yes, we also use SIP
TCP & TLS. I do not see any locks in the rest of the “opensipsctl
trap.” Perhaps you will be able to understand it better. The trap
is posted at https://pastebin.com/1rs8fVEB
Thank you
William Simon
*From: *Bogdan-Andrei Iancu <[email protected]>
*Date: *Friday, February 28, 2020 at 4:23 AM
*To: *OpenSIPS users mailling list <[email protected]>, William Simon
<[email protected]>
*Subject: *Re: [OpenSIPS-Users] opensips udp workers lock up with
sched_yield
Hi William,
That sched_yield translates into waiting for a lock. As the
backtrace (a bit crippled) shows it coming from "send_pr_buffer"
(which is responsible for sending the buffer of a SIP msg out
on the network), I suspect the transport is TCP or TLS (missing frame
#1), as they use locking. Do you have the backtraces from
all the procs? This will help to identify the proc holding the
lock and blocking all the other procs.
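To make the link between sched_yield and lock waiting concrete, here is a generic try-and-yield lock sketched in Python. It is not OpenSIPS code and makes no claim about how the OpenSIPS locks are implemented; the class and function names are hypothetical. It only shows the general pattern in which a waiter that fails to take the lock gives up the CPU and retries, which is why sched_yield ends up as the innermost frame of a waiting process.

```python
import os
import threading

# os.sched_yield exists on Linux and most Unix systems; fall back to a
# no-op elsewhere so the sketch still runs.
_yield = getattr(os, "sched_yield", lambda: None)


class YieldingSpinLock:
    """Illustrative lock: try to take it, yield the CPU on failure, retry."""

    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        # A waiter never sleeps on the lock; it loops through the yield call.
        while not self._flag.acquire(blocking=False):
            _yield()

    def release(self):
        self._flag.release()


def run_demo(workers=4, rounds=2000):
    """Increment a shared counter from several threads under the lock."""
    lock = YieldingSpinLock()
    counter = 0

    def work():
        nonlocal counter
        for _ in range(rounds):
            lock.acquire()
            try:
                counter += 1
            finally:
                lock.release()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run_demo())
```

With a lock of this kind, a holder that never releases leaves every waiter looping through the yield call indefinitely, which would be consistent with CPU usage climbing to 100% after the lock-up.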
Best regards,
Bogdan-Andrei Iancu
OpenSIPS Bootcamp, Miami, March 2020
https://opensips.org/training/OpenSIPS_Bootcamp_2020/
On 2/28/20 3:58 AM, William Simon wrote:
In a SIP video environment we have a pair of opensips servers
load balancing traffic to freeswitch. The call volume is
modest across the two proxies, about 400 concurrent calls at
peak times.
We are occasionally seeing opensips lock up and stop
responding to SIP traffic. There is no error in the syslog and
no indication of resource exhaustion on the VM (it is a 4-core
VMware instance with 4GB of RAM). Once opensips locks up, CPU
soon reaches 100%, but before that, it was not using even 50%
of the CPU.
get_statistics shows that neither the shared memory nor the pkg
memory is heavily used. They are set at 64M / 4M.
opensipsctl trap shows this on the UDP workers
(children=8 in config; it was previously set to children=4
and showed the same behavior):
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f5b14028bb7 in sched_yield () at ../sysdeps/unix/syscall-template.S:81
#0 0x00007f5b14028bb7 in sched_yield () at ../sysdeps/unix/syscall-template.S:81
No locals.
#1 0x00000000005323a5 in ?? ()
No symbol table info available.
#2 0x00007f5b0ec6c48f in send_pr_buffer () from /usr/lib/x86_64-linux-gnu/opensips/modules/tm.so
No symbol table info available.
#3 0x00007f5b0ec9eb9b in t_forward_nonack () from /usr/lib/x86_64-linux-gnu/opensips/modules/tm.so
No symbol table info available.
#4 0x00007f5b0ec6defe in t_relay_to () from /usr/lib/x86_64-linux-gnu/opensips/modules/tm.so
No symbol table info available.
#5 0x00007f5b0ec815ee in ?? () from /usr/lib/x86_64-linux-gnu/opensips/modules/tm.so
No symbol table info available.
#6 0x000000000042b20a in do_action ()
No symbol table info available.
#7 0x0000000000430590 in run_action_list ()
No symbol table info available.
#8 0x000000000046d3bc in ?? ()
No symbol table info available.
#9 0x000000000046cc1d in eval_expr ()
No symbol table info available.
#10 0x000000000046cc39 in eval_expr ()
No symbol table info available.
#11 0x000000000046cc09 in eval_expr ()
No symbol table info available.
#12 0x000000000042b19a in do_action ()
No symbol table info available.
#13 0x0000000000430590 in run_action_list ()
No symbol table info available.
#14 0x00000000004306ba in ?? ()
No symbol table info available.
#15 0x000000000042da9a in do_action ()
No symbol table info available.
#16 0x0000000000430590 in run_action_list ()
No symbol table info available.
#17 0x000000000042e62e in do_action ()
No symbol table info available.
#18 0x0000000000430590 in run_action_list ()
No symbol table info available.
#19 0x000000000042e62e in do_action ()
No symbol table info available.
#20 0x0000000000430590 in run_action_list ()
No symbol table info available.
#21 0x00000000004308d0 in run_top_route ()
No symbol table info available.
#22 0x0000000000436ef3 in receive_msg ()
No symbol table info available.
#23 0x000000000052d5c5 in ?? ()
No symbol table info available.
#24 0x000000000051536d in ?? ()
No symbol table info available.
#25 0x000000000051837a in udp_rcv_loop ()
No symbol table info available.
#26 0x0000000000519c38 in udp_start_processes ()
No symbol table info available.
#27 0x000000000041c38a in main ()
No symbol table info available.
---end 82753