Hi,

Since all the backtraces are in allocation routines, my guess is that the shared memory lock might be causing the problem.

Are you compiling with -DF_MALLOC?
What version of OpenSIPS are you using?
How large is the shared memory pool you are allocating, and how much of it is in use?
(Use: opensipsctl fifo get_statistics all)

Alex Massover wrote:
> Some more,
>
> (gdb) bt
> #0 0xb78dc424 in __kernel_vsyscall ()
> #1 0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
> #2 0xb73d77fd in build_new_dlg () from /usr/lib/opensips/modules/dialog.so
> #3 0xb73d4b81 in dlg_create_dialog () from /usr/lib/opensips/modules/dialog.so
> #4 0xb73c9c9e in ?? () from /usr/lib/opensips/modules/dialog.so
> #5 0x08055030 in do_action ()
> #6 0x08053ebf in run_action_list ()
> #7 0x08056e7a in do_action ()
> #8 0x08053ebf in run_action_list ()
> #9 0x08057d99 in run_top_route ()
> #10 0x0808ad6c in receive_msg ()
> #11 0x080bd2f2 in udp_rcv_loop ()
> #12 0x08069339 in main ()
>
>
> (gdb) bt
> #0 0xb78dc424 in __kernel_vsyscall ()
> #1 0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
> #2 0xb77242cd in build_cell () from /usr/lib/opensips/modules/tm.so
> #3 0xb7739c4a in t_newtran () from /usr/lib/opensips/modules/tm.so
> #4 0xb772e7b8 in t_relay_to () from /usr/lib/opensips/modules/tm.so
> #5 0xb773b501 in ?? () from /usr/lib/opensips/modules/tm.so
> #6 0x08055030 in do_action ()
> #7 0x08053ebf in run_action_list ()
> #8 0x08095cf2 in eval_expr ()
> #9 0x080958d9 in eval_expr ()
> #10 0x08095919 in eval_expr ()
> #11 0x080554e2 in do_action ()
> #12 0x08053ebf in run_action_list ()
> #13 0x080569d8 in do_action ()
> #14 0x08053ebf in run_action_list ()
> #15 0x08056e7a in do_action ()
> #16 0x08053ebf in run_action_list ()
> #17 0x08057d99 in run_top_route ()
> #18 0x0808ad6c in receive_msg ()
> #19 0x080bd2f2 in udp_rcv_loop ()
> #20 0x08069339 in main ()
>
> --
> Best Regards,
> Alex Massover
> VoIP R&D TL
> Jajah Inc.
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:users-[email protected]] On Behalf Of Alex Massover
>> Sent: Thursday, January 21, 2010 2:24 PM
>> To: OpenSIPS users mailling list
>> Subject: Re: [OpenSIPS-Users] sched_yield()
>>
>> Hi,
>>
>> Another one. It hangs for only a few seconds (but that is enough to
>> cause SIP timeouts; the MSG queue jumps to 260K), so it's hard to take
>> a bt at the right moment.
>> This one looks better because there's sched_yield() in it :)
>>
>> (gdb) bt
>> #0 0xb77d5424 in __kernel_vsyscall ()
>> #1 0xb771041c in sched_yield () from /lib/i686/cmov/libc.so.6
>> #2 0x080bf23d in new_avp ()
>> #3 0x080bf53f in add_avp ()
>> #4 0xb72c1c9c in ?? () from /usr/lib/opensips/modules/dialog.so
>> #5 0x08055030 in do_action ()
>> #6 0x08053ebf in run_action_list ()
>> #7 0x08056e7a in do_action ()
>> #8 0x08053ebf in run_action_list ()
>> #9 0x08056e7a in do_action ()
>> #10 0x08053ebf in run_action_list ()
>> #11 0x08056e7a in do_action ()
>> #12 0x08053ebf in run_action_list ()
>> #13 0x08057d99 in run_top_route ()
>> #14 0x0808ad6c in receive_msg ()
>> #15 0x080bd2f2 in udp_rcv_loop ()
>> #16 0x08069339 in main ()
>>
>> --
>> Best Regards,
>> Alex Massover
>> VoIP R&D TL
>> Jajah Inc.
>>
>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:users-[email protected]] On Behalf Of Alex Massover
>>> Sent: Thursday, January 21, 2010 2:05 PM
>>> To: OpenSIPS users mailling list
>>> Subject: Re: [OpenSIPS-Users] sched_yield()
>>>
>>> Hi Andrei,
>>> Hopefully this is it (with FASTLOCK)
>>>
>>> #0 0xb77d5424 in __kernel_vsyscall ()
>>> #1 0xb772babb in poll () from /lib/i686/cmov/libc.so.6
>>> #2 0xb77ba83a in ?? () from /lib/i686/cmov/libresolv.so.2
>>> #3 0xb77b8946 in __libc_res_nquery () from /lib/i686/cmov/libresolv.so.2
>>> #4 0xb77b8fdb in ?? () from /lib/i686/cmov/libresolv.so.2
>>> #5 0xb77b92ae in __libc_res_nsearch () from /lib/i686/cmov/libresolv.so.2
>>> #6 0xb77b96d4 in __res_nsearch () from /lib/i686/cmov/libresolv.so.2
>>> #7 0xb77b808a in res_search () from /lib/i686/cmov/libresolv.so.2
>>> #8 0x0808c613 in get_record ()
>>> #9 0x0808cf05 in ?? ()
>>> #10 0x0808e385 in sip_resolvehost ()
>>> #11 0x0807a26c in mk_proxy ()
>>> #12 0xb7627d39 in t_relay_to () from /usr/lib/opensips/modules/tm.so
>>> #13 0xb7634501 in ?? () from /usr/lib/opensips/modules/tm.so
>>> #14 0x08055030 in do_action ()
>>> #15 0x08053ebf in run_action_list ()
>>> #16 0x08095cf2 in eval_expr ()
>>> #17 0x080958d9 in eval_expr ()
>>> #18 0x08095919 in eval_expr ()
>>> #19 0x080554e2 in do_action ()
>>> #20 0x08053ebf in run_action_list ()
>>> #21 0x08056e7a in do_action ()
>>> #22 0x08053ebf in run_action_list ()
>>> ---Type <return> to continue, or q <return> to quit---
>>> #23 0x080569d8 in do_action ()
>>> #24 0x08053ebf in run_action_list ()
>>> #25 0x08056e7a in do_action ()
>>> #26 0x08053ebf in run_action_list ()
>>> #27 0x08057d99 in run_top_route ()
>>> #28 0x0808ad6c in receive_msg ()
>>> #29 0x080bd2f2 in udp_rcv_loop ()
>>> #30 0x08069339 in main ()
>>> (gdb)
>>>
>>> --
>>> Best Regards,
>>> Alex Massover
>>> VoIP R&D TL
>>> Jajah Inc.
>>>
>>>> -----Original Message-----
>>>> From: [email protected] [mailto:users-[email protected]] On Behalf Of Andrei Dragus
>>>> Sent: Wednesday, January 20, 2010 2:58 PM
>>>> To: OpenSIPS users mailling list
>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
>>>>
>>>> Hi,
>>>>
>>>> I think that there is a lock that is being held longer than it should
>>>> be, and that's what causes the starvation. It would help us if you
>>>> could attach to a process using gdb and give us a full backtrace.
>>>>
>>>> Temporary solutions that should work would be to reduce the number of
>>>> processes to 4-6, or to recompile replacing -DFAST_LOCK with one of the
>>>> other options (-DUSE_POSIX_SEM or -DUSE_PTHREAD_MUTEX), but we should
>>>> find out where this comes from in order to fix it.
>>>>
>>>> Alex Massover wrote:
>>>>> Hi!
>>>>>
>>>>> Yes, from source on Debian; I build a deb package. (I made some minor
>>>>> changes to the source, but the problem also happens without my
>>>>> changes.)
>>>>>
>>>>> 16 children on 4 cores.
>>>>>
>>>>> Do you suggest reducing it to 4? It runs on 2.6.32 on VMware ESX.
>>>>>
>>>>> I'm also now trying sleep(0) instead of sched_yield().
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Alex Massover
>>>>> VoIP R&D TL
>>>>> Jajah Inc.
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: [email protected] [mailto:users-[email protected]] On Behalf Of Andrei Dragus
>>>>>> Sent: Wednesday, January 20, 2010 1:05 PM
>>>>>> To: OpenSIPS users mailling list
>>>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
>>>>>>
>>>>>> Hi Alex,
>>>>>>
>>>>>> Are you building OpenSIPS from source?
>>>>>> How many processes do you have and on how many cores?
>>>>>>
>>>>>>
>>>>>> Alex Massover wrote:
>>>>>>> Hello!
>>>>>>>
>>>>>>> I'm facing a strange problem: sometimes under stress OpenSIPS
>>>>>>> "locks" - load average jumps, SIP processing is delayed, the
>>>>>>> opensips msg queue fills with a lot of SIP messages, and the
>>>>>>> opensips processes start to consume a lot of CPU.
>>>>>>>
>>>>>>> And strace shows:
>>>>>>>
>>>>>>> sched_yield()
>>>>>>> sched_yield()
>>>>>>> sched_yield()
>>>>>>> sched_yield()
>>>>>>> ....
>>>>>>>
>>>>>>> for all processes.
>>>>>>>
>>>>>>> If I stop the stress - after a while (not immediately) - it unlocks,
>>>>>>> also suddenly; I can see in top that all opensips processes stop
>>>>>>> consuming CPU.
>>>>>>>
>>>>>>> What can it be? Some kind of starvation?
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> Alex Massover
>>>>>>> VoIP R&D TL
>>>>>>> Jajah Inc.
>>>>>>>
>>>>>>> This mail was sent via Mail-SeCure System.
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Users mailing list
>>>>>>> [email protected]
>>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>>>>>>>
>>>>>> --
>>>>>> Andrei Dragus
>>>>>> www.voice-system.ro
>>>>>>
>>>>>> This mail was received via Mail-SeCure System.

-- 
Andrei Dragus
www.voice-system.ro
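
For anyone collecting more backtraces like the ones quoted in this thread, a quick way to see where the processes are spinning is to pull out the frame directly above each sched_yield(). The following is only a rough sketch: the helper name yield_callers and the trimmed SAMPLE text are made up for illustration, and it assumes gdb's usual "#N 0xADDR in func ()" frame layout. Whether the reported functions really are allocation paths still has to be judged by reading the source.

```python
import re
from collections import Counter

# One gdb frame, e.g. "#2 0xb73d77fd in build_new_dlg () from /usr/lib/...".
# The pattern is searched anywhere in the text, so ">" quote prefixes from
# a mail client do not get in the way.
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+\(")


def yield_callers(text):
    """Return the function directly above every sched_yield frame."""
    frames = FRAME_RE.findall(text)
    callers = []
    for (num, name), (next_num, next_name) in zip(frames, frames[1:]):
        # The caller is the next frame of the same backtrace (number + 1).
        if name == "sched_yield" and int(next_num) == int(num) + 1:
            callers.append(next_name)
    return callers


# Top frames of the three sched_yield backtraces posted in this thread.
SAMPLE = """
(gdb) bt
#0 0xb78dc424 in __kernel_vsyscall ()
#1 0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
#2 0xb73d77fd in build_new_dlg () from /usr/lib/opensips/modules/dialog.so
(gdb) bt
#0 0xb78dc424 in __kernel_vsyscall ()
#1 0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
#2 0xb77242cd in build_cell () from /usr/lib/opensips/modules/tm.so
(gdb) bt
#0 0xb77d5424 in __kernel_vsyscall ()
#1 0xb771041c in sched_yield () from /lib/i686/cmov/libc.so.6
#2 0x080bf23d in new_avp ()
"""

if __name__ == "__main__":
    for name, count in Counter(yield_callers(SAMPLE)).most_common():
        print(f"{count} x {name}")
```

Feeding it the output of several gdb sessions taken during a hang gives a count per caller, which makes it easier to tell whether one code path dominates.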

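
On the shared memory question, the numbers can be read out of the statistics dump with a few lines of script. This is a sketch under assumptions: it expects "group:name = value" lines and the statistic names shmem:total_size and shmem:real_used_size, which should be checked against what your OpenSIPS version actually prints. The helper name shm_usage and the figures in SAMPLE_STATS are invented for the example, not taken from this thread.

```python
def shm_usage(stats_text):
    """Collect shmem:* statistics and the share of the pool in use.

    Assumes lines of the form "shmem:total_size = 33554432"; anything
    that does not match is skipped.
    """
    values = {}
    for line in stats_text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue
        name = name.strip()
        if not name.startswith("shmem:"):
            continue
        try:
            values[name[len("shmem:"):]] = int(value.strip())
        except ValueError:
            continue  # ignore non-numeric values

    total = values.get("total_size")
    used = values.get("real_used_size")
    percent = None
    if total and used is not None:
        percent = 100.0 * used / total
    return values, percent


# Hypothetical dump, e.g. saved from: opensipsctl fifo get_statistics all
SAMPLE_STATS = """
core:rcv_requests = 1200
shmem:total_size = 33554432
shmem:real_used_size = 8388608
shmem:fragments = 42
"""

if __name__ == "__main__":
    values, percent = shm_usage(SAMPLE_STATS)
    if percent is None:
        print("shmem statistics not found")
    else:
        print(f"{percent:.1f}% of the shared memory pool in use")
```

Watching this figure while the load test runs would show whether the hang coincides with the pool filling up or fragmenting.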