OK, I will wait a bit and then backport. Thanks for testing and assisting with troubleshooting.
Daniel

On 06.09.17 14:29, Vitaliy Aleksandrov wrote:
> Thanks for the quick fix.
>
> Installed the latest 5.0 branch with the mentioned patch and had no
> crashes so far.
> Will do an additional testing and inform if find any issues.
>
> On Wed, Sep 6, 2017 at 12:25 PM, Daniel-Constantin Mierla
> <[email protected]> wrote:
>
> I think I caught the issue and fixed with commit
> b672d8ef63715cf816390a05ce7a441377c3e468 in master branch.
>
> It was caused by not resetting the T_ASYNC_CONTINUE flag after
> t_continue(), which caused other parts of code to not reset the
> reply field of any branch. The reply field could have been set by
> another process, so at the time of destroying the transaction, the
> pointer could have been to memory zone of another process, so
> access it caused the crash.
>
> Along with this fix, I added few other safety checks in my way to
> investigate the issue.
>
> Can you cherry pick this commit and test in branch 5.0? I want to
> be sure there is no obvious side effect before porting it.
>
> Cheers,
> Daniel
>
> On 05.09.17 11:02, Daniel-Constantin Mierla wrote:
>>
>> Hello,
>>
>> does it happen to have the pcap (or ngrep) with the sip traffic
>> for the call? It will be useful to see the flow with
>> requests/replies/retransmissions and their timestamps...
>>
>> Is this version the snapshot of 5.0.2 release or a build from
>> branch 5.0?
>>
>> Cheers,
>> Daniel
>>
>> On 05.09.17 10:01, Vitaliy Aleksandrov wrote:
>>> Hello kamailio list,
>>>
>>> Recently found a problem in my configuration that uses
>>> async_route() functionality.
>>> It crashes after several calls when wait_timer fires.
>>>
>>> #0 0xb74a8556 in raise () from /lib/libc.so.6
>>> #1 0xb74a9d78 in abort () from /lib/libc.so.6
>>> #2 0x08293ae2 in qm_free (qmp=0xad65d000, p=0x3d64692d,
>>> file=0xb6216a16 "tm: h_table.c", func=0xb621663c
>>> <__FUNCTION__.18751> "free_cell_helper", line=187,
>>> mname=0xb621664d "tm") at core/mem/q_malloc.c:471
>>> #3 0xb613f103 in free_cell_helper (dead_cell=0xae2cd210,
>>> silent=0, fname=0xb6239ea5 "timer.c", fline=655) at h_table.c:187
>>> #4 0xb61e7758 in wait_handler (ti=557858937,
>>> wait_tl=0xae2cd258, data=0xae2cd210) at timer.c:655
>>> #5 0x0826a2cc in timer_list_expire (t=557858937, h=0xad6b9668,
>>> slow_l=0xad6ba144, slow_mark=312) at core/timer.c:874
>>> #6 0x08267cb1 in timer_handler () at core/timer.c:939
>>> #7 0x0826a4d3 in timer_main () at core/timer.c:978
>>> #8 0x08069575 in main_loop () at main.c:1721
>>> #9 0x080707ca in main (argc=11, argv=0xbf85f044) at main.c:2723
>>>
>>> When crash happens, kamailio prints the following message:
>>> Sep 4 16:15:38 [18938]: : <core> [core/mem/q_malloc.c:469]:
>>> qm_free(): BUG: qm_free: bad pointer 0x70707553 (out of memory
>>> block!) called from tm: h_table.c: free_cell_helper(187) - aborting
>>>
>>> Also had a few crashes in retransmission_handler():
>>>
>>> #0 0xb750b556 in raise () from /lib/libc.so.6
>>> #1 0xb750cd78 in abort () from /lib/libc.so.6
>>> #2 0xb6249b5a in retransmission_handler (r_buf=0xae036674) at
>>> timer.c:367
>>> #3 0xb6247558 in retr_buf_handler (ticks=1234464444,
>>> tl=0xae036688, p=0x1f40) at timer.c:594
>>> #4 0x0826a2cc in timer_list_expire (t=1234464444, h=0xad71c668,
>>> slow_l=0xad71cd44, slow_mark=2232) at core/timer.c:874
>>> #5 0x08267cb1 in timer_handler () at core/timer.c:939
>>> #6 0x0826a4d3 in timer_main () at core/timer.c:978
>>> #7 0x08069575 in main_loop () at main.c:1721
>>> #8 0x080707ca in main (argc=11, argv=0xbff64134) at main.c:2723
>>>
>>> ERROR: tm [timer.c:366]: retransmission_handler(): transaction
>>> 0xae0365e0 scheduled for deletion and called from RETR timer
>>> (flags 6d)
>>>
>>> Both timers fired for an INVITE transaction that was previously
>>> suspended by async_route(), then resumed, sent out and received
>>> a 4xx reply (407).
>>>
>>> This configuration worked fine with kamailio 4.2.x and problem
>>> appeared after upgrading to 5.0.2.
>>>
>>> Trying to figure out how to narrow down the problem. Any input
>>> is appreciated.

--
Daniel-Constantin Mierla
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio Advanced Training - www.asipto.com
Kamailio World Conference - www.kamailioworld.com

_______________________________________________
Kamailio (SER) - Users Mailing List
[email protected]
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
