----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviewboard.asterisk.org/r/4414/ -----------------------------------------------------------
Review request for Asterisk Developers. Bugs: ASTERISK-24727 https://issues.asterisk.org/jira/browse/ASTERISK-24727 Repository: Asterisk Description ------- A multi-asterisk box setup with direct media enabled would occasionally crash when two re-INVITE collisions on a call leg happen in a row. The re-INVITE logic only had one timer struct to defer the re-INVITE. When the second collision happens the timer struct is overwritten and put into the timer heap again. Resources for the first timer are leaked and the heap has two positions occupied by the same timer struct. Now the heap ordering is potentially corrupted, the timer will fire twice, and any resources allocated for the second timer will be released twice. * The solution is to put the collided re-INVITE into the delayed requests queue with all the other delayed requests and cherry pick the next request that can come off the queue when an event happens. * Changed to put delayed BYE requests at the head of the delayed queue. There is no sense in processing delayed UPDATEs and re-INVITEs when a BYE has been requested. * Made the start of a BYE request flush the delayed requests queue to prevent a delayed request from overlaping the BYE transaction. I saw a few cases were a delayed re-INVITE got started after the BYE transaction started. * Changed the delayed_request struct to use an enum instead of a string for the request method. Cherry picking the queue is easier with an enum than string comparisons and the compiler can warn if a switch statement does not cover all defined enum values. * Improved the debug output to give more information. It helps to know which channel is involved with an endpoint. Trunks can have many channels associated with the endpoint at the same time. Diffs ----- /branches/13/res/res_pjsip_session.c 431716 Diff: https://reviewboard.asterisk.org/r/4414/diff/ Testing ------- Ran the full testsuite with the patch. All but one test passes and the failing test fails with or without the patch. Setup a test environment with two asterisk boxes and direct media enabled for all legs to perform this call repeatedly: PJSIP/100 --> box 1 --> box 2 --> box 1 --> PJSIP/200 1) 100 calls 200 2) 200 answers 3) 100 hangs up 4) repeat call A crash no longer happens on either Asterisk box with the patch. Thanks, rmudgett
-- _____________________________________________________________________ -- Bandwidth and Colocation Provided by http://www.api-digital.com -- asterisk-dev mailing list To UNSUBSCRIBE or update options visit: http://lists.digium.com/mailman/listinfo/asterisk-dev