-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3927/
-----------------------------------------------------------

(Updated Aug. 24, 2014, 11:42 p.m.)


Review request for Asterisk Developers.


Changes
-------

Fixed the issue pointed out by Richard.


Bugs: ASTERISK-24212
    https://issues.asterisk.org/jira/browse/ASTERISK-24212


Repository: Asterisk


Description
-------

Several tests in the testsuite failed sporadically because of crashes caused
by a race in the scheduler. The sequence leading to the crash is roughly:

1) Scheduler thread realizes it's time to send an RTCP packet.
2) Scheduler thread removes RTCP task from the heap so that it can be run.
3) A separate thread ends a call in progress, and attempts to delete the RTCP 
scheduler task using ast_sched_del().
4) ast_sched_del() cannot find the scheduled task since it is not in the heap 
(or hashtab in Asterisk 12). This results in a failed assertion.
5) Since the test agents are compiled with DO_CRASH, failing an assertion 
results in a crash.
6) A crash results in a failed test.

The solution I have crafted here is to keep a pointer in the scheduler
context to the task that is currently executing. If a thread attempts to
delete the running task, ast_sched_del() waits for that task to complete
before continuing, and then reports that the scheduled task was deleted
successfully.
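
As a rough illustration of the approach described above (not the actual
patch to main/sched.c), the idea can be sketched with plain pthreads. All
names here (sketch_sched, sketch_task, sketch_sched_del, and so on) are
hypothetical, the heap is reduced to a single slot, and the waiters counter
exists only so that the demo at the end is deterministic:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real Asterisk structures. */
struct sketch_task {
    int id;
    int (*callback)(void *data);
    void *data;
};

struct sketch_sched {
    pthread_mutex_t lock;
    pthread_cond_t changed;      /* broadcast on every state change below */
    struct sketch_task *queued;  /* one-slot stand-in for the heap */
    struct sketch_task *running; /* task whose callback is executing now */
    int waiters;                 /* deleters blocked on the running task */
};

void sketch_sched_init(struct sketch_sched *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->changed, NULL);
    s->queued = NULL;
    s->running = NULL;
    s->waiters = 0;
}

/* Scheduler side: dequeue the task, remember it, run it without the lock. */
void sketch_sched_run_one(struct sketch_sched *s)
{
    struct sketch_task *t;

    pthread_mutex_lock(&s->lock);
    assert(s->running == NULL); /* this sketch has one scheduler thread */
    t = s->queued;
    s->queued = NULL;
    s->running = t;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);

    if (t == NULL) {
        return;
    }
    t->callback(t->data);

    pthread_mutex_lock(&s->lock);
    s->running = NULL;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Deleter side: returns 0 if the task was removed or has finished running,
 * -1 if the id is neither queued nor running. */
int sketch_sched_del(struct sketch_sched *s, int id)
{
    int res = -1;

    pthread_mutex_lock(&s->lock);
    if (s->queued != NULL && s->queued->id == id) {
        s->queued = NULL;
        res = 0;
    } else if (s->running != NULL && s->running->id == id) {
        s->waiters++;
        pthread_cond_broadcast(&s->changed);
        while (s->running != NULL && s->running->id == id) {
            pthread_cond_wait(&s->changed, &s->lock);
        }
        s->waiters--;
        res = 0;
    }
    pthread_mutex_unlock(&s->lock);
    return res;
}

/* Demo callback: stays "in progress" until a deleter is waiting on it. */
static int demo_callback(void *data)
{
    struct sketch_sched *s = data;

    pthread_mutex_lock(&s->lock);
    while (s->waiters == 0) {
        pthread_cond_wait(&s->changed, &s->lock);
    }
    pthread_mutex_unlock(&s->lock);
    return 0;
}

static void *demo_sched_thread(void *arg)
{
    sketch_sched_run_one(arg);
    return NULL;
}

/* Reproduces steps 1-3 above and returns the result of the delete. */
int demo_delete_while_running(void)
{
    struct sketch_sched s;
    struct sketch_task task = { 1, demo_callback, &s };
    pthread_t thread;
    int res;

    sketch_sched_init(&s);
    s.queued = &task;
    if (pthread_create(&thread, NULL, demo_sched_thread, &s) != 0) {
        return -2;
    }
    pthread_mutex_lock(&s.lock);
    while (s.running == NULL) { /* wait until the task has left the queue */
        pthread_cond_wait(&s.changed, &s.lock);
    }
    pthread_mutex_unlock(&s.lock);
    res = sketch_sched_del(&s, 1);
    pthread_join(thread, NULL);
    return res;
}
```

In this sketch the delete finds the task in neither the queue nor a failed
lookup path; it sees that the task is the running one, waits, and returns 0.
One caveat of the sketch as written: a callback that tried to delete its own
task would wait on itself forever.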


Diffs (updated)
-----

  /branches/12/main/sched.c 421883 

Diff: https://reviewboard.asterisk.org/r/3927/diff/


Testing
-------

When run in a loop, the test
channels/pjsip/basic_calls/two_parties/nominal/alice_initiated/alice_hangs_up
would typically fail within about half an hour of starting the loop. With
this patch applied, I no longer see the crash described in the description.

HOWEVER, the test does still fail occasionally. That is due to a separate
race condition involving translation paths not being set up when attempting
to perform talk detection. So while the patch attached here may not be enough
to close the referenced issue, it does fix one of the causes of the test
failures.


Thanks,

Mark Michelson
