-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3927/
-----------------------------------------------------------
(Updated Aug. 24, 2014, 11:42 p.m.)
Review request for Asterisk Developers.
Changes
-------
Fixed the issue pointed out by Richard.
Bugs: ASTERISK-24212
https://issues.asterisk.org/jira/browse/ASTERISK-24212
Repository: Asterisk
Description
-------
Several tests in the testsuite failed sporadically because of crashes in the
scheduler. The sequence leading to the crash is roughly:
1) Scheduler thread realizes it's time to send an RTCP packet.
2) Scheduler thread removes RTCP task from the heap so that it can be run.
3) A separate thread ends a call in progress, and attempts to delete the RTCP
scheduler task using ast_sched_del().
4) ast_sched_del() cannot find the scheduled task since it is not in the heap
(or hashtab in Asterisk 12). This results in a failed assertion.
5) Since the test agents are compiled with DO_CRASH, failing an assertion
results in a crash.
6) A crash results in a failed test.
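
Steps 2 through 4 can be reproduced deterministically in a few lines of C.
This is only an illustrative sketch: the toy_* names and the one-slot "heap"
are hypothetical and are not the structures or functions in main/sched.c.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a scheduled task and a scheduler context. */
struct toy_task {
    int id;
};

struct toy_sched {
    struct toy_task *pending; /* one-slot stand-in for the heap */
};

/* Step 2: the scheduler thread takes the task out of the heap to run it. */
struct toy_task *toy_pop(struct toy_sched *s)
{
    struct toy_task *t = s->pending;

    s->pending = NULL;
    return t;
}

/* Steps 3 and 4: deletion only searches the heap, so a task that has
 * already been popped is reported as not found (-1). */
int toy_del(struct toy_sched *s, int id)
{
    if (s->pending && s->pending->id == id) {
        s->pending = NULL;
        return 0;
    }
    return -1;
}
```

Calling toy_pop() and then toy_del() for the same id returns -1, which is the
"not found" result that trips the assertion described in step 4.
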
The solution I have crafted here is to have the scheduler context maintain a
pointer to the task that is currently executing. If we attempt to delete the
running task, we wait for it to complete and then report that the scheduled
task was successfully deleted.
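
The idea can be sketched with a mutex and a condition variable. This is a
minimal sketch, not the patch itself: the demo_* names, the one-slot "heap",
and the choice of pthread primitives are assumptions made for illustration,
and rescheduling of a task after its callback returns is not modelled.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the structures in main/sched.c. */
struct demo_task {
    int id;
    void (*callback)(void *data);
    void *data;
};

struct demo_sched {
    pthread_mutex_t lock;
    pthread_cond_t done;       /* signalled when the running task finishes */
    struct demo_task *pending; /* one-slot stand-in for the heap */
    struct demo_task *running; /* task whose callback is executing, or NULL */
};

void demo_sched_init(struct demo_sched *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->done, NULL);
    s->pending = NULL;
    s->running = NULL;
}

/* Example callback: increments the int that data points to. */
void demo_count_cb(void *data)
{
    ++*(int *)data;
}

/* Scheduler thread: pop the task, record it as running, run the callback
 * without holding the lock, then clear the pointer and wake any waiters.
 * Returns 1 if a task was run, 0 if nothing was pending. */
int demo_sched_run_one(struct demo_sched *s)
{
    struct demo_task *t;

    pthread_mutex_lock(&s->lock);
    t = s->pending;
    if (!t) {
        pthread_mutex_unlock(&s->lock);
        return 0;
    }
    s->pending = NULL;
    s->running = t;
    pthread_mutex_unlock(&s->lock);

    t->callback(t->data);

    pthread_mutex_lock(&s->lock);
    s->running = NULL;
    pthread_cond_broadcast(&s->done);
    pthread_mutex_unlock(&s->lock);
    return 1;
}

/* Any other thread: returns 0 if the task was deleted or was running and
 * has now finished, -1 if it was not found. Must not be called from the
 * task's own callback, since it would then wait on itself. */
int demo_sched_del(struct demo_sched *s, int id)
{
    int res = -1;

    pthread_mutex_lock(&s->lock);
    if (s->pending && s->pending->id == id) {
        s->pending = NULL;
        res = 0;
    } else if (s->running && s->running->id == id) {
        /* The task is mid-execution: wait for it instead of failing. */
        while (s->running && s->running->id == id) {
            pthread_cond_wait(&s->done, &s->lock);
        }
        res = 0;
    }
    pthread_mutex_unlock(&s->lock);
    return res;
}
```

In this sketch, a thread that calls demo_sched_del() while the callback is
running blocks until demo_sched_run_one() clears the running pointer, and
then gets 0 back rather than a "not found" result.
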
Diffs (updated)
-----
/branches/12/main/sched.c 421883
Diff: https://reviewboard.asterisk.org/r/3927/diff/
Testing
-------
When run in a loop, the test
channels/pjsip/basic_calls/two_parties/nominal/alice_initiated/alice_hangs_up
would typically fail within about half an hour of starting the loop. With this
patch applied, I no longer see the crash described in the description.
However, the test still fails occasionally because of a separate race
condition: translation paths are sometimes not set up yet when talk detection
is attempted. So while this patch may not be enough on its own to close the
referenced issue, it does fix one of the causes of test failure.
Thanks,
Mark Michelson