Hi,

I've been looking at ASTERISK-22079, and what might have caused it. The
comment I added to the bottom of that ticket does not explain the
segfault/backtrace that is attached to that ticket, so I have kept digging.

Fundamentally, I would like to know what stops a scheduled event from being
cancelled while it is already running (e.g. if the running event is waiting
on a lock)? The problem is that the thread that cancels the event and the
running event are both likely to call an unref().

Even worse, I believe this makes it possible for the data structure to be
unref'ed to the point of being destroyed while a thread is still waiting on
its lock, so that thread ends up acquiring a lock at a freed address.

To explain what I mean: in the sequence below, lines starting with "-" run
in thread1 and lines starting with "--" run in the scheduler thread. The
following is probably a worst-case scenario:

- thread1 started
-- scheduler thread started
- ao2_obj created (ref = 1)
- ast_sched_add() called with ref increment (ref = 2)
- locks ao2_obj (ref = 2, locked)
-- sched fires event - waiting on lock
- ast_sched_del_unref() called (ref = 1)
- unlocks ao2_obj (ref = 1, unlocked)
-- sched event gets lock (ref = 1, locked)
-- unrefs obj, unlocks and returns 0 to destroy sched. (ref = 0, destroyed)
- any operation on ao2_obj is now illegal.
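
To make the sequence above concrete, here is a minimal C sketch that replays
it step by step on a single thread, so the outcome is deterministic. It is
only an illustration under assumptions: toy_obj, toy_ref(), toy_unref() and
replay_interleaving() are hypothetical names, not the astobj2 or scheduler
API, and the locking is only noted in comments rather than modelled.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for an ao2 object: a reference count plus a flag
 * that records when the count reached zero. Not the astobj2 API. */
struct toy_obj {
    int refcount;
    int destroyed;
};

/* Take one reference. */
static void toy_ref(struct toy_obj *obj)
{
    assert(!obj->destroyed);
    obj->refcount++;
}

/* Drop one reference; mark the object destroyed when the count hits zero. */
static void toy_unref(struct toy_obj *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0) {
        obj->destroyed = 1;
    }
}

/* Replays the interleaving from the list above in program order.
 * Returns 1 if the object ends up destroyed while thread1 still
 * believes it owns the reference it got at creation time. */
static int replay_interleaving(void)
{
    struct toy_obj obj = { 1, 0 };  /* thread1: object created, ref = 1 */

    toy_ref(&obj);    /* thread1: ref taken for the scheduled event, ref = 2 */
    /* thread1: locks obj. scheduler: event fires, blocks on the lock. */
    toy_unref(&obj);  /* thread1: cancels the event and unrefs, ref = 1 */
    /* thread1: releases the lock. scheduler: callback acquires it. */
    toy_unref(&obj);  /* scheduler: callback drops the same ref, ref = 0 */

    printf("refcount=%d destroyed=%d\n", obj.refcount, obj.destroyed);
    return obj.destroyed;
}
```

Calling replay_interleaving() returns 1: the single reference taken for the
scheduled event is dropped twice, once by the cancelling thread and once by
the callback, so the reference thread1 thinks it still holds no longer
exists.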

Am I missing something obvious? Is there a defence mechanism I am not
seeing?

Thanks,
Steve
-- 
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-dev mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-dev
