Hi, I've been looking at ASTERISK-22079, and what might have caused it. The comment I added to the bottom of that ticket does not explain the segfault/backtrace that is attached to that ticket, so I have kept digging.
Fundamentally, I would like to know what stops a scheduled event from being cancelled while it is already running (e.g. if the running event is waiting on a lock)? The problem is that the thread that cancels the event and the running event are both likely to call an unref().

Even worse, I believe it is possible, as a result of the above, that while a thread is waiting on a lock, the data structure it is waiting for is unref'ed to the point of being destroyed, so the thread ends up holding a lock on a freed address.

To try and explain what I mean, - and -- mark steps taken by different threads (- is thread1, -- is the scheduler thread). The following is probably a worst-case scenario:

- thread1 started
-- scheduler thread started
- ao2_obj created (ref = 1)
- ast_sched_add() called with ref increment (ref = 2)
- locks ao2_obj (ref = 2, locked)
-- sched fires event, waiting on lock
- ast_sched_del_unref() called (ref = 1)
-- sched event gets lock (ref = 1, locked)
-- unrefs obj, unlocks and returns 0 to destroy sched. (ref = 0, destroyed)
- any operation on ao2_obj is now illegal.

Am I missing something obvious? Is there a defence mechanism I am not seeing?

Thanks,
Steve
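PS. In case it makes the sequence easier to follow, here is a rough single-threaded replay of the steps in C. It does not use the real astobj2 or scheduler code: fake_obj, fake_ref(), fake_unref() and replay_worst_case() are names made up purely for illustration, and the point at which thread1 releases the lock is an assumption, since the list of steps does not say where that happens.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an ao2 object: a counter and two flags. */
struct fake_obj {
    int refcount;
    bool locked;
    bool destroyed;
};

static void fake_ref(struct fake_obj *o)
{
    o->refcount++;
}

static void fake_unref(struct fake_obj *o)
{
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        o->destroyed = true;    /* a real object would be freed here */
    }
}

/*
 * Replays the interleaving one step at a time on a single thread.
 * Returns true if the object ends up destroyed even though thread1
 * never gave up the reference it got when it created the object.
 */
static bool replay_worst_case(struct fake_obj *o)
{
    o->refcount = 1;            /* thread1:   object created (ref = 1) */
    o->locked = false;
    o->destroyed = false;

    fake_ref(o);                /* thread1:   ref taken for the scheduled event (ref = 2) */
    o->locked = true;           /* thread1:   locks the object */
                                /* scheduler: event fires, callback blocks on the lock */
    fake_unref(o);              /* thread1:   cancels the event, drops the event's ref (ref = 1) */
    o->locked = false;          /* thread1:   releases the lock (assumed step) */

    o->locked = true;           /* scheduler: callback finally gets the lock */
    fake_unref(o);              /* scheduler: callback drops the event's ref again (ref = 0) */
    o->locked = false;          /* scheduler: unlocks, returns 0 */

    return o->destroyed;
}
```

Calling replay_worst_case() on a zeroed struct fake_obj returns true with refcount at 0: the event's single reference has been dropped twice, once by the cancelling thread and once by the callback, so the reference thread1 believes it still owns no longer protects the object.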
