Hi,

while porting chan_visdn 0.18.3 to OpenPBX, I noticed in the generator
code a number of race conditions, a deadlock with a non-working workaround,
basically turning it into a delayed race condition, a memory leak, and
an IMO somewhat strange design in the timing. The kindof-deadlock keeps
chan_visdn from working properly.

As I don't exactly know why things are constructed the way they are,
I don't have much of a clue what strategy would be appropriate for
resolving this, so let me just explain the problems I found:

1. The existence of opbx_generator_is_active() probably simply ought
   to be considered a bug. All current uses seem to introduce race
   conditions and I can't think of many use cases where this would not
   be the case. opbx_generator_deactivate() is idempotent anyway, so
   the use of opbx_generator_is_active() usually is simply superfluous,
   unless the race condition is the desired behaviour.

2. opbx_generator_deactivate() would have a potential deadlock if
   there wasn't the 1-second timeout to trying to synchronize to the
   completion of the deactivation of the generator, due to the fact
   that it can be called with the channel lock held (which is what
   chan_visdn does), which set_format() needs to acquire in the
   generator thread, too, before it can complete the deactivation.

   Now, on the one hand, simply continuing if the synchronization
   doesn't happen within a certain amount of time, obviously simply
   turns this into a race condition, which probably isn't much better
   than a deadlock. And on the other hand, it's not really acceptable
   that answering a visdn channel takes a whole second because of
   this.

   The "obvious" solution of moving the call to the generator's
   release function to opbx_generator_deactivate() and thus relying on
   the recursive mutexes doesn't work, as opbx_generator_deactivate()
   would have to wait for the generator to cease generation. While
   waiting, it obviously still could be holding the channel lock,
   which also could be tried to be acquired by the generator thread
   in opbx_write() before it would be able to notice that it was expected
   to stop, thus again resulting in a deadlock.

3. Well, I don't quite understand why the generator is timed by
   gettimeofday()!? After all, that clock certainly doesn't have any
   connection with the clock that governs playback, which would be some
   DAC's crystal or the PSTN's clock, and thus usually will deviate
   from it. With VoIP, it can be somewhat difficult to get hold of
   that DAC's clock, of course, but I'd at least have expected the
   timing to be done as late as possible, and thus in particular
   independent of the source, rather than at the very source of the
   signal.

   Also, the timing isn't even done according to gettimeofday() while
   switching generators.

4. Finally, one fix. Below you find a patch for a potential
   memory leak in opbx_generator_activate().

Florian

---------------------------------------------------------------------------
diff --git a/corelib/generator.c b/corelib/generator.c
index 3d7f467..1c0df3c 100644
--- a/corelib/generator.c
+++ b/corelib/generator.c
@@ -86,6 +86,7 @@ int opbx_generator_activate(struct opbx_channel *chan, struct 
opbx_generator *ge
 
                if(opbx_generator_start_thread(chan)) {
                        /* Whoops! */
+                       gen->release(gen_data);
                        opbx_mutex_unlock(&pgcd->lock);
                        opbx_log(LOG_ERROR, "Generator activation failed: 
unable to start generator thread\n");
                        return -1;
_______________________________________________
Openpbx-dev mailing list
[email protected]
http://lists.openpbx.org/mailman/listinfo/openpbx-dev

Reply via email to