I would like to confirm Daniel's description and restate the race 
condition I debugged:

Some of the openpbx code that calls the generator's deactivate function 
expects the deactivation to be complete when the function returns.
With the new generator thread model, this is no longer true.
Example: in the generic bridge code, the generator is deactivated and 
transcoding is activated. But because the generator runs in its own thread, 
the generator release happens afterwards and resets the read/write format to 
the original values. The format had already been set for transcoding, so it 
is now wrong.
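
To make the ordering concrete, here is a small stand-alone sketch of it. It
is not openpbx code: the struct, the format tags and all demo_* names are
made up for illustration, and the bridge_ready flag exists only to force the
unlucky interleaving every time (in the real code it depends on timing).

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical format tags, not the real openpbx format constants. */
enum { FMT_ORIGINAL = 1, FMT_GENERATOR = 2, FMT_TRANSCODE = 3 };

struct demo_chan {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int writeformat;     /* current write format of the channel */
    int saved_format;    /* format the generator restores on release */
    int stop_requested;  /* set by the asynchronous deactivate */
    int bridge_ready;    /* demo only: forces the bad interleaving */
};

/* Generator thread: its release step restores the saved format. */
static void *demo_gen_thread(void *arg)
{
    struct demo_chan *c = arg;

    pthread_mutex_lock(&c->lock);
    while (!c->stop_requested || !c->bridge_ready)
        pthread_cond_wait(&c->cond, &c->lock);
    /* Late release: overwrites whatever the bridge set in the meantime. */
    c->writeformat = c->saved_format;
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Asynchronous deactivate: only requests the stop and returns at once. */
static void demo_deactivate(struct demo_chan *c)
{
    pthread_mutex_lock(&c->lock);
    c->stop_requested = 1;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Returns the write format left on the channel after the bridge setup. */
int demo_bridge_setup(void)
{
    struct demo_chan c;
    pthread_t tid;
    int rc, result;

    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.cond, NULL);
    c.writeformat = FMT_GENERATOR;
    c.saved_format = FMT_ORIGINAL;
    c.stop_requested = 0;
    c.bridge_ready = 0;

    rc = pthread_create(&tid, NULL, demo_gen_thread, &c);
    assert(rc == 0);
    (void)rc;

    demo_deactivate(&c);            /* caller assumes the generator is gone */

    pthread_mutex_lock(&c.lock);
    c.writeformat = FMT_TRANSCODE;  /* bridge activates transcoding */
    c.bridge_ready = 1;
    pthread_cond_broadcast(&c.cond);
    pthread_mutex_unlock(&c.lock);

    pthread_join(tid, NULL);        /* the release runs only now */
    result = c.writeformat;

    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.lock);
    return result;
}
```

In this sketch demo_bridge_setup() ends with FMT_ORIGINAL instead of
FMT_TRANSCODE, which is the same kind of wrong format described above.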

At this time, I see only two possible fixes:

1) The generator deactivate function must be a synchronous call and return 
   only when the job is really finished.
   * This is what I did at first, but I removed it again because it produces 
     a deadlock: the function is called with the channel lock held most of 
     the time, but the generator thread also calls functions which lock the 
     channel. (The workaround patch I posted removed that, but it is not a 
     real fix.)
   Maybe the gen_free function, which is called by the generator thread on 
   deactivation, can be changed ...

2) The deactivation of the generator and its gen_free functions must not do 
   anything which might overwrite a setting made by the main thread in 
   the meantime ...
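
One possible shape for option 2, again only as a stand-alone sketch and not
as a patch: the channel carries a change counter for its format, and the
generator release restores the old format only if nobody has changed the
format since the generator set its own. The counter (format_seq), the struct
and all demo_* names are assumptions of mine; nothing like this exists in the
tree as far as I know. The bridge_ready flag is there only to make the
ordering repeatable.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical format tags, not the real openpbx format constants. */
enum { FMT_ORIGINAL = 1, FMT_GENERATOR = 2, FMT_TRANSCODE = 3 };

struct demo_chan2 {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int writeformat;      /* current write format of the channel */
    unsigned format_seq;  /* bumped on every format change */
    int saved_format;     /* format to restore on generator release */
    unsigned gen_seq;     /* format_seq right after the generator's change */
    int stop_requested;
    int bridge_ready;     /* demo only: fixes the ordering */
};

/* Every format change goes through here. Caller holds c->lock
 * (or is the only thread that can see c yet). */
static void demo_set_format(struct demo_chan2 *c, int fmt)
{
    c->writeformat = fmt;
    c->format_seq++;
}

/* Generator thread: restore only if the format is still the generator's. */
static void *demo_gen_thread2(void *arg)
{
    struct demo_chan2 *c = arg;

    pthread_mutex_lock(&c->lock);
    while (!c->stop_requested || !c->bridge_ready)
        pthread_cond_wait(&c->cond, &c->lock);
    if (c->format_seq == c->gen_seq)
        demo_set_format(c, c->saved_format);
    /* else: someone changed the format in the meantime, leave it alone */
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Returns the final write format. If bridge_changes_format is nonzero,
 * the caller sets up transcoding before the release gets to run. */
int demo_guarded_release(int bridge_changes_format)
{
    struct demo_chan2 c;
    pthread_t tid;
    int rc, result;

    pthread_mutex_init(&c.lock, NULL);
    pthread_cond_init(&c.cond, NULL);
    c.format_seq = 0;
    c.stop_requested = 0;
    c.bridge_ready = 0;
    c.saved_format = FMT_ORIGINAL;
    demo_set_format(&c, FMT_GENERATOR);  /* generator activation */
    c.gen_seq = c.format_seq;

    rc = pthread_create(&tid, NULL, demo_gen_thread2, &c);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&c.lock);
    c.stop_requested = 1;                /* asynchronous deactivate */
    if (bridge_changes_format)
        demo_set_format(&c, FMT_TRANSCODE);
    c.bridge_ready = 1;
    pthread_cond_broadcast(&c.cond);
    pthread_mutex_unlock(&c.lock);

    pthread_join(tid, NULL);
    result = c.writeformat;

    pthread_cond_destroy(&c.cond);
    pthread_mutex_destroy(&c.lock);
    return result;
}
```

With this guard, demo_guarded_release(1) keeps FMT_TRANSCODE, while
demo_guarded_release(0) still restores FMT_ORIGINAL as before. A counter is
used rather than comparing format values so that the release also stays away
when the main thread happens to set the same format the generator used.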

Armin

On Mon, 9 Jan 2006, Daniel Swarbrick wrote:
> Hi all,
> 
> I'm sure most of you are aware of the race condition problem that exists
> in the line generator threads that were introduced in svn-878. After a
> lot of bug squashing, the generator threads finally seemed to be stable,
> somewhere around svn-1180... _seemed_ to be stable. On my production
> system, I have a Cisco 1760 with two BRI WICs to handle PSTN, so from
> the point of view of OpenPBX, everything, including PSTN, is a SIP
> channel. I also run Cisco phones, so everything on my setup supports
> re-invite. No problems there.
> 
> Recently, on a second system, with an Eicon Diva 4-port BRI, I
> discovered that transferring an inbound CAPI call from one SIP phone to
> another resulted in garbled sound in one direction after the call had
> been handed over. Armin Schindler reported a similar problem when the
> PBX had to transcode, and put forward the theory that it was a race
> condition in the generator deactivate function.
> 
> This morning I investigated the problem a bit further, and found that a
> 'show channel CAPI/xxxx' revealed a read format of alaw, but a write
> format of slin, just after the call had been transferred. This seemed to
> confirm Armin's theory that the generator threads were being deactivated
> in the wrong order, and clobbering the write format of the channel. It
> seems that chan_capi is expecting alaw, but being fed slin - resulting
> in garbled sound. Incidentally, a 'show channel SIP/xxxx' segfaulted,
> even if executed prior to the transfer.
> 
> Using the patch that Armin wrote, which inserts a short delay in the
> generator deactivate, this problem disappears, however it introduces
> channel deadlocks under a variety of circumstances. It also seems to
> prevent MoH for the calling party on hold, during the transfer. Music on
> hold complains that it can't find any suitable format, even though I
> know it works if I manually set up an extension that directly calls
> MusicOnHold(). This symptom seems to be related to the channel write
> format not being set correctly.
> 
> How many others have encountered these problems? I realise relatively
> few have taken the plunge and run OpenPBX on a production system. So far
> I've had pretty good results on my primary production system (with the
> Cisco 1760 handling PSTN), but clearly the current code has just a few
> too many critical bugs to run it on my CAPI system. According to Armin,
> these problems should occur anytime there is transcoding being done, so
> I would expect sites that run cheaper SIP phones (that don't support
> re-invite) should be experiencing the problem.
> 
> Multi-threaded stuff goes a bit over my head, but if someone is ready to
> tackle this one (even Carlos perhaps, who committed the generators to
> trunk), I'm willing to assist in testing against all the various
> scenarios we've found.
> 
> _______________________________________________
> Openpbx-dev mailing list
> [email protected]
> http://lists.openpbx.org/mailman/listinfo/openpbx-dev
> 