> On Dec. 16, 2013, 8:26 p.m., rmudgett wrote:
> > /branches/12/main/channel.c, lines 10399-10400
> > <https://reviewboard.asterisk.org/r/3069/diff/1/?file=49555#file49555line10399>
> >
> >     One thing that ast_channel_masquerade() did that is not being done 
> > anymore is queuing null frames to wake up threads before the masquerade 
> > takes place.
> >     
> >     Maybe do_channel_masquerade() needs to poke both channels to wake up 
> > their respective caretaker threads instead.  Those threads could be in the 
> > ast_read/ast_write functions.
> 
> Mark Michelson wrote:
>     The old ast_channel_masquerade() would queue null frames, but I 
> interpreted that as being done so that when the thread servicing the channel 
> woke up, the subsequent ast_read() call would end up performing the 
> masquerade. Since we set up and perform the masquerade all at once now, 
> queuing a null frame at the time of marking the channels as being involved in 
> a masquerade seems like the wrong move.
>     
>     I *think* that signaling blocking threads is actually taken care of 
> already.
>     
>     The clonechan has a null frame queued onto it when it gets marked as a 
> zombie.
>     The original channel has a SIGURG sent to its blocking thread shortly 
> after.
>     
>     Am I correct that this should result in any blocking threads being awoken 
> correctly?
> 
> rmudgett wrote:
>     I was hoping you would know with more certainty than I since the channel 
> fds are swapped around. :)

Yes, it should.


- Joshua


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviewboard.asterisk.org/r/3069/#review10423
-----------------------------------------------------------


On Dec. 12, 2013, 9:59 p.m., Mark Michelson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviewboard.asterisk.org/r/3069/
> -----------------------------------------------------------
> 
> (Updated Dec. 12, 2013, 9:59 p.m.)
> 
> 
> Review request for Asterisk Developers, Joshua Colp, Jonathan Rose, Matt 
> Jordan, and rmudgett.
> 
> 
> Bugs: ASTERISK-22936
>     https://issues.asterisk.org/jira/browse/ASTERISK-22936
> 
> 
> Repository: Asterisk
> 
> 
> Description
> -------
> 
> This is more-or-less a take 2 of https://reviewboard.asterisk.org/r/3042
> 
> Initially, it was discovered that performing an attended transfer of a 
> multiparty bridge with a PJSIP channel would cause a deadlock. A PBX thread 
> started a masquerade and reached the point where it was calling the fixup() 
> callback on the "original" channel. For chan_pjsip, this involves pushing a 
> synchronous task to the session's serializer. The problem was that a task 
> ahead of the fixup task was also attempting to perform a channel masquerade. 
> However, since masquerades are designed in a way to only allow for one to 
> occur at a time, the task ahead of the fixup could not continue until the 
> masquerade already in progress had completed. And of course, the masquerade 
> in progress could not complete until the task ahead of the fixup task had 
> completed. Deadlock.
> 
> The initial fix was to change the fixup task to be asynchronous. While this 
> prevented the deadlock from occurring, it had the frightful side effect of 
> potentially allowing for tasks in the session's serializer to operate on a 
> zombie channel.
> 
> Taking a step back from this particular deadlock, it became clear that the 
> problem was not really this one particular issue but that masquerades 
> themselves needed to be addressed. A PJSIP attended transfer operation calls 
> ast_channel_move(), which attempts to both set up and execute a masquerade. 
> The problem was that after it had set up the masquerade, the PBX thread had 
> swooped in and tried to actually perform the masquerade. Looking at changes 
> that had been made to Asterisk 12, it became clear that there never is any 
> time now that anyone ever wants to set up a masquerade and allow for the 
> channel thread to actually perform the masquerade. Everyone always is calling 
> ast_channel_move(), performs the masquerade itself before returning.
> 
> In this patch, I have removed all blocks of code from channel.c that will 
> attempt to perform a masquerade if ast_channel_masq() returns true. Now, 
> there is no distinction between setting up a masquerade and performing the 
> masquerade. It is one operation. The only remaining checks for 
> ast_channel_masq() and ast_channel_masqr() are in ast_hangup() since we do 
> not want to interrupt a masquerade by hanging up the channel. Instead, now 
> ast_hangup() will wait for a masquerade to complete before moving forward 
> with its operation.
> 
> The ast_channel_move() function has been modified to basically in-line the 
> logic that used to be in ast_channel_masquerade(). ast_channel_masquerade() 
> has been killed off for real. ast_channel_move() now has a lock associated 
> with it that is used to prevent any simultaneous moves from occurring at 
> once. This means there is no need to make sure that ast_channel_masq() or 
> ast_channel_masqr() are already set on a channel when ast_channel_move() is 
> called. It also means the channel container lock is not pulling double duty 
> by both keeping the container locked and preventing multiple masquerades from 
> occurring simultaneously.
> 
> The ast_do_masquerade() function has been renamed to do_channel_masquerade() 
> and is now internal to channel.c. The function now takes explicit arguments 
> of which channels are involved in the masquerade instead of a single channel. 
> While it probably is possible to do some further refactoring of this method, 
> I feel that I would be treading dangerously. Instead, all I did was change 
> some comments that no longer are true after this changeset.
> 
> The other more minor change introduced in this patch is to res_pjsip.c to 
> make ast_sip_push_task_synchronous() run the task in-place if we are already 
> a SIP servant thread. This is related to this patch because even when we 
> isolate the channel masquerade to only running in the SIP servant thread, we 
> would still deadlock when the fixup() callback is reached since we would 
> essentially be waiting forever for ourselves to finish before actually 
> running the fixup. This makes it so the fixup is run without having to push a 
> task into a serializer at all.
> 
> 
> Diffs
> -----
> 
>   /branches/12/res/res_pjsip.c 403349 
>   /branches/12/main/channel.c 403343 
>   /branches/12/include/asterisk/channel.h 403343 
> 
> Diff: https://reviewboard.asterisk.org/r/3069/diff/
> 
> 
> Testing
> -------
> 
> Jonathan ran the same test he had originally run in order to make the 
> deadlock occur, and with this patch it no longer occurs. I also ran a bevy of 
> testsuite tests to see if there were any that did not pass with this patch 
> but did pass without the patch. I found none on my machine, though I'll admit 
> I did not run every test.
> 
> 
> Thanks,
> 
> Mark Michelson
> 
>

-- 
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-dev mailing list
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-dev

Reply via email to