Re: [asterisk-dev] bridge_unreal: An alternative approach to Local/Unreal channel optimization

2014-03-11 Thread Matthew Jordan
On Mon, Mar 10, 2014 at 7:27 AM, Matthew Jordan mjor...@digium.com wrote:

 On Mon, Mar 10, 2014 at 6:59 AM, Joshua Colp jc...@digium.com wrote:

 Matthew Jordan wrote:

snip

 The NLB compatibility code actually checks whether something like a
 MixMonitor is on either Local channel and won't allow it to be used.

 Now that I've given a diagram to show where things optimize and how it
 isn't inside of chan_local... what do you think NOW? ;)


 I think this proposal is tantamount to killing Local channel optimization.

 I'm not sure that's a bad thing, but I'd certainly like to get more
 opinions.


Updating this thread with some more thoughts. These are a bit random, but
hopefully they'll spark some conversation about possibilities here:

 * We probably can't get rid of Local channel optimization. While it is
ugly - and prone to causing strange things to happen both in the core and
from the perspective of an external user - there's at least one use case
that needs this feature: collapsing two RTP capable channels into a native
bridge. For example, assume we have the following:

SIP/foo --[B0]-- Local;1 <-> Local;2 --[B1]-- SIP/bar

Here, B0 and B1 would currently be simple two party bridges with the media
flowing through the core. If feature requirements meant that SIP/foo and
SIP/bar could not be natively bridged - even if they were directly in a
bridge together - then optimizing this scenario doesn't buy much
performance. If, however, optimizing away the Local channel would result in
the bridge between SIP/foo and SIP/bar being a native bridge, then the
performance gain is significant.

* There's lots of strange edge cases with Local channels. Consider, for
example, some of the following scenarios (all of which are possible in 12):

** Local channel between Real channel and multi-party bridge with Real
channels. In this case, optimization should result in the Real/A channel
being pushed into bridge B1.

                                             /-- Real
Real/A --[B0]-- Local;1 <-> Local;2 --[B1]--+--- Real
                                             \-- Real

** Local channel between Local channel and a multi-party bridge with Real
channels. Here, there's a possible race condition between the Local
channels: we don't know for sure what is on the other end of the Local/A;2
channel. Finding out is also a bad idea - a whole lot of things would have
to be locked in order to get that information. What's more, there may be
another Local channel beyond Local/A! This is where Josh's proposal comes
into play, as the information is passed down the chain - making it so that
optimizations don't have to occur in Local channel chains. At the same
time, we may want to try and optimize away the Local channel between the
multi-party bridge of Real channels and the other Local channel. Assuming
Local/A doesn't win in an optimization race, we'd want Local/A to take the
place of the existing Local channel - but we have to prevent it from
optimizing away at the same time.

                                                /-- Real
Local/A;2 --[B0]-- Local;1 <-> Local;2 --[B1]--+--- Real
                                                \-- Real

** Local channel between two multi-party bridges. Here, there's really two
ways to handle this: either don't optimize away, or merge both bridges
together into one massive multi-party bridge.

Real --\                                         /-- Real
Real ---+--[B0]-- Local;1 <-> Local;2 --[B1]--+--- Real
Real --/                                         \-- Real

** Two Local channels optimizing into a multi-party bridge. Both our Local
channel - as well as Local/B - may attempt to optimize the channels on the
other ends into B1 at the same time. The bridge has to carefully manage
this process.

                                             /-- Real
Real/A --[B0]-- Local;1 <-> Local;2 --[B1]--+--- Local/B
                                             \-- Real

All of these scenarios are currently handled by core_unreal and core_local
in some fashion. It is, however, very complex code that - particularly with
Local channel chains - is prone to error. The implementation today faces
three problems:
(1) Knowledge of what is on the other side of the bridge is known by the
bridge, but not by either Local channel half. In order to get that
knowledge, both Local channel halves must take control of the bridge (and
all of its participants), then synchronize with each other.
(2) When multiple Local channels can optimize in a chain, they have to
communicate with each other (or at least compete with each other) to see
who optimizes out first. This can change the information that a Local
channel has about how it can optimize: for example, a Local channel may
view that it is in a two party bridge with another Local channel, attempt
to optimize, only to find out later that it is now in a multi-party bridge
with multiple Real channels.
(3) When optimization occurs, there can be *no* information in flight on
the Local channel. This is particularly difficult as the write queue
exists on the ast_channel struct - which means that the bridging layer
has to be informed to not write to the channel when the optimization
occurs. Again, more points of synchronization and locking.

Re: [asterisk-dev] bridge_unreal: An alternative approach to Local/Unreal channel optimization

2014-03-11 Thread Joshua Colp

Matthew Jordan wrote:

snip



All of these scenarios are currently handled by core_unreal and
core_local in some fashion. It is, however, very complex code that -
particularly with Local channel chains - is prone to error. The
implementation today faces three problems:
(1) Knowledge of what is on the other side of the bridge is known by the
bridge, but not by either Local channel half. In order to get that
knowledge, both Local channel halves must take control of the bridge
(and all of its participants), then synchronize with each other.
(2) When multiple Local channels can optimize in a chain, they have to
communicate with each other (or at least compete with each other) to see
who optimizes out first. This can change the information that a Local
channel has about how it can optimize: for example, a Local channel may
view that it is in a two party bridge with another Local channel,
attempt to optimize, only to find out later that it is now in a
multi-party bridge with multiple Real channels.
(3) When optimization occurs, there can be *no* information in flight on
the Local channel. This is particularly difficult as the write queue
exists on the ast_channel struct - which means that the bridging layer
has to be informed to not write to the channel when the optimization
occurs. Again, more points of synchronization and locking.

There's a few possible approaches that may simplify the implementation:

  * Use approaches such as Josh's native Local bridge to move logic out
of core_unreal and core_local into bridge implementations. The bridges
actually have state now, and *know* who is in the bridge with them. A
bridge implementation could be written that handles a Local channel +
one other channel, and it could tell the Local channel when it can optimize.


I ended up toying with a prototype[1] last night which does Local 
channel optimization using this approach.


It implements a native bridge technology which requires at least one
Local channel to be present in the bridge. Once two channels have joined,
it stores the bridge and peer channel on the shared structure of each
Local channel in the bridge. If the shared structure contains information
about both sides of the Local channel, it queues up a task with all of
the bridges/channels to optimize. The task is executed in a serialized
fashion using a taskprocessor and moves the respective channels around.
If there is a chain of Local channels involved, then multiple tasks are
queued. Some may fail due to actions taken before they are executed, but
another task will have already been queued to optimize once again. This
happens until the entire chain is collapsed.
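
The queue-and-retry behaviour described above can be sketched roughly as
follows. This is a toy Python model, not the prototype's code: the class
ChainOptimizer, its methods, and the use of a plain deque in place of a
real taskprocessor are all assumptions made for illustration. It only
models two-party bridges and ignores locking entirely.

```python
from collections import deque

class ChainOptimizer:
    """Toy model: two-party bridges plus a serialized queue of optimize tasks."""

    def __init__(self):
        self.bridges = {}     # bridge name -> list of channel names in it
        self.shared = {}      # Local name -> {"1": (bridge, peer), "2": (bridge, peer)}
        self.tasks = deque()  # stand-in for the taskprocessor: one task at a time
        self.log = []

    @staticmethod
    def split_local(chan):
        # "Local-02;1" -> ("Local-02", "1"); any other channel -> None
        if chan.startswith("Local") and ";" in chan:
            return tuple(chan.split(";"))
        return None

    def join(self, bridge, chan):
        members = self.bridges.setdefault(bridge, [])
        members.append(chan)
        if len(members) == 2:
            self._record(bridge)

    def _record(self, bridge):
        first, second = self.bridges[bridge]
        for chan, peer in ((first, second), (second, first)):
            local = self.split_local(chan)
            if local is None:
                continue
            name, half = local
            sides = self.shared.setdefault(name, {})
            sides[half] = (bridge, peer)
            if len(sides) == 2:
                # Both halves are bridged: queue a task carrying a snapshot.
                self.tasks.append((name, dict(sides)))

    def run(self):
        while self.tasks:
            name, snapshot = self.tasks.popleft()
            self._optimize(name, snapshot)

    def _optimize(self, name, snapshot):
        (bridge1, peer1), (bridge2, peer2) = snapshot["1"], snapshot["2"]
        expected1 = sorted([name + ";1", peer1])
        expected2 = sorted([name + ";2", peer2])
        if (sorted(self.bridges.get(bridge1, [])) != expected1
                or sorted(self.bridges.get(bridge2, [])) != expected2):
            self.log.append(("failed", name))  # snapshot went stale
            return
        # Put the two peers together in bridge1; the Local pair and bridge2 go away.
        del self.bridges[bridge2]
        del self.shared[name]
        self.bridges[bridge1] = []
        self.log.append(("optimized", name))
        self.join(bridge1, peer1)
        self.join(bridge1, peer2)

opt = ChainOptimizer()
for bridge, chan in [("B1", "Real-01"), ("B1", "Local-02;1"),
                     ("B2", "Local-02;2"), ("B2", "Local-03;1"),
                     ("B3", "Local-03;2"), ("B3", "Real-02")]:
    opt.join(bridge, chan)
opt.run()
print(opt.bridges)  # {'B1': ['Real-01', 'Real-02']}
print(opt.log)
```

In this run the task queued for Local-03 before Local-02 was optimized
fails its staleness check, but a fresh task for Local-03 has already been
queued by the re-join, so the chain still collapses to a single bridge.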


[1] http://svn.digium.com/svn/asterisk/team/file/bridge_unreal_optimizer/

--
Joshua Colp
Digium, Inc. | Senior Software Developer
445 Jan Davis Drive NW - Huntsville, AL 35806 - US
Check us out at: www.digium.com  www.asterisk.org

--
_
-- Bandwidth and Colocation Provided by http://www.api-digital.com --

asterisk-dev mailing list
To UNSUBSCRIBE or update options visit:
  http://lists.digium.com/mailman/listinfo/asterisk-dev


Re: [asterisk-dev] bridge_unreal: An alternative approach to Local/Unreal channel optimization

2014-03-11 Thread Corey Farrell
My one concern is that if we stop optimizing Local channels and allow the
ast_channel to live for the duration of the call, this could significantly
increase open FDs.  This would be a bigger issue for systems using
res_timing_timerfd, since that causes alert pipes to be created.



Re: [asterisk-dev] bridge_unreal: An alternative approach to Local/Unreal channel optimization

2014-03-10 Thread Joshua Colp

Matthew Jordan wrote:




snip



It's important to point out that optimization's goal was never the
removal of the channel. If anything, nuking Local channels has - in
my opinion - always made life more difficult for everyone, not
easier.

The goal was performance - minimize the frame path. If I'm picturing
 this correctly, this doesn't *quite* optimize as efficiently as
completely removing the Local channels - but it may still be
sufficient.

Real-01 --[B0]-- Local-02;1 --[NLB]-- Local-02;2 --[B1]-- Real-03
                 (-> Real-03)         (-> Real-01)

In this case - and this is assuming I understand the proposed Native
 Local Bridge correctly! - Local-02;1 has as its actual destination
target Real-03, while Local-02;2 has as its actual destination
Real-01. When B0 pushes a frame to Local-02;1, Local-02;1 knows that
it should just pass it on to its destination. Rather than passing to
its bridge, it writes directly to Real-03. The same happens in
reverse for Real-03 to Local-02;2.


As it is right now, this approach can't optimize the bridged scenario above,
but I'm not sure that's a bad thing. While having a goal of optimizing
things as much as possible is good, in this scenario that's costing us a
lot of complex code (with issues) and also requiring outside consumers
to understand what can happen. I'd personally like to see Local channels
become a connection between things instead of channels that can
transmogrify, morph, and disappear. It makes both of our lives easier.
To outside consumers, these channels have the same semantics as other
channels but are implemented as a connection, and a single event will be
produced which shows you how they are connected.


As for your diagram, no NLB would be present there because a bridge does
not exist within chan_local. Making it use the bridging framework would
require rewriting it, as it cannot work within the confines of what
bridging requires (i.e. you can't have a channel doing two things at once).


Where the NLB would be in use is this:

Real-01 (B1) - Local-02;1 - Local-02;2 (B2) - Local-03;1 - 
Local-03;2 (B3) - Real-02


In this case B2 would be an NLB and optimize things so media coming from 
Real-01 would be queued onto Local-03;2 for reading by B3 and media 
coming from Real-02 would be queued onto Local-02;1 for reading by B1. 
This bypasses Local-02;2 and Local-03;1 in the middle.


This works no matter what each far end is doing.

The reason optimizing your example is hard is because frames have to 
come from a channel within the bridge and pass through it.
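
The B2 case above, where the NLB lets frames skip Local-02;2 and
Local-03;1, can be sketched roughly like this. It is a toy Python model
rather than anything from the prototype: LocalHalf, make_local, nlb_join
and the redirect attribute are hypothetical names chosen for the example.

```python
class LocalHalf:
    def __init__(self, name):
        self.name = name
        self.other = None      # opposite half of the same Local pair
        self.redirect = None   # set by the NLB: far half to queue onto instead
        self.read_queue = []   # frames waiting to be read by this half's bridge

    def write(self, frame):
        # Normally a write lands on the opposite half; with a redirect in
        # place it lands on the far half directly, skipping the middle.
        target = self.redirect or self.other
        target.read_queue.append(frame)
        return target.name

def make_local(name):
    one, two = LocalHalf(name + ";1"), LocalHalf(name + ";2")
    one.other, two.other = two, one
    return one, two

def nlb_join(inner_a, inner_b):
    # Hypothetical NLB setup for a bridge holding two Local halves:
    # point the outer half of each pair at the outer half of the other.
    inner_a.other.redirect = inner_b.other
    inner_b.other.redirect = inner_a.other

l2_1, l2_2 = make_local("Local-02")
l3_1, l3_2 = make_local("Local-03")

before = l2_1.write("frame-0")           # lands on Local-02;2, read by B2
nlb_join(l2_2, l3_1)                     # B2 becomes an NLB
forward = l2_1.write("from Real-01")     # lands on Local-03;2, read by B3
backward = l3_2.write("from Real-02")    # lands on Local-02;1, read by B1
print(before, forward, backward)
```

Note that the two outer bridges (B1 and B3) are untouched in this model,
which matches the point that it works no matter what each far end is
doing.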




Creating a chain of these works by the real 'endpoints' getting
passed down the chain of Local channels via control frames.

There's two issues I can see with this - one minor, one maybe not.
(1) There's a small amount of work here that occurs by the Local
channel passing the frame on to its destination channel. It's minor,
but it would be slightly more work than what occurs during today's
optimization. (2) More seriously: I wonder if the destination
shouldn't be a channel but a bridge. The above optimization cannot
work for multi-party bridges: there is no single channel destination.
Today's optimization does work in that scenario via a bridge swap -
the single party on one end gets swapped with the Local channel in
the multi-party bridge. This really is a minor case - the idea of
optimizing channels into multi-party bridges is admittedly
ridiculously new - but it may be useful to think through this use
case.


Yes, as it is right now this doesn't optimize as much as the code that 
currently exists. I can say though that now when any video and audio 
frame go through a Local channel they no longer attempt to optimize out. 
(Yes, every 20ms pretty much the code attempts to do the optimization).


snip



I would think you'd need it if you had a hook that needed the audio
on that Local channel - such as a MixMonitor.


The NLB compatibility code actually checks whether something like a 
MixMonitor is on either Local channel and won't allow it to be used.


Now that I've given a diagram to show where things optimize and how it 
isn't inside of chan_local... what do you think NOW? ;)


--
Joshua Colp
Digium, Inc. | Senior Software Developer
445 Jan Davis Drive NW - Huntsville, AL 35806 - US
Check us out at: www.digium.com  www.asterisk.org



Re: [asterisk-dev] bridge_unreal: An alternative approach to Local/Unreal channel optimization

2014-03-10 Thread Matthew Jordan
On Mon, Mar 10, 2014 at 6:59 AM, Joshua Colp jc...@digium.com wrote:

 Matthew Jordan wrote:



 snip



 It's important to point out that optimization's goal was never the
 removal of the channel. If anything, nuking Local channels has - in
 my opinion - always made life more difficult for everyone, not
 easier.

 The goal was performance - minimize the frame path. If I'm picturing
  this correctly, this doesn't *quite* optimize as efficiently as
 completely removing the Local channels - but it may still be
 sufficient.

 Real-01 --[B0]-- Local-02;1 --[NLB]-- Local-02;2 --[B1]-- Real-03
                  (-> Real-03)         (-> Real-01)

 In this case - and this is assuming I understand the proposed Native
  Local Bridge correctly! - Local-02;1 has as its actual destination
 target Real-03, while Local-02;2 has as its actual destination
 Real-01. When B0 pushes a frame to Local-02;1, Local-02;1 knows that
 it should just pass it on to its destination. Rather than passing to
 its bridge, it writes directly to Real-03. The same happens in
 reverse for Real-03 to Local-02;2.


 As it is right now, this approach can't optimize the bridged scenario above,
 but I'm not sure that's a bad thing. While having a goal of optimizing
 things as much as possible is good, in this scenario that's costing us a lot
 of complex code (with issues) and also requiring outside consumers to
 understand what can happen. I'd personally like to see Local channels
 become a connection between things instead of channels that can
 transmogrify, morph, and disappear. It makes both of our lives easier. To
 outside consumers, these channels have the same semantics as other channels
 but are implemented as a connection, and a single event will be produced
 which shows you how they are connected.


I don't disagree with that.

Optimization of Local channels is a complexity that is hard for us, and
hard for users of Asterisk. It improves performance - but I'm not sure how
much we improve it by is actually worth that pain.

When it was first written, people cared less about the guts of Asterisk,
and operations internally were generally simpler. If this complexity is
going to stick around, we have to find a way to manage it appropriately.



 As for your diagram, no NLB would be present there because a bridge does
 not exist within chan_local. Making it use the bridging framework would
 require rewriting it, as it cannot work within the confines of what
 bridging requires (i.e. you can't have a channel doing two things at once).

 Where the NLB would be in use is this:

 Real-01 (B1) - Local-02;1 - Local-02;2 (B2) - Local-03;1 -
 Local-03;2 (B3) - Real-02

 In this case B2 would be an NLB and optimize things so media coming from
 Real-01 would be queued onto Local-03;2 for reading by B3 and media coming
 from Real-02 would be queued onto Local-02;1 for reading by B1. This
 bypasses Local-02;2 and Local-03;1 in the middle.

 This works no matter what each far end is doing.

 The reason optimizing your example is hard is because frames have to come
 from a channel within the bridge and pass through it.


Well, there are both good and bad things about this.

On the good side, AMI events that exist today won't change. Fewer breaking
changes is good. Optimization begin/end events wouldn't occur, but the
semantics of Local channels otherwise stay the same.

On the bad side, the simplest case is now the case that receives no
improvements. You only get benefit from the Native Local Bridge if you have
chains of Local channels - which I would imagine to be relatively rare in
practice.




 Creating a chain of these works by the real 'endpoints' getting
 passed down the chain of Local channels via control frames.

 There's two issues I can see with this - one minor, one maybe not.
 (1) There's a small amount of work here that occurs by the Local
 channel passing the frame on to its destination channel. It's minor,
 but it would be slightly more work than what occurs during today's
 optimization. (2) More seriously: I wonder if the destination
 shouldn't be a channel but a bridge. The above optimization cannot
 work for multi-party bridges: there is no single channel destination.
 Today's optimization does work in that scenario via a bridge swap -
 the single party on one end gets swapped with the Local channel in
 the multi-party bridge. This really is a minor case - the idea of
 optimizing channels into multi-party bridges is admittedly
 ridiculously new - but it may be useful to think through this use
 case.


 Yes, as it is right now this doesn't optimize as much as the code that
 currently exists. I can say though that now when any video and audio frame
 go through a Local channel they no longer attempt to optimize out. (Yes,
 every 20ms pretty much the code attempts to do the optimization).


Which, if it can't optimize, is a bunch of needless work.

Trade-offs!



 

Re: [asterisk-dev] bridge_unreal: An alternative approach to Local/Unreal channel optimization

2014-03-09 Thread Matthew Jordan
On Sat, Mar 8, 2014 at 1:19 PM, Joshua Colp jc...@digium.com wrote:
 Greetings everyone on this glorious weekend!

 I've had an idea bouncing around my head for the past many months on an
 alternative approach for optimizing Local/Unreal channels. This morning
 everything finally clicked and I put it together[1] (I'm still working on
 it/tweaking it, but it DOES work).

 The traditional approach has been to collapse the chain of Local channels
 down until you are left with the minimum amount required. Unfortunately
 this can be rather complex and error prone as you need to go through the
 entire chain and then figure out the best way to accomplish this (keeping
 in mind juggling multiple locks and potentially multiple bridges). You
 also end up needing to give information when this happens so consumers
 know what is going on.

In any of the code bases, this is a difficult and complex thing to do. In
12, while I'm not sure we made the problem worse, we certainly didn't make
it any better.

In order to optimize, each Local channel half has to first determine if
they even can optimize. If they are in a bridge with multiple participants,
there are ways in which they can - either by a bridge merge or a bridge
swap. (Merge puts two multi-party bridges together; swap moves a single
participant into another bridge (single or multi)). If they can, they then
have to synchronize with the other half, lock both bridges that the halves
are in - including all of the participants (via the bridge lock) - then
move a lot of channels around.
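
As a rough illustration of the swap/merge distinction above, here is a
small Python sketch. The function optimize_local and the dict-of-lists
bridge model are invented for the example; the real code also has to
synchronize the two halves and hold both bridge locks, which this sketch
leaves out completely.

```python
def optimize_local(bridges, bridge1, half1, bridge2, half2):
    """Remove a Local pair joining bridge1 and bridge2.

    bridges maps a bridge name to its list of channel names. Returns
    "swap" or "merge" depending on which operation was used.
    """
    rest1 = [c for c in bridges[bridge1] if c != half1]
    rest2 = [c for c in bridges[bridge2] if c != half2]
    if len(rest1) == 1:
        # Swap: the single participant of bridge1 takes half2's place.
        idx = bridges[bridge2].index(half2)
        bridges[bridge2][idx] = rest1[0]
        del bridges[bridge1]
        return "swap"
    if len(rest2) == 1:
        # Swap in the other direction.
        idx = bridges[bridge1].index(half1)
        bridges[bridge1][idx] = rest2[0]
        del bridges[bridge2]
        return "swap"
    # Merge: both sides are multi-party, so fold bridge2 into bridge1.
    bridges[bridge1] = rest1 + rest2
    del bridges[bridge2]
    return "merge"

swap_case = {"B0": ["Real/A", "Local;1"],
             "B1": ["Local;2", "Real/1", "Real/2"]}
swap_result = optimize_local(swap_case, "B0", "Local;1", "B1", "Local;2")

merge_case = {"B0": ["Real/1", "Real/2", "Local;1"],
              "B1": ["Local;2", "Real/3", "Real/4"]}
merge_result = optimize_local(merge_case, "B0", "Local;1", "B1", "Local;2")
print(swap_result, swap_case)
print(merge_result, merge_case)
```

Trying the swap before the merge is a choice made for this sketch; it
keeps the common two-party case from disturbing a multi-party bridge more
than necessary.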

To date, the Local channel optimization test - which collapses 150 Local
channels - is the number one failing test in the test suite. Weird timing
errors cause weird errors. While I'm confident we'll get to the bottom of
all the edge cases, it is very, very, very complex.

We eliminated the vast majority of masquerades - but this particular
operation is, in many ways, just as nasty.

 The bridge_unreal approach doesn't do this. It aims to optimize the path
 for frames traveling through the chain, allowing them to skip intermediary
 hops where they don't need to go through. This results in a very similar
 situation for the frames but does not move/change/alter/hangup the
 intermediary channels involved.

It's important to point out that optimization's goal was never the removal
of the channel. If anything, nuking Local channels has - in my opinion -
always made life more difficult for everyone, not easier.

The goal was performance - minimize the frame path. If I'm picturing this
correctly, this doesn't *quite* optimize as efficiently as completely
removing the Local channels - but it may still be sufficient.

Real-01 --[B0]-- Local-02;1 --[NLB]-- Local-02;2 --[B1]-- Real-03
                 (-> Real-03)         (-> Real-01)

In this case - and this is assuming I understand the proposed Native Local
Bridge correctly! - Local-02;1 has as its actual destination target
Real-03, while Local-02;2 has as its actual destination Real-01. When B0
pushes a frame to Local-02;1, Local-02;1 knows that it should just pass it
on to its destination. Rather than passing to its bridge, it writes
directly to Real-03. The same happens in reverse for Real-03 to Local-02;2.

Creating a chain of these works by the real 'endpoints' getting passed down
the chain of Local channels via control frames.

There's two issues I can see with this - one minor, one maybe not.
(1) There's a small amount of work here that occurs by the Local channel
passing the frame on to its destination channel. It's minor, but it would
be slightly more work than what occurs during today's optimization.
(2) More seriously: I wonder if the destination shouldn't be a channel but
a bridge. The above optimization cannot work for multi-party bridges: there
is no single channel destination. Today's optimization does work in that
scenario via a bridge swap - the single party on one end gets swapped with
the Local channel in the multi-party bridge. This really is a minor case -
the idea of optimizing channels into multi-party bridges is admittedly
ridiculously new - but it may be useful to think through this use case.

 It does this by passing each far end channel through the entire chain with
 each intermediary hop storing them and the next hop in the chain examining
 and forwarding them on over and over. Once this completes each end has the
 channel that is at the far end and is able to queue frames onto it
 directly, bypassing the intermediary hops. This happens over time (less
 than a second, I'm not talking minutes here) but leads to eventual
 optimization. Even in a compromised optimized state frames will still flow
 as expected.

 This also works perfectly fine when a hop uses /n and wishes to remain in
 the path of frames. Each side of that hop will optimize themselves and
 skip any intermediary hops.
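
The hop-by-hop propagation described in the quoted text can be modelled
roughly as below. This is a simplified Python sketch under several
assumptions: each Local pair is treated as a single hop, one forwarding
step happens per round, and the function propagate and its pinned
argument (standing in for hops using /n) are made up for illustration.

```python
def propagate(chain, pinned=()):
    """Pass far-end channels along a chain until both ends know theirs.

    chain is an ordered list of hop names whose first and last entries are
    the real endpoints. Returns (far end learned by the last hop, far end
    learned by the first hop, number of rounds taken).
    """
    n = len(chain)
    seen_from_left = [None] * n    # what each hop has stored, looking left
    seen_from_right = [None] * n   # what each hop has stored, looking right
    rounds = 0
    while seen_from_left[-1] is None or seen_from_right[0] is None:
        prev_l, prev_r = seen_from_left[:], seen_from_right[:]
        for i in range(n - 1):
            # Left to right: an endpoint or pinned hop announces itself;
            # any other hop forwards what it stored in an earlier round.
            out = chain[i] if (i == 0 or chain[i] in pinned) else prev_l[i]
            if out is not None:
                seen_from_left[i + 1] = out
            # Right to left, mirrored.
            j = n - 1 - i
            out = chain[j] if (j == n - 1 or chain[j] in pinned) else prev_r[j]
            if out is not None:
                seen_from_right[j - 1] = out
        rounds += 1
    return seen_from_left[-1], seen_from_right[0], rounds

chain = ["Real-01", "Local-02", "Local-03", "Real-02"]
plain = propagate(chain)
with_pinned = propagate(chain, pinned={"Local-03"})
print(plain)        # ('Real-01', 'Real-02', 3)
print(with_pinned)  # ('Local-03', 'Local-03', 2)
```

In the second call the pinned hop stays in the frame path: both real
endpoints end up targeting Local-03 rather than each other, while the
hops in between are still skipped.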