Re: [sipX-dev] Multi-branch identity and routing

Robert Joly Wed, 17 Jun 2009 08:23:35 -0700

 

> -----Original Message-----
> From: Lawrence, Scott (BL60:9D30) 
> Sent: Monday, June 15, 2009 3:50 PM
> To: Joly, Robert (CAR:9D30)
> Cc: sipX developers
> Subject: RE: [sipX-dev] Multi-branch identity and routing
> 
> 
> > > My current working assumption is that Park/Retrieve and 
> Call Pickup 
> > > are always branch-local services.
> > > 
> > >       * You cannot pickup a phone that's ringing in 
> another branch.  
> > > The
> > >         pickup code is handled by the pickup server in the branch 
> > > where
> > >         the call is made.  (It's possible that this actually works
> > >         remotely, but if it's _any_ additional effort to 
> make it work,
> > >         then I don't think that we need to commit to it).
> > > 
> > >       * You cannot park a call from one branch and retrieve it in
> > >         another.  The park orbits are specific to a branch.
> > > 
> > > I don't think that making either of these 'global' makes 
> much sense 
> > > - both rely on some out-of-band indications to work. I 
> don't pickup 
> > > a call unless I can hear it ringing.
> > > Park/Retrieve assumes that I park it and then tell someone (by 
> > > yelling or using the intercom) to retrieve it; if that 
> someone is in 
> > > another branch, then park/retrieve becomes a very 
> elaborate way 
> > > to do a consultative transfer.
> > > 
> > > Remote workers are something of a special case, in that they will 
> > > always need to route their calls to _some_ specific externally 
> > > accessible proxy.  This could be the branch domain, or 
> could even be 
> > > some other special external access domain (a pseudo-branch for 
> > > mobile users, perhaps).  In any event, as I noted above, 
> the proxy 
> > > that they initially go through must use a host-specific 
> Record-Route 
> > > to ensure that they remain bound to it.
> 
> On Mon, 2009-06-15 at 15:23 -0400, Joly, Robert (CAR:9D30) wrote:
> 
> > I agree with everything you are saying here but that is not 
> the point 
> > I was trying to make.  Let me try to explain more clearly.  In call 
> > park scenarios, a phone retrieving the parked call gets 
> referred to the 
> > Contact of the parked user (XX-4771).  For many reasons (CDR & NAT 
> > traversal, ...), we do not want that phone to send an 
> INVITE directly 
> > to the contact but instead we want it to go through the proxy.  In 
> > order to ensure that this happens, *every* phone on a given 
> system has 
> > its outbound proxy pointing to that system; *every phone*, not just 
> > remote workers.
> 
> I think I agree with you (I'm hoping that we can make some 
> exceptions for mobile phones like laptop softphones, but perhaps not).


Exempting these mobile phones would mean that they will not be able to
reliably unpark a parked remote worker.  May or may not be a big deal -
not my call.

> 
> > Now, assuming that every phone of a branch has its OBProxy 
> pointing at 
> > its home branch sipXecs(s), the statement that "a User can 
> register at 
> > any call router in any Branch, and any other authorization decision 
> > can be made in any service in any Branch, because the 
> credentials and 
> > permissions databases are uniformly replicated to all 
> servers in all 
> > Branches." does not apply since that user will always send its 
> > REGISTER to its branch proxy in accordance to the 
> provisioned outbound proxy.
> > Unless we can somehow invent a way to get rid of the OBProxy 
> > constraint, I think the current state of affairs makes it hard to 
> > justify the need for the back-up registrar IMO.
> 
> But if the 'branch proxy' it sends to is an SRV name that 
> resolves first to the proxy that's really in the branch, and 
> then (at a lower priority) to another backup proxy, then it 
> will work.  The outbound proxy is an SRV name that maps to 
> both in preference order. 

Nice on paper, but having branch B fill in as a fallback for branch A
creates problems that IMO make it a non-starter:
1- For multi-branch deployments, such a scheme will create a rat's nest
of DNS SRV records.  In the simple case you describe, you would need two
DNS SRV records: one for the users in branch A's local private network,
resolving to A's private IP and B's public IP, and one for A's remote
workers, resolving to A's public IP and B's public IP.  Multiply that by
20 for a 20-site deployment and you have an unsupportable configuration.
(Just changing the public IP address of one box could mean changing DNS
SRV records all over the place.)

2- When a user dials 100 to get to his voicemail system, he may actually
be routed to the voicemail system of the fallback branch, which knows
nothing about the user.  Things like BLF, MWI and AA would exhibit
similarly erratic behavior.

3- When a remote worker registers, the Path information is recorded
(RFC 3327) to ensure that all subsequent calls to that user will be
handled by the server that has a pinhole to that remote worker.  If
Branch A is down and Branch B is used as a fallback, A's users that are
located inside Branch A's local private network will now appear as
remote workers when they register on Branch B, and Path information is
going to be recorded.  Now, when Branch A comes back online, a call
from a local user on A to another local user on A that got registered
through B will go through A and end up being handled by B because of the
Path info.  I cannot claim to foresee the complete set of
NAT-traversal-related problems that this will pose, but I can see a few
big ones already.  Serious experimentation would be needed to uncover
them all and to address those for which a solution exists.

4- We have seen phones that do not support DNS SRV lookups for outbound
proxies.  Relying on SRV to resolve outbound proxies is elegant, but it
is pushing the envelope and I suspect that it will create a host of
interop issues.
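
To put rough numbers on point 1, here is a minimal sketch that
enumerates the SRV records such a scheme would need.  The record names,
the example.com domain, the priority values and the "each branch falls
back to the next one" layout are all assumptions made for illustration;
none of them come from an actual sipXecs deployment.

```python
from collections import namedtuple

# One SRV record: owner name, priority (lower is tried first), target host.
SrvRecord = namedtuple("SrvRecord", "name priority target")


def srv_records(branches, domain="example.com"):
    """Build the SRV records for a hypothetical fallback scheme in which
    each branch falls back to the next branch in the list."""
    records = []
    for i, branch in enumerate(branches):
        fallback = branches[(i + 1) % len(branches)]

        # Users inside the branch's private network: own private address
        # first, then the fallback branch's public address.
        local = f"_sip._udp.local.{branch}.{domain}"
        records.append(SrvRecord(local, 10, f"{branch}-private"))
        records.append(SrvRecord(local, 20, f"{fallback}-public"))

        # Remote workers of the branch: own public address first, then
        # the fallback branch's public address.
        remote = f"_sip._udp.remote.{branch}.{domain}"
        records.append(SrvRecord(remote, 10, f"{branch}-public"))
        records.append(SrvRecord(remote, 20, f"{fallback}-public"))
    return records


def records_touching(records, target):
    """Return every record that would need review if `target` changed."""
    return [r for r in records if r.target == target]


if __name__ == "__main__":
    branches = [f"branch{i:02d}" for i in range(20)]
    recs = srv_records(branches)
    print(len({r.name for r in recs}), "SRV names,", len(recs), "records")
    for r in records_touching(recs, "branch01-public"):
        print(r)
```

Even with this simplest possible fallback layout, the public address of
one box shows up under SRV names owned by two different branches, which
is where the maintenance burden comes from.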

Because of all this, I very strongly think that we cannot realistically
use a branch as a fallback for another.  If a branch is vital and cannot
be out of service then the solution is to deploy HA at that branch...
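
The Path problem from point 3 can be sketched with a toy model.  This
is not sipXecs code: the register() and route_call() helpers, the
single-letter proxy names and the rule "a proxy adds itself to Path
whenever it is not the user's home proxy" are simplifying assumptions
that only mimic the RFC 3327 behavior described above.

```python
def register(bindings, user, contact, via_proxy, home_proxy):
    """Record a registration binding.

    Assumption: a proxy treats any user outside its own branch as a
    remote worker and records itself in the Path for that binding.
    """
    path = [via_proxy] if via_proxy != home_proxy else []
    bindings[user] = {"contact": contact, "path": path}


def route_call(bindings, caller_proxy, callee):
    """Return the hops a call takes: the caller's proxy, then every
    proxy recorded in the callee's Path, then the callee's contact."""
    binding = bindings[callee]
    hops = [caller_proxy]
    hops += [p for p in binding["path"] if p != caller_proxy]
    hops.append(binding["contact"])
    return hops


if __name__ == "__main__":
    bindings = {}  # stands in for the replicated registration database

    # Branch A is down: user 201, physically inside A, registers via B.
    register(bindings, "201", "sip:201@10.1.0.21", via_proxy="B", home_proxy="A")

    # Branch A is back up: another A-local user calls 201 through A.
    print(route_call(bindings, "A", "201"))
```

Until the binding is refreshed through A, the call between two users
sitting in the same branch still detours through B.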

> 
> 
> > > > If it actually works, this proposal has the advantage 
> of keeping 
> > > > things relatively simple, does not require the addition 
> of a new 
> > > > component and new server role and does not have the
> > > backup-proxy as a
> > > > single point of contact/failure.
> > > 
> > > But that wouldn't be highly available - if my branch is down or 
> > > unreachable, my phone can't register (and thus can't get
> > > calls) anywhere.
> > 
> > 4 points on that:
> >  * For a user at home branch X, this solution will be as "highly 
> > available" as branch 'X' is but at least a user at branch Y is 
> > unaffected by branch X going down (branch X is only a 
> > single-point-of-failure for branch-X users)
> 
> One of the constraints I'm trying to satisfy is that a 
> failure of a branch proxy is not a failure for anyone so long 
> as the link to that branch remains up.

How important is that constraint?  It brings *a lot* of code and
solution complexity into the mix, and a much simplified version of this
proposal could be implemented if it weren't for it.  Before we embark on
this, it would be nice to convince ourselves that this constraint rests
on solid ground.  More specifically, it would be nice to understand the
fraction of outages that are linked to network vs. sipXecs failures.  If
the former is prevalent then we can remove the constraint and run with
the simpler approach.
Does anybody on the list have an idea regarding that fraction?




> 
> >  * With the backup-registrar, the solution will be as 
> "highly available"
> > as the 'backup registrar' is but if it/they become 
> unreachable or down 
> > then this will affect the whole enterprise (backup-registrar is a
> > single-point-of-failure)
> 
> The 'backup registrar' can be an HA pair if that level of 
> redundancy is required.
> 
> >  * Although much less severe, losing connectivity to a 
> branch is a bad 
> > occurrence even with the backup-registrar in play as you will lose 
> > your AA, VM and any configured find-me/follow-me treatment.
> 
> That's something of a tradeoff that we can make better over 
> time.  Some of our services are not now distributed (most 
> conspicuously voicemail).
> My thinking was that making those services branch-local 
> reduces the bandwidth requirement for the links between 
> branches, and makes operation within the branch robust when 
> the inter-branch link fails. 
> 
> The flip side is that some services local to the branch are 
> not available from elsewhere in that case.  Eventually, we 
> could design a distributed voicemail system that can operate 
> from multiple servers and resynchronize when reconnected, but 
> that seemed like biting off more than we could chew in the 
> 4.2 timeframe.
> 
> Some of the less stateful services like AA (even personal AA) 
> and find-me-follow-me (which is really just a bunch of 
> aliases) would be relatively easy to make distributed - you 
> just need to make sure that the configuration for them is 
> also replicated onto whatever those services fall back to when 
> they can't reach the branch.  This isn't a huge amount of 
> work, but also didn't seem essential.
> 
> My hope is that we gradually make more and more of the 
> services support (and be configured for) truly distributed operation.
> 
> 
> 
> 