Re: [rrg] Why a host-based solution does not necessarily add signalling load

Robin Whittle Wed, 21 Jan 2009 19:19:29 -0800

Short version:        ITR-based reachability testing scales better
                      than host-based reachability testing - and
                      so a hybrid might be worth thinking about.


                      However, neither of them scales acceptably, so
                      there is no point seriously considering a
                      hybrid.

                      The only reachability testing system which
                      scales acceptably is Ivip's approach of
                      using a separate set of servers to probe full
                      reachability, via multiple ETRs, to the
                      destination network.  The probing load does
                      not increase in proportion to either the number
                      of ITRs or sending hosts currently sending
                      packets to the destination network.

                      If the ITR is in the ISP, then Ivip's approach
                      does not cope with certain patterns of outages
                      which ITR-based or host-based reachability
                      testing approaches could probably cope with.
                      However, as long as the ITR is in the sending
                      host or the multihomed sending host's network
                      (or is an OITRD) then the outcome is just as
                      good as today, since BGP copes with those
                      outages pretty quickly.

                      Host-based probing copes better with an outage
                      to the upstream ISP than ITR-based probing from
                      an ISP-located ITR does, but this case is
                      already handled perfectly well by LISP etc. or
                      Ivip where the multihomed end-user network has
                      its own ITRs.  (Non-multihomed end-user networks
                      can have their own ITRs or use ITRs in the
                      single upstream ISP.)


Pekka Nikander wrote, quoting Joel Halpern:

>> Conversely, although the liveness testing must be on the basis of
>> individual ETRs, it does seem likely that many hosts in a site will be
>> trying to reach endpoints behind the same set of ETRs.  As such,
>> having a border guy (or someone else, if you really want to complicate
>> life) testing / monitoring that liveness / reachability on behalf of
>> the various sources within the site would seem likely to
>>
>>   1) reduce the probing traffic
>>   2) increase the odds of having accurate data when packets need to be
>>      sent.
> 
> Exactly.

A further reduction in probing, and greater flexibility in deciding
which ETR the traffic packets will be tunneled to, can be achieved by
the Ivip approach of using a completely separate reachability probing
system - separate from ITRs and from sending hosts.  The end-user
would either do this, or have some company do it for them.  The
probing could be done from multiple points all over the Net, to the
ETRs or more likely through the ETRs to routers and/or hosts in the
end-user network.

Most likely, that company's specialised probing system would be
configured to make the decision on how best to map the destination
network's micronets - and would change the mapping within seconds,
according to whatever criteria were specified by the administrators
of the destination network.  Within a few seconds, all ITRs in the
world handling packets addressed to these micronets would be
tunneling these packets to the ETR chosen in the new mapping decision.
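
As a rough illustration of such a mapping decision, here is a minimal
Python sketch.  It assumes one possible criterion: use the
most-preferred ETR through which at least a given fraction of probing
sites can reach the destination network.  The function name, the
report format and the 0.8 threshold are all hypothetical; Ivip does
not prescribe any of them, and the administrators could specify quite
different criteria.

```python
from typing import Dict, List, Optional


def choose_etr(reports: Dict[str, List[bool]],
               preference: List[str],
               min_fraction: float = 0.8) -> Optional[str]:
    """Pick the ETR to which a micronet should be mapped.

    reports      maps an ETR name to one boolean per probing site:
                 True if that site reached the destination network
                 through that ETR (hypothetical report format).
    preference   lists the ETRs in the administrators' preferred order.
    min_fraction is an assumed threshold, not anything Ivip specifies.
    """
    for etr in preference:
        results = reports.get(etr, [])
        # An ETR with no reports at all is treated as unusable.
        if results and sum(results) / len(results) >= min_fraction:
            return etr
    return None  # no ETR currently meets the criterion
```

For example, if only one of three probing sites can get through
"ETR-A" while all three can get through "ETR-B", then
choose_etr(reports, ["ETR-A", "ETR-B"]) returns "ETR-B", and that
would be the new mapping to distribute to the ITRs.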

This is a generalised DFZ->destination-network probing approach,
since the probing servers would, broadly speaking, be in the DFZ.
This is the only scalable approach, since the same level of probing
would still occur if 100,000 ITRs were sending packets to the
destination network as if one, none or a few were sending packets.
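
The scaling argument can be put as simple arithmetic.  The sketch
below is only a model under stated assumptions: every prober sends
the same rate of probes to each ETR, and the figures (two ETRs, ten
dedicated probing servers, one probe per second) are invented for
illustration.

```python
ETRS = 2                 # assumed number of ETRs serving the destination network
DEDICATED_PROBERS = 10   # assumed number of separate probing servers
PROBES_PER_SEC = 1.0     # assumed probe rate per prober per ETR


def probe_rate(probers: int, etrs: int = ETRS,
               probes_per_sec: float = PROBES_PER_SEC) -> float:
    """Total probes per second arriving at the destination network's ETRs."""
    return probers * etrs * probes_per_sec


def itr_based_rate(active_itrs: int) -> float:
    # Every ITR currently sending packets probes every ETR itself,
    # so the load grows in proportion to the number of active ITRs.
    return probe_rate(active_itrs)


def separate_prober_rate(active_itrs: int) -> float:
    # The number of active ITRs is deliberately ignored: only the
    # fixed set of dedicated probing servers sends probes.
    return probe_rate(DEDICATED_PROBERS)
```

With these assumed figures, 100,000 active ITRs give 200,000 probes
per second in the ITR-based model, while the separate probing system
stays at 20 per second whether 100,000 ITRs are sending or none.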

The key to using a separate, dedicated, reachability probing system
(quite outside the Ivip system itself, and so which can be made to
work on any principles, any protocols etc. which suit the destination
network being probed) is Ivip's real-time mapping distribution
system.  This tells every ITR that needs the mapping which ETR to
tunnel the packets to.  This greatly simplifies the ITR and ETR
design and separates out reachability testing and the resulting
decision-making from the core-edge-separation system itself. (LISP,
APT, TRRP and Six/One Router monolithically integrate them.)

So having something other than the ITRs doing the probing and making
the decisions does involve some extra complexity - a real-time
mapping system.  I believe this is a small price to pay for the
greater flexibility, more robust probing (all the way to the
destination network, not just to the ETRs), greater simplicity in
ITRs and ETRs, reduction in probing traffic etc.  Also, it enables
real-time control of the ETR address for incoming TE.


> To recap what I tried to say:
> 
> So far we've had two models on the table:  tunnel-router-based and
> host-based.  The tunnel-router based obviously checks reachability at
> the TR granularity, and host-based on host granularity.  

The last sentence applies to LISP, APT, TRRP etc. but not to Ivip.

With Ivip, end-user networks can and generally will probe
reachability as they choose, including most likely through each ETR
to their network, including to individual hosts if they like.

So this is finer grained at the receiving end than just to ETR
granularity.  One ETR in an ISP may be perfectly reachable and may
reach multiple destination end-user networks, but not the destination
end-user network in question.  Ivip copes with this perfectly well -
that network's micronets are mapped to an ETR (typically in another
ISP) which can reach the network.

The probing would typically be done from multiple sites, including
from actual ITRs or through actual ITRs, all over the Net.

Host-to-host reachability testing is arguably the most adaptable,
since it tests the full path used by traffic packets.  So this
approach can, in principle, cause the choices made by the sending
host (which of the destination's addresses to send the packets to) to
adapt to every conceivable network outage.

ITR-to-ETR probing on its own doesn't tell each ITR about the
destination network being unreachable - so it needs to be done in a
way that the ETR confirms the network is reachable.  This could be
done via:

1 - ITR->ETR probing, with a special protocol telling the ETR which
    end-user network (or perhaps an EID address within that network)
    the probe concerns.  In this way, the ETR can return a message
    that it is reachable, but that the destination network is not.

2 - ITR->(transparently through the ETR)->destination-network probing.
    This would require some new protocol or some identified router
    in the destination network which would always respond to
    existing probe protocols.  This would require extra information
    in the mapping data, to specify the one or more addresses in the
    destination network each ITR should probe.

I figure LISP etc. will need to do one of the above.  AFAIK, at
present, LISP only involves ITRs probing reachability of whole ETRs -
not the ability of the ETR to deliver the packets to any particular
destination EID.
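
To make option 1 concrete, here is a minimal Python sketch of how an
ITR might interpret the replies.  No such protocol exists in the
proposals as far as this message describes, so the three result
names and both functions are hypothetical.  The point is only that
the ITR needs to distinguish three outcomes rather than two.

```python
from enum import Enum
from typing import Dict, List, Optional


class ProbeResult(Enum):
    ETR_UNREACHABLE = 1       # no reply from the ETR at all
    NETWORK_UNREACHABLE = 2   # ETR replied, but cannot reach the network
    NETWORK_REACHABLE = 3     # ETR replied and can reach the network


def classify(etr_replied: bool, network_ok: Optional[bool]) -> ProbeResult:
    """Turn one probe outcome into a result.

    network_ok is what the ETR reported about the end-user network
    named in the probe; it is None when the ETR did not reply.
    """
    if not etr_replied:
        return ProbeResult.ETR_UNREACHABLE
    if network_ok:
        return ProbeResult.NETWORK_REACHABLE
    return ProbeResult.NETWORK_UNREACHABLE


def usable_etrs(results: Dict[str, ProbeResult]) -> List[str]:
    """ETRs to which packets for this destination network can be tunneled."""
    return [etr for etr, result in results.items()
            if result is ProbeResult.NETWORK_REACHABLE]
```

Probing only the reachability of whole ETRs cannot tell the second
and third outcomes apart, which is the gap described above.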

In this diagram (from my message Re: No liveness requirement in the
ID/Loc Split concept 15 Jan) the sending host is HA and the ITRs are
shown in the ISP networks.  (However, the ITRs could be in the
sending host's network NA, or with Ivip, the ITR function could be
built into the sending host.  I depict this in Fig 2.)

                        ~~~~~~~~~~~
   NA           ISP1   ~           ~   ISP3         NB
            PE1      BR1---     ---BR3     PE3
 ........  /    ITR1   ~\         /~   ITR3   \  ........
 .      . /     ETR1   ~ \       / ~   ETR3    \ .      .
 . HA---CEA            ~    DFZ    ~           CEB---HB .
 .      . \            ~ /       \ ~           / .      .
 ........  \   ISP2    ~/         \~   ISP4   /  ........
            PE2      BR2---     ---BR4      PE4
               ITR2    ~           ~   ITR4
               ETR2     ~~~~~~~~~~~    ETR4

(Fig 1. LISP etc. or Ivip with ITRs in ISPs.)

Either form of probing from the ITR is able to cope fine with any DFZ
outages between the ITR (ITR1 or ITR2, depending on how CEA sends out
the packets) and the two ETRs in question: ETR3 and ETR4.

In addition, either form of probing from the ITR can cope separately
with (for instance) an outage between ITR1 and ETR4, while from some
other ITR, for another sending host, there is an outage to ETR3 but
not to ETR4.

Ivip's approach is not so flexible.  With Ivip, I am assuming that
each ETR is either:

1 - Reachable from essentially every ITR, or

2 - Unreachable from some (or all) ITRs to the degree that the
    destination-network's probing system will detect the problem and
    change the mapping to the ETR which is reachable.

So ITR-based probing appears to have this advantage over Ivip when
Ivip is deployed with ITRs in the ISPs.  The advantage is limited,
however, since BGP already does a pretty good job of resolving loss
of links and routers between ISPs.   The proper way to deploy Ivip is
to ensure that every multihomed end-user network has its own ITRs.


Host-based probing has the additional advantage that it can detect
outages from HA to the upstream ISPs, ISP1 and ISP2.  The same
would apply for an ITR-based probing system, if the ITR were located
in the sending host's network NA.

However, the problem of outages from the sending host to the upstream
ISPs is not one which requires any special solutions, since the
sending-host's network's Customer Edge (CEA) router does this task
just fine.  This will continue to work fine for Ivip and for LISP,
APT, TRRP etc. no matter whether the ITR is located in the sending
host's network or in the upstream ISP.

Also, in the case of non-upgraded networks in the Ivip-era or
LISP-era which are either single-homed or multihomed with current BGP
techniques, there may be no ITRs in the ISP or the sending host's
network.  In that case, there are Ivip OITRDs (Open ITRs in the DFZ)
or LISP PTRs (Proxy Tunnel Routers) somewhere in the DFZ.  Again, the
existing behavior of the CE router in the sending host's network
automatically sends outgoing packets to whichever ISP is currently
preferred according to outgoing TE preferences and which of the ISPs
are currently reachable and advertising the micronet (Ivip) or EID
(LISP) prefixes which are advertised by the OITRDs or PTRs.


The next section concerns how ITR-based probing or Ivip's approach
works when an upstream ISP is reachable, but that ISP has either no
connectivity to the rest of the Net, or patchy connectivity.

  LISP etc. with ITRs in the ISPs:

     Packets would be lost if the CE router sent packets to ISP1
     where ITR1 accepted them, but was then unable to send them to
     the ETR.

     (Would ITR1 recognise it couldn't reach beyond ISP1 and then
     somehow signal this so CEA would no longer send packets to
     ISP1?)


  Ivip with ITRs in the ISPs:

     As for LISP, but this is not how Ivip should be deployed -
     multihomed networks should either have their own ITRs, or
     rely entirely on OITRDs outside their network and outside
     their ISPs.

The problem in both cases is that CEA is sending packets to ISP1
which can't get them to their destination.  That is beyond the scope
of LISP etc. or Ivip to solve.

A better outcome is if the ITR is in the sending network, assuming
this is a multihomed end-user network.  For singlehomed end-user
networks, there is no difference in this probing respect between
having the ITR in the end-user network or in the single upstream ISP.
(Also, OITRDs or PTRs are beyond these ISPs and are somewhere in the
DFZ.)

                            ~~~~~~~~~~~
   NA               ISP1   ~           ~   ISP3         NB
                PE1      BR1---     ---BR3     PE3
 ............  /           ~\         /~   ITR3   \  ........
 .          . /     ETR1   ~ \       / ~   ETR3    \ .      .
 . HA-ITRA-CEA            ~    DFZ    ~           CEB---HB .
 .          . \            ~ /       \ ~           / .      .
 ............  \   ISP2    ~/         \~   ISP4   /  ........
                PE2      BR2---     ---BR4      PE4
                           ~           ~   ITR4
                   ETR2     ~~~~~~~~~~~    ETR4

(Fig 2 - Sending host network has its own ITRs, and/or ITR function
in the sending host.)

  LISP etc. with ITRs in the sending host's network (NA):

    Now there is no problem, since CEA is handling packets which
    have already been encapsulated.  If ISP1 is disconnected from the
    Net, its PE router will not be advertising any prefixes, so CEA
    will send all outgoing packets to ISP2.

    Note, this is not applicable to Six/One Router - the equivalents
    of ITRs are always in the ISPs.

  Ivip with ITRs in the sending host's network (NA):

    As above and likewise with Ivip's forwarding approaches, based on
    the ETR address (or some of the address, for IPv6) being encoded
    into a modified IP header.  There is no need for the actual ITRs
    to do probing, since the disconnection of ISP1 from the Net is
    detected by CEA due to ISP1's PE1 router no longer advertising
    routes to any addresses outside ISP1.

So while host-based probing also detects problems as close as the
upstream ISP, the core-edge separation systems (LISP, APT, TRRP and
Ivip) have no problem in this regard, as long as each upgraded
multihomed end-user network has its own ITRs.  (Or non-upgraded
BGP-multihomed networks rely on OITRDs/PTRs rather than ITRs in their
upstream ISPs.)
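
The CEA behaviour relied on here can be sketched in a few lines of
Python.  This is a deliberate simplification: a real CE router uses
BGP best-path selection and longest-prefix matching, whereas the
sketch just takes the first preferred ISP whose PE router currently
advertises a prefix covering the destination.  The function name and
data layout are hypothetical, and 192.0.2.1 is a documentation
address standing in for an ETR.

```python
import ipaddress
from typing import Dict, List, Optional


def pick_upstream(dest: str,
                  advertised: Dict[str, List[str]],
                  preference: List[str]) -> Optional[str]:
    """Choose the upstream ISP for a packet addressed to dest.

    advertised maps an ISP name to the prefixes its PE router is
    currently advertising to the CE router.  A disconnected ISP
    advertises nothing, so it is skipped automatically.
    """
    addr = ipaddress.ip_address(dest)
    for isp in preference:
        for prefix in advertised.get(isp, []):
            if addr in ipaddress.ip_network(prefix):
                return isp
    return None  # no upstream ISP currently offers a route
```

If ISP1 loses its connection to the Net and PE1 withdraws its routes,
advertised["ISP1"] becomes empty and the already-encapsulated packets
go out via ISP2 without the ITR doing any probing.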

> This has
> received criticism that a host-based model causes more
> signalling/reachability testing traffic than the tunnel-router-based one.

My critique along these lines is:

  Fundamental objections to a host-based scalable routing solution
  http://www.irtf.org/pipermail/rrg/2008-November/000233.html

> My point was a simple one: not necessarily so.  If TR-level granularity
> is acceptable, then it is doable with a hybrid host+TR-based solution
> (like the one I've been hand waving during the last month or so), and in
> that case the amount of reachability testing / signalling will be at the
> *same* level as for a TR-only solution.

I don't think ITR-based reachability testing is acceptable.

I think the only scalable approach is to have neither sending hosts
nor ITRs doing the reachability probing and decision-making.  The
only proposal so far which achieves this is Ivip.

 - Robin                           http://www.firstpr.com.au/ip/ivip/




