* bgpd.texi: Document the -l argument. Update the 'BGP decision process' table
  to reflect what /actually/ is implemented. Add docs on 'compare-routerid' in
  the bestpath section.

  Add a section on MED, to highlight the issues it has by default, and to
  highlight that it is terminally broken for its original purpose in many
  modern iBGP topologies.

* routemap.texi: set an anchor on 'set metric' so bgpd.texi can reference it.
---
 doc/bgpd.texi     | 238 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 doc/routemap.texi |   1 +
 2 files changed, 234 insertions(+), 5 deletions(-)
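
Note for reviewers (not part of the commit): the ordering problem described
in the new BGP MED section can be illustrated with a toy model. The sketch
below is not Quagga's implementation. The Route fields, the AS numbers, and
the helpers prefer(), pairwise_best() and grouped_best() are hypothetical
names made up for this note, and the comparison only models two steps of the
decision process (MED, compared only within the same neighbouring AS, then
IGP cost).

```python
from collections import namedtuple
from itertools import permutations

# Hypothetical, simplified route record: not a Quagga data structure.
Route = namedtuple("Route", "name neighbor_as med igp_cost")

ROUTES = [
    Route("A", 65001, 10, 30),
    Route("B", 65001, 20, 10),
    Route("C", 65002, 5, 20),
]


def prefer(a, b):
    """Return the preferred of two routes.

    MED is compared only when both routes come from the same neighbouring
    AS; otherwise (or on a MED tie) the lower IGP cost wins.
    """
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    return a if a.igp_cost <= b.igp_cost else b


def pairwise_best(routes):
    """Pick a best route by comparing pair-wise in the order given."""
    best = routes[0]
    for route in routes[1:]:
        best = prefer(best, route)
    return best


def grouped_best(routes):
    """Pick the best route per neighbouring AS first, then compare winners."""
    winners = {}
    for route in routes:
        current = winners.get(route.neighbor_as)
        winners[route.neighbor_as] = route if current is None else prefer(current, route)
    return pairwise_best([winners[asn] for asn in sorted(winners)])


if __name__ == "__main__":
    for order in permutations(ROUTES):
        names = "".join(r.name for r in order)
        print(names, pairwise_best(list(order)).name, grouped_best(list(order)).name)
```

With these made-up values A beats B (lower MED, same AS), B beats C (lower
IGP cost) and C beats A (lower IGP cost), so the pair-wise result depends on
the order the routes are stored in, while the grouped selection always
returns C. This only models the local churn issue; it says nothing about
oscillation between speakers.
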

diff --git a/doc/bgpd.texi b/doc/bgpd.texi
index 7d92b5e..80f4888 100644
--- a/doc/bgpd.texi
+++ b/doc/bgpd.texi
@@ -53,6 +53,13 @@ Set the bgp protocol's port number.
 @item -r
 @itemx --retain
 When program terminates, retain BGP routes added by zebra.
+
+@item -l
+@itemx --listenon
+Specify the IP address for bgpd to listen on, rather than its
+default of INADDR_ANY / IN6ADDR_ANY. This can be useful to constrain bgpd
+to an internal address, or to run multiple bgpd processes on one host.
+
 @end table
 
 @node BGP router
@@ -104,18 +111,59 @@ This command set distance value to
 @node BGP decision process
 @subsection BGP decision process
 
+The decision process Quagga BGP uses to select routes is as follows:
+
 @table @asis
 @item 1. Weight check
+Prefer higher local weight routes to lower routes.
   
-@item 2. Local preference check.
+@item 2. Local preference check
+Prefer higher local preference routes to lower.
+
+@item 3. Local route check
+Prefer local routes (statics, aggregates, redistributed) to received routes.
+
+@item 4. AS path length check
+Prefer shortest hop-count AS_PATHs.
+
+@item 5. Origin check
+Prefer the lowest origin type route.  That is, prefer IGP origin routes to
+EGP, to Incomplete routes.
+
+@item 6. MED check
+Where routes with a MED were received from the same AS,
+prefer the route with the lowest MED. See ...
+
+@item 7. External check
+Prefer the route received from an external, eBGP peer
+over routes received from other types of peers.
+
+@item 8. IGP cost check
+Prefer the route with the lower IGP cost.
+
+@item 9. Multi-path check
+If multi-pathing is enabled, then check whether
+the routes not yet distinguished in preference may be considered equal. If
+@ref{bgp bestpath as-path multipath-relax} is set, all such routes are
+considered equal, otherwise routes received via iBGP with identical AS_PATHs
+or routes received from eBGP neighbours in the same AS are considered equal.
+
 
-@item 3. Local route check.
+@item 10. Router-ID check
+Prefer the route with the lowest router-ID.  If the
+route has an ORIGINATOR_ID attribute, through iBGP reflection, then that
+router ID is used, otherwise the router-ID of the peer the route was
+received from is used.
 
-@item 4. AS path length check.
+@item 11. Cluster-List length check
+The route with the shortest cluster-list
+length is used.  The cluster-list reflects the iBGP reflection path the
+route has taken.
 
-@item 5. Origin check.
+@item 12. Peer address
+Prefer the route received from the peer with the higher
+transport layer address, as a last-resort tie-breaker.
 
-@item 6. MED check.
 @end table
 
 @deffn {BGP} {bgp bestpath as-path confed} {}
@@ -125,11 +173,31 @@ decision process.
 @end deffn
 
 @deffn {BGP} {bgp bestpath as-path multipath-relax} {}
+@anchor{bgp bestpath as-path multipath-relax}
 This command specifies that BGP decision process should consider paths
 of equal AS_PATH length candidates for multipath computation. Without
 the knob, the entire AS_PATH must match for multipath computation.
 @end deffn
 
+@deffn {BGP} {bgp bestpath compare-routerid} {}
+@anchor{bgp bestpath compare-routerid}
+
+Ensure that where iBGP routes are equal on most metrics, including
+local-pref, AS_PATH length, IGP cost, MED, the tie is broken based on
+router-ID.  If a route has an ORIGINATOR_ID attribute, i.e.  it has been
+reflected, that ID will be used.  Otherwise, the router-ID of the peer the
+route was received from will be used.
+
+The advantage of this is that the route-selection (at this point) will be
+deterministic, across iBGP.  The disadvantage is that such equal routes will
+tend to take the same exit out of the AS, via the lowest-ID router.
+
+If this option is enabled, then the external-age check, where already
+selected eBGP routes are preferred, is skipped.
+@end deffn
+
+
+
 @node BGP route flap dampening
 @subsection BGP route flap dampening
 
@@ -151,6 +219,166 @@ The route-flap damping algorithm is compatible with @cite{RFC2439}. The use of t
 is not recommended nowadays, see @uref{http://www.ripe.net/ripe/docs/ripe-378,,RIPE-378}.
 @end deffn
 
+@node BGP MED
+@section BGP MED
+
+The BGP @acronym{MED, Multi_Exit_Discriminator} attribute is a
+non-transitive attribute, intended to allow one AS to indicate preferences
+for ingress points to another AS.  E.g., if AS X and AS Y have 2 different
+BGP peerings, then AS X might set a MED of 100 on routes advertised on one
+of those and a MED of 200 on another.  AS Y then, when selecting between
+otherwise equal routes to or via AS X, should prefer to take the path via
+the lower MED peering of 100 with AS X.  Setting the MED allows an AS to
+influence the routing taken to it within another, neighbouring AS.
+
+In this use of MED, it is not really meaningful to compare the MED value on
+routes where the next AS on the paths differs.  E.g., if AS Y also had a
+route for some destination via AS Z in addition to the routes from AS X, and
+AS Z also set a MED, it wouldn't make sense for AS Y to compare AS Z's MED
+values to those of AS X. The MED values have been set by different
+administrators, with different frames of reference.
+
+The default behaviour of BGP therefore is to not compare MED values across
+routes received from different neighbouring ASes.  In Quagga this is done by
+comparing the neighbouring, left-most AS in the received AS_PATHs of the
+routes and only comparing MED if those are the same.
+
+Unfortunately, this behaviour means MED can cause the order of preference
+over all the routes to become undefined.  That is, given routes A, B, and C,
+if A is preferred to B, and B is preferred to C, a defined, transitive order
+of preference should mean that A is preferred to C.
+
+However, when MED is involved this need not be the case.  With MED it is
+possible that C is actually preferred over A.  This can be true even where
+BGP defines a deterministic "most preferred" route out of the full set of
+A,B,C.  For any given set of routes there may be a deterministically
+preferred route, however with MED there may be no way to arrange them into
+any order of preference.
+
+That MED can induce non-transitive orders of preference over routes can
+cause issues. Firstly, in creating routing table churn locally at speakers;
+secondly in creating routing instability in non-full-mesh iBGP topologies,
+where sets of speakers continually oscillate between different paths.
+
+The first issue arises from how speakers often implement routing decisions. 
+Though BGP defines a selection process that will deterministically select
+the same route as best, at any given speaker, even with MED, that process
+requires evaluating all routes together.  For performance and ease of
+implementation reasons, many implementations evaluate route preferences in a
+pair-wise fashion instead.  Given there is no well-defined order when MED is
+involved, the best route that will be chosen becomes subject to
+implementation details, such as the order the routes are stored in.  That
+may be (locally) non-deterministic, e.g.@: it may be the order the routes
+were received in.  This may be considered undesirable, though it need not
+cause problems.
+
+This first issue can be fixed with a more deterministic route selection that
+ensures routes are ordered by the neighbouring AS during selection. 
+@xref{bgp deterministic-med}.  This may (but need not) reduce the number of
+updates as routes are received, and may in some cases reduce routing churn.
+
+A deterministic comparison tends to imply an additional overhead of sorting
+over any set of n routes to a destination.  The implementation of
+deterministic MED in Quagga scales significantly worse than most sorting
+algorithms at present however, and may be expensive in terms of CPU if there
+are many paths for a destination (with or without MED).  
+
+Deterministic local evaluation can @emph{not} fix the second issue of MED,
+however: the non-transitive preference of routes that MED can induce may
+cause routing instability or oscillation across multiple speakers, when
+combined with non-full-mesh iBGP topologies that reduce routing information.
+This has primarily been documented with iBGP route-reflection topologies. 
+However, any other route-hiding technologies potentially could also cause
+oscillation with MED.  
+
+The second issue occurs where speakers each have only a subset of routes. 
+E.g.  speaker X might have routes A,B, and speaker Y might have route C.  X
+selects A as its best, Y obviously can only choose C.  They exchange routes
+and then X might choose C as best from A,B,C while Y might choose A as best
+from A,C - the non-transitive, non-defined order of preference of routes
+that MED may induce allows this.  They then withdraw their routes and the
+cycle repeats.  This can occur even if all speakers use a deterministic
+order in route selection.
+
+More complex and insidious cycles of oscillation have been documented in the
+literature.  See, e.g., @cite{McPherson, D.  and Gill, V.  and Walton, D.,
+ "Border Gateway Protocol (BGP) Persistent Route Oscillation Condition",
+ IETF RFC3345}, and @cite{Flavel, A.  and M.  Roughan, "Stable and flexible
+ iBGP", ACM SIGCOMM 2009}, and @cite{Griffin, T.  and G.  Wilfong, 
+"On the correctness of IBGP configuration", ACM SIGCOMM 2002} for concrete examples and further
+references.
+
+There is as of this writing @emph{no} known way to use MED for its original
+purpose; @emph{and} reduce routing information in non-full-mesh iBGP
+topologies (e.g.@: with reflectors); @emph{and} be sure to avoid the
+instability problems of MED due to the non-transitive routing preferences it
+can induce.
+
+The instability problems that MED can introduce on more complex,
+non-full-mesh, iBGP topologies may be avoided either by:
+
+@itemize 
+@item
+Deleting MED from all routes received from neighbouring ASes,
+and/or by ignoring MED entirely in the decision process.  There is no way to
+do this at this time in Quagga.
+@item
+Setting @ref{bgp always-compare-med}, however this allows MED to be compared
+across values set by different neighbour ASes, which may not produce
+desirable results.
+@item
+Setting MED to the same value (e.g.  0) using @ref{routemap set metric} on all
+received routes, in combination with setting @ref{bgp always-compare-med} on
+all speakers.
+@end itemize
+
+As MED is evaluated after the AS_PATH length check, another possible use for
+MED is for intra-AS steering of routes with equal AS_PATH length, as an
+extension of the last case above.  As MED is evaluated before IGP metric,
+this can allow cold-potato routing to be implemented, sending traffic to
+preferred hand-offs with neighbours, rather than the closest IGP hand-off. 
+This would be done with @ref{routemap set metric} and by setting @ref{bgp
+always-compare-med} on all speakers.  
+
+Note that even if action is taken to address the MED non-transitivity
+issues, other oscillations may still be possible.  E.g.  on IGP cost if iBGP
+and IGP topologies are at cross-purposes with each other.
+
+@deffn {BGP} {bgp deterministic-med} {}
+@anchor{bgp deterministic-med}
+
+Carry out route-selection in a way that produces more deterministic answers
+locally, even in the face of MED and the lack of a well-defined order of
+preference it can induce on routes.  Without this option the preferred route
+with MED may be determined largely by the order that routes were received
+in.
+
+Setting this option will have a performance cost that may be noticeable when
+there are many routes for each destination.  Currently in Quagga it is
+implemented in a way that scales poorly as the number of routes per
+destination increases.
+
+The default is that this option is not set.
+@end deffn
+
+Note that there are other sources of indeterminism in the route selection
+process, see @ref{BGP decision process}.
+
+@deffn {BGP} {bgp always-compare-med} {}
+@anchor{bgp always-compare-med}
+
+Always compare the MED on routes, even when they were received from
+different neighbouring ASes.  Setting this option makes the order of
+preference of routes more defined, and should eliminate MED-induced
+oscillations.
+
+This option can be used, together with @ref{routemap set metric} to use MED
+as an intra-AS metric to steer equal-length AS_PATH routes to, e.g., desired
+exit points.
+@end deffn
+
+
+
 @node BGP network
 @section BGP network
 
diff --git a/doc/routemap.texi b/doc/routemap.texi
index db3e72d..7938c96 100644
--- a/doc/routemap.texi
+++ b/doc/routemap.texi
@@ -171,6 +171,7 @@ Set the route's weight.
 @end deffn
 
 @deffn {Route-map Command} {set metric @var{metric}} {}
+@anchor{routemap set metric}
 Set the BGP attribute MED.
 @end deffn
 
-- 
2.5.0

