Hi Alvaro,
Thanks for your very careful review.
I may not agree with every comment made, but many will certainly improve
the document.
See inline below...
Thanks,
-Thomas
2016-02-23, Alvaro Retana (aretana):
The Abstract says that the procedures are "inspired from BGP unicast
route damping". It seems to me that the intent is in fact to adopt
the algorithm from RFC2439. However, the text is not explicit/clear
about that.
Saying so would, I think, actually be a misleading simplification: the
reader may miss the facts:
- that the proposal is to keep advertising a dampened multicast VPN
state up (in RFC2439, a dampened route stops being advertised)
- that the document is not BGP-specific but also specifies a mechanism
for the PIM FSM
- that only exponential decay is borrowed from RFC2439
1. As you all know, BGP damping has a troubled history: it has been
considered useless, and there have even been recommendations (from
RIPE, for example) not to use it.
A few things are important to have in mind:
- the application context here is not the Internet
- multicast state propagates in a very different fashion
- the damping techniques are not the same
- the side-effects of damping unicast and damping multicast as proposed
here are fundamentally different: here damping causes no impact on the
service
Overall, I would say that the weaknesses of RFC2439 and the
recommendations in RFC7196 were well known by co-authors and that we
came to the conclusion that this multicast VPN would not suffer from
similar weaknesses.
How did you arrive at the default and maximum values?
By simulating with simple parameters and choosing conservative low-risk
values.
Considering that, by design, whatever the parameters, multicast streams
will be delivered unchanged and that the only trade-off is reduced
dynamicity and possibly slightly increased bandwidth use, the default
and maximum values do not have to be perfectly tuned.
It concerns me that there are no known implementations (from the
Shepherd's report).
This concern is valid.
See below...
Because of that, I think this document would be better suited as an
Experimental RFC, with the explicit purpose of gaining experience with
the values and determine the impact in live deployments (which then
could support a standard version). Please consider changing the
intended Status.
Let me go back to why this proposal started: some lab testing showed
that it was easy, in the lab, to create significant overload on the BGP
stacks of PEs and RRs by flapping multicast state at the edge. Having a
Standards Track document provide the appropriate tooling against this
DoS risk makes sense to me. I think that the proposed procedures, and
the problem they address, are not close enough to RFC2439 for RFC2439's
history to be an argument for going through an Experimental RFC first.
According to the e-mail archive, it looks like an early presentation
of this work happened in an mboned meeting, but I didn't find
discussion on the pim or idr lists. Once the comments below are
addressed I will want to forward the document to pim/idr for their review.
Practically speaking, pim and mboned are roughly the same people.
But yes, forwarding the document to these working groups is fine.
Major:
1. There are 6 authors listed on the front page. According to
RFC7322, the total number is generally limited to 5. Please work
among yourselves to cut the number of authors. Alternatively, we
can just list an Editor (there's one already identified)..or you
can produce a justification detailing the contributions of each
author to consider an exception.
Ok, we will address that.
1. Replace the reference to RFC4601 with a reference to
draft-ietf-pim-rfc4601bis. Note that the section numbers have
changed slightly!
Yes, we can update.
(you see me surprised to see that appear as a "major" issue!)
1. Are you adopting the exponential decay algorithm from RFC2439?
That seems to be what's happening because you are not explicitly
defining a new algorithm, but some of the text leave doubts. For
example:
* "inspired from BGP unicast route damping" I know the
application is different, but if the algorithm is the same
then please say it.
The procedures associated with the exponential decay are different.
* Section 5.1. (PIM procedures)
o "updating the *figure-of-merit* based on the decay
algorithm must be done prior to this increment" This
statement seems to directly imply that the algorithm is
used. Please reorder the steps to explicitly call this
one out, instead of plugging it in as an afterthought.
BTW, should the "must" be "MUST"? Ordering should help
you not having to deal with that last question.
I've revised the text to describe this step in its own bullet, prior to
"updating the figure-of-merit", to avoid this "afterthought" impression
and make it as mandatory as the other steps.
o "Same techniques as the ones described in [RFC2439] can be
applied…" "Can be"? This sentence seems to imply that
what is described in RFC2439 is optional. Are there other
ways of determining the same thing? What about the
exponential decay algorithm?
No, there are just multiple possible ways to update the figure-of-merit,
including the ones RFC2439 mentions or details.
I've reformulated the text to avoid misinterpretation:
These specifications do not impose the use of a particular technique
to update the *figure-of-merit* following the exponential decay
algorithm based on the configured *decay-half-life*. In particular
the same techniques as the ones described in [RFC2439] can be
applied. The only requirement is that the *figure-of-merit* has to
be updated prior to increasing it and that its decay below the
*reuse-threshold* has to be reacted upon in a timely manner: in
particular, if the recomputation is done periodically, the period
should be low enough to not significantly delay the inactivation of
damping on a multicast state beyond what the operator wanted to
configure (e.g. for a *decay-half-life* of 10s, recomputing the
*figure-of-merit* each minute would result in a multicast state
remaining damped for a much longer time than what the parameters are
supposed to command).
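As a rough illustration of the ordering requirement in the text above
(decay first, then increment), here is a minimal Python sketch. It is
not part of the draft: the DampingState structure, the function names,
the "increment" and "cutoff_threshold" parameters (vocabulary borrowed
from RFC2439) and the strict comparisons are all assumptions made for
the example.

```python
from dataclasses import dataclass


@dataclass
class DampingState:
    """Per-multicast-state bookkeeping (hypothetical structure)."""
    figure_of_merit: float = 0.0
    last_update: float = 0.0      # time of the last recomputation, seconds
    damping_active: bool = False


def decay(state, now, decay_half_life):
    """Apply exponential decay for the time elapsed since the last update."""
    elapsed = now - state.last_update
    state.figure_of_merit *= 0.5 ** (elapsed / decay_half_life)
    state.last_update = now


def on_state_change(state, now, decay_half_life, increment, cutoff_threshold):
    """Decay first, then increment, then compare with the cutoff (assumed)."""
    decay(state, now, decay_half_life)
    state.figure_of_merit += increment
    if state.figure_of_merit > cutoff_threshold:
        state.damping_active = True


def on_recompute(state, now, decay_half_life, reuse_threshold):
    """Periodic or timer-driven recomputation; lifts damping once decayed."""
    decay(state, now, decay_half_life)
    if state.damping_active and state.figure_of_merit < reuse_threshold:
        state.damping_active = False
```

Skipping the decay() call in on_state_change() would add the increment
to a stale value and overestimate the figure-of-merit, which is why the
text requires the update to come first.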
o It would also help if the terminology was consistent. For
example, instead of "damping becomes active" use
"suppressed". I can see how "suppressed" may give the
wrong impression as only the propagation of state is
affected. Explaining then how the terminology applies
would make it easier to reuse, avoid confusion and be
clear. Note that there's no mention of RFC2439 in the
terminology section.
Using the "suppressed" term to describe a state that we artificially keep
active is the most confusing thing that I can think of. As you say, this
would give a wrong impression. I would go as far as to say that the
document would be barely understandable.
But maybe we can add this to the terminology section:
In these specifications, damping of a multicast state will be said
to be "active" or "inactive". Note that the term used for a unicast
route which is dampened is "suppressed", but we avoid this term in
these specifications given that a dampened multicast state is kept
active.
Would that help?
2. Section 3. (Overview): "…it is expected that this technique will
allow to meet the goals of protecting the multicast routing
infrastructure control plane without a significant
average increase of bandwidth". In general, I want to make sure
that the qualities of the solution and the expected results are
properly reflected in the document. [I'm using the text above as
the base for my comment, but the impact is larger.] Some questions:
* "…it is expected that this technique will…" I wonder why an
assertion can't be made that this technique can (vs just
expecting that it will) address specific problems. Is it the
case that experience is needed to make a stronger assertion?
Are the goals the same (or at least similar) in every
network? Are there implementations available? If so, please
consider an "Implementation Status" section (see rfc6982).
What has been the deployment experience? This goes back to
my comment above about the Intended Status of this document.
"It is expected" reflects the idea that the slight increase in bandwidth
will not be significant in most cases.
We can expand the text a bit to explain what would be the cases where
that would not work.
Let me suggest the following reformulation:
"That said, basic simulations of the exponential decay algorithm show
that the multicast state churn can be drastically reduced without
significantly increasing the duration for which multicast traffic is
forwarded. Hence, using this technique will efficiently protect the
multicast routing infrastructure control plane against the issues
described here, without a significant average increase of bandwidth.
The exception will be a scenario where the network dimensioning does not
allow extending the time a multicast flow is forwarded beyond the
duration for which it is needed by receivers".
* What specifically are the goals? In a couple of places the
text points back at Section 1. (Introduction), but I'm not
sure exactly what the goals are. Of special interest for
understanding the goals is the part in Section 4.2. (Existing
PIM, IGMP and MLD timers) where other solutions are discarded
for not meeting them.
o There is scattered text that talks about "…ensure that the
load put on the BGP control plane, and on the P-tunnel
setup control plane, remains under control…", "protecting
these control planes…avoiding negative effects…although at
the expense of a minimal increase in average of
bandwidth use…". However, the description is too vague
to point at what can satisfy these goals and what can't.
Section 1 says:
- " Hence, mechanisms need to be put in place to ensure that the load
put on the BGP control plane, and on the P-tunnel setup control plane,
remains under control regardless of the frequency at which multicast
memberships changes are made by end hosts."
-then "This document describes procedures, remotely inspired from
existing BGP route damping, aimed at protecting these control planes
while at the same time avoiding negative effects on the service
provided, although at the expense of a minimal increase in average of
bandwidth use in the network."
The intent was that the text would be enough to make the goals clear.
Would the following change to the second sentence provide suitable
detail to help understand what can satisfy these goals and what can't?
[...] aimed at offering means to set an upper bound to the affected
control planes (BGP RFC6514 processing, and the P-tunnel control plane
protocol in certain cases as well) while at the same time preserving
service provided (delivering the stream to the end user as requested),
although at the expense of a minimal increase in average of bandwidth
use in the network.
I see that we can reorder the text to avoid splitting the explanation of
goals.
The new text would look like the following:
In VPN contexts, providing isolation between customers of a shared
infrastructure is a core requirement resulting in stringent
expectations with regards to risks of denial of service attacks.
By nature multicast memberships change based on the behavior of
multicast applications running on end hosts, hence the frequency of
membership changes can legitimately be much higher than the typical
churn of unicast routing states. Section 16 of [RFC6514]
specifically spells out the need for damping the activity of
C-multicast and Leaf Auto-discovery routes.
Hence, mechanisms need to be put in place to ensure that the load put
on the BGP control plane, and on the P-tunnel setup control plane,
remains under control regardless of the frequency at which multicast
membership changes are made by end hosts.
This document describes procedures, remotely inspired from existing
BGP route damping, aimed at offering means to set an upper bound to
the amount of processing for the mVPN control plane protocols
([RFC6514], and the P-tunnel control plane protocol in certain cases
as well), while at the same time preserving the service provided
(delivering the stream to the end user as requested), although at the
expense of a minimal increase in average of bandwidth use in the
network.
* Section 4.1. (Rate-limiting of multicast control traffic)
mentions the "risk described in Section 1", which does mention
"risks of denial of service attacks". Is that the risk you're
referring to, or something else?
Yes. I've made that explicit in section 4.1.
* Section 4.3. (BGP Route Damping) mentions "the principle
described in this document", which I thought was related to
the goals, but Section 1 says that the "base principle is
described in Section 3". I'm assuming the "principle" in
question is such that a "network operator…can delay the
propagation of multicast state prune messages between PEs,
when faced with a rate of multicast state dynamicity exceeding
a certain configurable threshold". That sounds like a
potential goal to me.
The base principle described in section 3 is a way to achieve the goal,
rather than a goal in itself.
2. Section 5.2. (Procedures for multicast VPN state damping)
* In the Introduction you write that "Section 16 of [RFC6514]
specifically spells out the need for damping the activity…" I
think that RFC6514 does a lot more than that: Section 16.1.
(Dampening C-Multicast Routes) "proposes OPTIONAL
route dampening procedures similar to what is described in
[RFC2439]." Those procedures look very similar to the ones
in this document. What is the difference? Is the intent of
this document to complement, replace or maybe update what is
already specified in RFC6514?
Indeed, the base ideas for dampening were already here when we wrote
RFC6514.
draft-ietf-bess-multicast-damping provides precision on how to implement
RFC6514 16.1.1, but this is not an update per se as nothing in RFC6514
is changed.
We can make that fully explicit by saying in Section 1:
Section 16 of [RFC6514] specifically spells out the need for damping
the activity of C-multicast and Leaf Auto-discovery routes, and
outlines how to do it, namely "delay the advertisement of withdrawals
of C-multicast routes". These specifications provide appropriate
detail on how to implement that and how to make that controllable
by the operator.
* There's an rfc2119 conflict. "…then the withdrawal of a
C-multicast route…SHOULD NOT be damped. An implementation of
the specification in this document MUST whether, not damp
these withdrawals by default, or alternatively provide a
tuning knob to disable the damping of these withdrawals."
s/whether/either The "MUST..not damp" and "SHOULD NOT be
damped" are in conflict. I think that eliminating the last
sentence would fix the problem and still allow an
implementation to put in any knobs that it wants.
(Ok for s/whether/either)
These two sentences were discussed already quite a lot, and I think we
need to keep this.
There is in fact no conflict: what needs to be enforced in a
deployment is that "[under some conditions] ... SHOULD NOT damp some
specific withdrawals"; to ensure that implementations make this
possible, we add "MUST ([not damp these withdrawals by default] or
[provide a tuning knob to not damp these withdrawals])".
Removing the second sentence would leave something not strong enough: on
implementations that damp withdrawals by default (they can do that,
because this is acceptable in some deployments), we absolutely need the
knob (to address the more problematic deployment scenario).
2. Section 7.3. (Default and maximum values) lists values that are
"RECOMMENDED to adopt as default conservative values". Any
guidance about when and/or how an operator should consider
changing the recommended defaults? What does "conservative"
mean in this context? What if the operator wants to be
more aggressive?
We can certainly improve this section a little bit.
I propose to add:
This section proposes default and maximum values, conservative so as
to not significantly impact network dimensioning but still prevent
post-dampening multicast state churn from going beyond what can be
considered a reasonably low churn for a multicast state.
The following values are RECOMMENDED to adopt as default values:
And, as illustrations:
With these values, as illustrations:
o a multicast state not updated more frequently than once every 6s
will not be dampened
o a multicast state changing once per second for 3s, and then not
changing, will not be dampened
o a multicast state changing once per second for 4s, and then not
changing, will be dampened after the fourth change for
approximately 13s
o a multicast state changing twice per second for 30s, and then not
changing, will be dampened after the fourth change for
approximately 13s
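The first three illustrations can be checked with a short Python
sketch. The parameter values used here (an increment of 1000 per state
change, a cutoff threshold of 3000, a decay-half-life of 10s and a
reuse-threshold of 1500) are assumptions that are not spelled out in
the text above; they were picked because they reproduce those three
illustrations. The simulate() helper is hypothetical and assumes that
no further change occurs after the last one listed.

```python
import math

# Assumed parameter values (not listed in the text above).
INCREMENT = 1000.0          # added to the figure-of-merit per state change
CUTOFF_THRESHOLD = 3000.0   # damping becomes active above this value
DECAY_HALF_LIFE = 10.0      # seconds
REUSE_THRESHOLD = 1500.0    # damping becomes inactive below this value


def simulate(change_times):
    """Return (index of the change that activates damping, or None,
    seconds damping stays active after the last change)."""
    fom = 0.0
    last = None
    activated_at = None
    for i, t in enumerate(change_times, start=1):
        if last is not None:
            fom *= 0.5 ** ((t - last) / DECAY_HALF_LIFE)  # decay first
        fom += INCREMENT                                   # then increment
        last = t
        if activated_at is None and fom > CUTOFF_THRESHOLD:
            activated_at = i
    if activated_at is None:
        return None, 0.0
    # Time for the final value to decay down to the reuse threshold.
    remaining = DECAY_HALF_LIFE * math.log2(fom / REUSE_THRESHOLD)
    return activated_at, remaining


print(simulate([6 * i for i in range(200)]))  # one change every 6s
print(simulate([0, 1, 2]))                    # once per second for 3s
print(simulate([0, 1, 2, 3]))                 # once per second for 4s
```

Under these assumptions, the first two calls report no damping, and
the third reports damping activated by the fourth change and lifted
roughly 13s after it.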
Minor:
1. In 4.2
* s/prune override interval/J/P_Override_Interval
I'd rather keep the plain text version.
* Reference for explicit tracking..?? BTW, how would the
mechanism in this document interact with explicit tracking?
Will add a reference.
1. Section 5.1. (PIM procedures):
* "…a router implementing these procedures MUST…apply unchanged
procedures for everything…". I guess that these "unchanged
procedures" are the ones in rfc4601bis, right? In other
words, what you seem to want is that, in addition to what
rfc4601bis specifies, for the other steps defined in this
document to happen. If that is correct, please reword the
description to make it clear — at least put a reference so
that there is no question about which procedures are left
unchanged.
Ok.
* "…freeze the upstream state machine…and setup a trigger to
update it…" Maybe a word like "hold" or "maintain" might be
better. In fact, even better would be an explicit indication
that "events that may result in the state changing
[rfc4601bis] SHOULD be ignored until the reuse threshold is
reached", or something along those lines.
I'll adopt your suggested text, with the precision that only changes in
the upstream state machine will be ignored.
* What should the state be updated to when the reuse threshold
is reached?
I've added this detail.
hold the upstream state machine in Joined state so that events that may
result in the upstream state changing based on RFC4601bis SHOULD be
ignored until the reuse threshold is reached, and setup a trigger to
update the upstream state machine based on downstream state machines
once the reuse threshold is reached. The effect is that in the meantime,
PIM Join messages will be sent as refreshes to the upstream neighbor,
but no PIM Prune message will be sent.
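The behaviour described above can be sketched as a small Python
wrapper around the upstream state. This is only an illustration under
assumptions: the class and method names are hypothetical, the upstream
state machine is reduced to its two states (NotJoined and Joined), and
it is assumed that damping is activated while the state is Joined.

```python
from enum import Enum


class Upstream(Enum):
    NOT_JOINED = "NotJoined"
    JOINED = "Joined"


class DampedUpstream:
    """Hypothetical wrapper around a PIM upstream state machine."""

    def __init__(self):
        self.state = Upstream.NOT_JOINED
        self.damping_active = False
        self.downstream_wants_traffic = False

    def on_downstream_change(self, wants_traffic):
        """Called when the downstream state machines change their demand."""
        self.downstream_wants_traffic = wants_traffic
        if self.damping_active:
            # Held in Joined: Joins keep being refreshed, no Prune is sent.
            return
        self._sync()

    def activate_damping(self):
        """Called when the figure-of-merit exceeds the cutoff (assumed)."""
        self.damping_active = True
        self.state = Upstream.JOINED

    def on_reuse_threshold(self):
        """Trigger fired once the reuse threshold is reached."""
        self.damping_active = False
        self._sync()

    def _sync(self):
        """Align the upstream state with what downstream currently wants."""
        self.state = (Upstream.JOINED if self.downstream_wants_traffic
                      else Upstream.NOT_JOINED)
```

While damping is active, a downstream change that would normally lead
to a Prune only updates the recorded demand; the upstream state catches
up when on_reuse_threshold() fires.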
* I had some trouble parsing this text: "When
the recomputation is done periodically, the period should be
low enough to not significantly delay the inactivation of
damping on a multicast state beyond what the operator wanted
to configure (i.e. for a *decay-half-life* of 10s, recomputing
the *figure-of-merit* each minute would result in a multicast
state to remained damped for a much longer time than what the
parameters are supposed to command)." I think I got it…but
what I don't get (based on my understanding of RFC2439) is
that the figure-of-merit should decay according to the
half-life, so I don't get why its value would be adjusted at a
period that is not related to the half-life.
You have many ways to simulate a behavior triggered by a decaying value
crossing a threshold:
- periodically recompute the value and see whether it crossed the
threshold: naive/easy approach, the period does not need to be related
to the decay-half-life as long as it is short enough
- compute the time at which it will reach the threshold and setup a
timer to fire at that time
- use a more optimized approach that groups the computation for
multiple routes/states (see RFC2439)
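The first two approaches can be compared with a short Python sketch.
The function names are hypothetical, and the periodic variant assumes
that the first recomputation tick happens one full period after the
figure-of-merit was last measured.

```python
import math


def time_until_reuse(figure_of_merit, reuse_threshold, decay_half_life):
    """Seconds until an undisturbed figure-of-merit decays to the threshold.

    This is the value a one-shot timer would be armed with.
    """
    if figure_of_merit <= reuse_threshold:
        return 0.0
    return decay_half_life * math.log2(figure_of_merit / reuse_threshold)


def polled_reuse_time(figure_of_merit, reuse_threshold, decay_half_life,
                      period):
    """Seconds until a periodic recomputation first observes the crossing."""
    exact = time_until_reuse(figure_of_merit, reuse_threshold,
                             decay_half_life)
    return math.ceil(exact / period) * period
```

For a figure-of-merit of 3000, a reuse-threshold of 1500 and a
decay-half-life of 10s, the exact crossing is 10s away. A 1s polling
period observes it at 10s, while a 60s period only observes it at 60s,
which is the situation the draft text warns against.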
2. Section 5.2. (Procedures for multicast VPN state damping)
* There are several places in this section where rfc2119
language is used to describe what an implementation should do
that sound to me as an attempt to define functionality that
is mandatory to implement (MTI). I find that hard/impossible
to enforce and would like to see the rfc2119 language removed.
Please see below..
Yes, the MUSTs in 5.1 and 5.2 are intended to carry the meaning of
"mandatory to implement".
* The text says that an "implementation of [RFC6513] relying on
the use of PIM to carry C-multicast routing information MUST
support this technique." That "MUST" is really strong and it
makes me think that this document should then be marked as an
update to RFC6513. Is that the intent? Reading through it
again, is the intent MTI?
No, the intent is not to update RFC6513.
I've added the precision: "MUST support this technique, to be compliant
with these specifications."
* "…the following procedure is proposed as an alternative to the
procedures in Section 5.1…" "proposed"?? Does this mean that
5.1 is also applicable in this case (when "BGP is used to
distribute C-multicast routing information")?
Yes.
* It sounds like the operator would have an option — if so,
when should each be considered?
o Later in the same section you wrote: "…choice to implement
damping based on BGP routes or the procedures described in
Section 5, is up to the implementor, but at least one of
the two MUST be implemented." I think it should be
section 5.1.
Yes indeed. I will correct.
o Do you really mean the "implementor", or are you
referring to the operator of the network? The "MUST"
sounds too strong for me because it is not needed for
interoperability (rfc2119) -- or is this an attempt at MTI?
This is an indication for implementors, so MTI.
o Same question/observation for "In the perspective of
allowing damping to be done on RRs and ASBRs, implementing
the BGP approach is RECOMMENDED." Maybe
s/RECOMMENDED/recommended
I understand that you seem to prefer avoiding RFC2119 language for MTI
things.
But I don't know another way than RFC2119 language to indicate what is
mandatory to implement to be compliant with a spec, and I think this is
a fairly well established practice. This is not the first document to
use RFC2119 to indicate MTI things.
What is the rationale for not using RFC2119 language?
* "…it can be considered useful to also be able to apply damping
on RRs as well." When is it considered useful?
o Note that later you also write: "…in such a context, it is
RECOMMENDED to not enable any multicast VPN route damping
on RRs…" This partially answers the question. It would
be nice to put the guidance together.
I've added text to improve that.
* "…damping SHOULD NOT be applied to BGP routes of the following
sub-types…" Are there cases when it is ok? In other words,
why is the "SHOULD NOT" not a "MUST NOT"?
Maybe someone can find a case where this does not break things, under
some conditions.
We saw nothing mandating the use of "MUST NOT".
2. Section 6.1. (Damping mVPN P-tunnel change events) "Possible ways
to do so depend on the type of P-tunnel, and local implementation
details are left up to the implementor. The following is
proposed as example of how the above can be achieved." Either you
leave it as an implementation detail or you provide guidance. If
this document was Experimental, then providing guidance is great!
There is a gap between "example" and "guidance".
I think an example can help the reader (implementor or deployer).
Guidance would mean that we start influencing the implementor, which is
not the idea here.
Nits:
1. Please put references on first appearance. For example, IGMP and
MLD are mentioned in the introduction, but no reference is made
until the 5th mention. The same for BGP route damping…
Ok.
* You also need a reference to route reflectors.
Added.
2. "these control planes" Which control planes? Please be specific.
There are other places where "these" is used that may not be
completely clear..please take a look.
Reworded as part of other changes following your comments.
1. s/these specifications/this specification
ok
1. s/when enabled /when enabled,
ok
1. "PIM-SM specifications [RFC4609]" RFC4609 is not the PIM spec.
Yes, typo.
Thanks.
1. Section 6.2. (Procedures for Ethernet VPNs) "…an implementation
of these procedures MUST follow the procedures described in
Section 6.1." It is not completely clear which "these procedures"
are. I'm guessing RFC7117.
Yes, fixed.