Re: [spring] Last Call: (SPRING Problem Statement and Requirements) to Informational RFC

2016-01-05 Thread Alexander Vainshtein
Hi all,

I have read the Segment Routing Problem Statement and Requirements draft and I 
have a couple of comments on it.



Editorial:

The Abstract states that "Multicast use-cases and requirements are out of scope 
of this document", but this (or equivalent) statement does not appear anywhere 
in the body of the document. IMHO and FWIW this contradicts the last para in 
Section 4.3 of RFC 7322 that states that "the RFC should be self-contained as 
if there were no Abstract".



Technical:

The draft requires, in Section 2, that "The SPRING architecture SHOULD leverage 
the existing MPLS dataplane without any modification...".

In addition, in Section 3.3 it requires that "The SPRING architecture SHOULD 
allow incremental and selective deployment without any requirement of flag day 
or massive upgrade of all network elements".



My reading (FWIW) of these two requirements is that SPRING with MPLS dataplane 
should work on existing MPLS forwarding HW.



If this understanding is correct, it is in potential conflict with another 
requirement formulated in Section 1 of the draft: "The SPRING architecture 
SHOULD allow optimal virtualization: put policy state in the packet header and 
not in the intermediate nodes along the path".



This conflict stems from the following (admittedly, naïve) observation:



1.   The policy state representing the desired source route must be pushed 
in its entirety onto the packet by the source (here source is interpreted in 
the same way as in the draft itself) and must be parsed by all the transit 
nodes.

2.   The amount of policy state grows (linearly?) with the number of 
elements in the source route selected for the packet. In particular, the policy 
state representing a strict route could potentially be quite large.

3.   In the nodes that use hardware-based forwarding, the size of the 
policy state that can be pushed and parsed with the expected throughput is 
inherently limited. These limits differ for different implementations but they 
usually cannot be exceeded without replacing the forwarding hardware.

4.   Passing "offending" packets for software handling could result in 
a dramatic decrease in throughput.



In the case of the MPLS dataplane, the policy state is expressed as the MPLS 
label stack where each segment is represented by a label stack entry. AFAIK, 
existing (and probably future) forwarding HW that supports MPLS is inherently 
limited (the limits differ for different implementations) both regarding the 
number of labels that could be pushed on the packet, and regarding the total 
depth of the label stack that it can parse.



Note: The limit on the number of labels that can be pushed on a packet by 
forwarding HW is obvious. The limit on the label stack depth that can be parsed 
becomes essential in scenarios where ECMP is used, because:

* As per RFC 7325, Section 2.4.5.1, "The most entropy is 
generally found in the label stack entries near the bottom of the label stack 
(innermost label, closest to S=1 bit)"

* As per Section 2.4.5.2 of the same RFC, "Inspecting the IP 
payload provides the most entropy in provider networks.  The practice of 
looking past the bottom of stack label for an IP payload is well accepted and 
documented in [RFC4928] and in other RFCs".

* Both these methods (hashing the label stack and hashing the IP 
header) obviously require parsing the entire label stack.
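
To illustrate why the parse depth matters, here is a minimal Python sketch 
that walks a label stack until it finds the S=1 entry. It assumes the 
standard 32-bit label stack entry layout from RFC 3032 (20-bit label, 3-bit 
TC, 1-bit S, 8-bit TTL). The names parse_label_stack, encode_stack and 
max_depth are hypothetical and only model a hardware limit; they are not 
taken from the draft or from any implementation.

```python
import struct
from typing import List, Optional, Tuple

Entry = Tuple[int, int, int, int]  # (label, tc, s, ttl)


def encode_stack(labels: List[int], ttl: int = 64) -> bytes:
    """Build a label stack; only the last entry gets S=1 (helper for the demo)."""
    out = b""
    for i, label in enumerate(labels):
        s = 1 if i == len(labels) - 1 else 0
        out += struct.pack("!I", (label << 12) | (s << 8) | ttl)
    return out


def parse_label_stack(packet: bytes, max_depth: int) -> Optional[List[Entry]]:
    """Return all entries down to the bottom of stack, or None if the
    bottom is not reached within max_depth entries (the modelled HW limit)."""
    entries: List[Entry] = []
    for i in range(max_depth):
        offset = 4 * i
        if offset + 4 > len(packet):
            return None  # truncated packet, no S=1 entry seen
        (word,) = struct.unpack_from("!I", packet, offset)
        label = word >> 12
        tc = (word >> 9) & 0x7
        s = (word >> 8) & 0x1
        ttl = word & 0xFF
        entries.append((label, tc, s, ttl))
        if s == 1:
            return entries  # the payload (e.g. IP header) starts after this entry
    return None  # stack deeper than this node can parse


stack = encode_stack([16001, 16002, 16003, 16004])
deep_enough = parse_label_stack(stack, max_depth=4)
too_shallow = parse_label_stack(stack, max_depth=3)
```

In this sketch a node with max_depth=3 never reaches the bottom of a 
four-label stack, so it can neither hash on the innermost labels nor locate 
the IP header.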



The limits of forwarding HW could be considered an implementation problem, were 
it not for the draft requiring (and I fully agree with validity of this 
requirement) that SPRING could be used on existing MPLS-capable HW.



From my POV the document should at least explicitly acknowledge this conflict 
as part of the SPRING problem statement. Preferably it should also include 
some guidelines for its resolution:

* One possibility that comes to mind could be a requirement 
to provide the information about hardware-specific limitations to 
traffic-engineering entities in order to avoid computation of paths that do not 
meet HW-imposed constraints.

* Another possibility is to clearly indicate that loose 
route options are preferable for use with SPRING. To the best of my 
understanding this could be translated into a requirement for a new type of 
constrained path computation algorithms that yield loose (rather than strict) 
routes.
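
The first possibility could look roughly like the following Python sketch, 
in which a path computation entity rejects segment lists that exceed 
per-node hardware limits. Everything here is an assumption made for 
illustration: the HwLimits fields, the node names, and the simplification 
that every transit node may need to parse the full stack imposed at the 
ingress (a worst case for ECMP hashing).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HwLimits:
    max_push: int   # labels the node can impose on a packet
    max_parse: int  # label stack depth the node can parse


def path_fits_hardware(
    segments: List[int],
    ingress: str,
    transit: List[str],
    limits: Dict[str, HwLimits],
) -> bool:
    """True if the segment list respects the modelled limits of all nodes."""
    depth = len(segments)  # one label stack entry per segment
    if depth > limits[ingress].max_push:
        return False
    # Worst-case assumption: each transit node sees the full stack depth.
    return all(depth <= limits[node].max_parse for node in transit)


limits = {
    "PE1": HwLimits(max_push=3, max_parse=5),
    "P1": HwLimits(max_push=3, max_parse=4),
}
loose_ok = path_fits_hardware([16005, 16009], "PE1", ["P1"], limits)
strict_ok = path_fits_hardware([24001, 24002, 24003, 24004], "PE1", ["P1"], limits)
```

Here the two-segment loose route passes the check, while the four-segment 
strict route is rejected because the ingress can push only three labels.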

Of course there may be other (and, possibly, better) ways to resolve this 
conflict. But, from my POV, if it is not acknowledged explicitly, its 
resolution becomes much more problematic.



Hopefully, these LC comments will be useful.



Regards,

Sasha



Office: +972-39266302

Cell:  +972-549266302

Email:   alexander.vainsht...@ecitele.com



-Original Message-
From: spring [mailto:spring-boun...@ietf.org] On Behalf Of The IESG
Sent: Tuesday, December 15, 2015 8:11 PM
To: IETF-Announce
Cc: pifra...@cisco.com; 

Re: [spring] draft-ginsberg-spring-conflict-resolution: SRGB INCONSISTENCY

2016-01-05 Thread Martin Horneffer

Hello Les, Acee, Stephane, everyone,

happy new year!

From an operator's (carrier's) point of view I clearly and strongly 
support this alternative solution: Treat an inconsistent set of SRGB 
announcements as broken and ignore it.


 - It is the simplest solution.
 - It only affects traffic of the badly configured and implemented router.
 - It gives a clear indication to the operator /where/ they have to 
repair something.


With respect to Stephane's comments:
 - I would also support a repetition or clarification that an 
inconsistent set of SRGB announcements is /broken/.
 - No strong opinion from my side as to how to define the offending node 
as non-SR-capable, as I don't see any use case for nodes with only 
adjacency SIDs.


Best regards,
Martin


On 04.01.16 at 06:55, Les Ginsberg (ginsberg) wrote:


One of the topics discussed in 
https://datatracker.ietf.org/doc/draft-ginsberg-spring-conflict-resolution/ 
is how to handle inconsistent SRGB advertisements from a given node.


The draft defines one possible solution, from Section 2:

" Each range is examined in the order it was advertised.  If it does
   not overlap with any advertised range which preceded it the
   advertised range is used.  If the range overlaps with any preceding
   range it MUST NOT be used and all ranges advertised after the first
   encountered overlapping range also MUST NOT be used."
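
As a rough illustration of the quoted rule, the following Python sketch 
processes ranges in advertisement order and stops at the first overlap. 
Representing each range as an inclusive (first_label, last_label) pair and 
the name usable_ranges_in_order are assumptions made for this example; the 
draft does not define them.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first_label, last_label), assumed encoding


def usable_ranges_in_order(advertised: List[Range]) -> List[Range]:
    """Keep ranges in advertisement order up to the first overlapping one.
    The overlapping range and everything after it are not used."""
    accepted: List[Range] = []
    for first, last in advertised:
        overlaps = any(first <= a_last and a_first <= last
                       for a_first, a_last in accepted)
        if overlaps:
            break  # this range and all later ones MUST NOT be used
        accepted.append((first, last))
    return accepted


advertised = [(16000, 23999), (30000, 30999), (23000, 24999), (40000, 40999)]
partial = usable_ranges_in_order(advertised)
```

In this example the third range overlaps the first, so only the first two 
ranges are used and the fourth, although valid on its own, is discarded.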

This is one instance of a class of solutions which attempt to make use 
of part of the advertisements even when there is some inconsistency 
(overlap) in the set of SRGB ranges received. A more complete 
discussion of this class of solutions can be seen in 
https://mailarchive.ietf.org/arch/attach/spring/txtk0n56G.txt - many 
thanx to Bruno for writing this.


However, there is an alternative solution which was suggested (notably 
by Acee Lindem) after the draft was written. This alternative is to 
ignore the entire set of SRGB advertisements and treat the problematic 
router as if it were not SR MPLS capable. This alternative was 
discussed during the presentation in SPRING WG at IETF94 (see 
https://www.ietf.org/proceedings/94/slides/slides-94-spring-2.pdf 
slide #3). It is also discussed in Bruno's post 
(https://mailarchive.ietf.org/arch/attach/spring/txtk0n56G.txt - see 
Section 2.2).


The basis of the alternative solution is that a correct implementation 
should never allow inconsistent SRGB ranges to be successfully 
configured (let alone advertised). So this is not a case of a 
misconfiguration – it is a case of a defective implementation. It 
then seems appropriate to put the onus on the originating router to 
only send valid SRGB advertisements rather than forcing all the 
receivers to try to "correct" the invalid information in some 
consistent way. This has a number of advantages:


1. It is by far the simplest to implement

2. It isolates the router which is untrustworthy

3. As the problem can only occur as a result of a defective 
implementation the behavior of the originating router is unpredictable 
– it is therefore safer not to use the information


It is worth noting that since the invalid advertisement is the result 
of some sort of defect in the originating router’s implementation, it 
is not safe to assume that the source will actually be using the 
advertised SRGB in a manner consistent with the selective choice made 
by the receiving routers.


I therefore propose that the above quoted text from 
https://datatracker.ietf.org/doc/draft-ginsberg-spring-conflict-resolution/ 
be replaced with the following:


“The set of received ranges is examined. If there is overlap between 
any two of the advertised ranges the entire SRGB set is considered 
invalid and is ignored.


The originating router is considered to be incapable of supporting the 
SR-MPLS forwarding plane. Routers which receive an SRGB advertisement 
with overlapping ranges SHOULD report the occurrence.”
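
A minimal Python sketch of the proposed all-or-nothing check follows. The 
inclusive (first_label, last_label) range encoding, the function name 
validate_srgb, and the use of an empty list to mean "treat the originator as 
not SR-MPLS capable" are assumptions made for illustration only.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first_label, last_label), assumed encoding


def validate_srgb(advertised: List[Range]) -> List[Range]:
    """Return the advertised ranges unchanged if no two overlap.
    Return an empty list (ignore the entire SRGB set) otherwise."""
    ordered = sorted(advertised)
    # After sorting by first label, any overlap shows up between neighbours.
    for (_, a_last), (b_first, _) in zip(ordered, ordered[1:]):
        if b_first <= a_last:
            return []  # invalid set: a receiver SHOULD also report this
    return list(advertised)


good = validate_srgb([(16000, 23999), (30000, 30999)])
bad = validate_srgb([(16000, 23999), (30000, 30999), (23000, 24999)])
```

Unlike a per-range rule, the outcome here does not depend on the order in 
which the ranges were advertised.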


Comments?

   Les



___
spring mailing list
spring@ietf.org
https://www.ietf.org/mailman/listinfo/spring



Re: [spring] draft-ginsberg-spring-conflict-resolution

2016-01-05 Thread Martin Horneffer

Hello Bruno, Les and everyone,

while I do appreciate and understand Les' motivation to forward this 
document quickly, I would rather support Bruno's approach to first do a 
little analysis and discussion of the possible options before 
finally deciding on one. So: many thanks to both of you for the work 
you have done here!


And I don't agree that this should delay progress for years; just a rough 
analysis and discussion should be ok and help the technology to improve 
significantly. Of course, if there were people who wanted to impede 
progress of SR standardization, they could use this for their purposes. 
But then they would always find ways, regardless of the exact approach.


When it comes to evaluating the different options, I would ask the 
authors and the WG to not limit the view to "traffic lost", but to 
overall operational /robustness/ and /security/.


In this context those criteria - traffic affected, robustness and 
security - may well all lead to the same preference of options. They are 
not exactly the same, however.


For example in Bruno's discussion in 3.6, all three criteria would 
strongly favor option 3.


Best regards,
Martin


On 17.12.15 at 18:26, bruno.decra...@orange.com wrote:

Hi Les, authors, WG

As an individual contributor, please find below some more detailed comments and 
considerations on the draft.

Following the "please send text" request expressed during the meeting, please 
find enclosed some proposed text. (xml, txt, diff versus public version).

I wish I had sent this earlier, but writing the text took longer than 
expected. BTW I am still not happy with my text, but hopefully the current text (or 
at least the table of contents) should be enough to give an idea of the 
direction I have in mind.

Thanks,
Regards,
Bruno

__
As previously expressed on the mailing list and during the meeting, I'm 
especially concerned with Mapping Server advertisement where a single typo/bug 
can conflict with many/all SIDs in the network. In this case, I don't think 
that dropping all traffic to those SIDs is a desirable option.
One option is to prefer individual advertisement (prefix SID) over general ones 
(Mapping Server). i.e. more specific wins. Note that this is the approach 
currently taken by the IS-IS draft:
"   For a given prefix, if both a MS entry with its Prefix-SID Sub-TLV
and a Prefix TLV (e.g.: TLV135) with its Prefix-SID are received, the
Prefix-SID advertised within the Prefix TLV MUST be preferred while
the MS entry MUST be ignored."
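
The "more specific wins" rule quoted above can be sketched in Python as 
follows. The dictionaries keyed by prefix string and the name 
select_prefix_sid are hypothetical and chosen only for this illustration; 
they do not reflect how any implementation stores this state.

```python
from typing import Dict, Optional


def select_prefix_sid(
    prefix: str,
    prefix_tlv_sids: Dict[str, int],
    mapping_server_sids: Dict[str, int],
) -> Optional[int]:
    """Prefer the Prefix-SID from the Prefix TLV; fall back to the
    Mapping Server entry only when no Prefix TLV SID was received."""
    if prefix in prefix_tlv_sids:
        return prefix_tlv_sids[prefix]  # the MS entry is ignored
    return mapping_server_sids.get(prefix)


prefix_tlv = {"192.0.2.1/32": 101}
mapping_server = {"192.0.2.1/32": 999, "192.0.2.2/32": 102}

from_tlv = select_prefix_sid("192.0.2.1/32", prefix_tlv, mapping_server)
from_ms = select_prefix_sid("192.0.2.2/32", prefix_tlv, mapping_server)
```

With this rule a wrong Mapping Server entry for 192.0.2.1/32 has no effect 
as long as the node itself advertises a Prefix-SID for that prefix.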

We can probably assume that this is already implemented (by compliant 
implementations), so I'm not sure I see the value of changing existing 
implementations if this is to get a more disruptive result for the network.
__
Regarding SID conflict, the current draft proposes to drop all conflicting 
information.
Looking at the big picture, this means: the more (Mapping Server) redundancy, 
the more risk of conflict, the less availability.
This is probably not the property that we are looking for.
__

I support the comment to consider the error handling work "recently" done in 
the IDR WG. IMHO, the WG and authors did good work on such a difficult subject 
(just as for SID conflicts, opinions were diverse at the beginning, and 
everyone had good reasons).
I'm not sure how much reading the end result (RFC 7606) helps in understanding 
all the trade-offs considered during the work. Given that the IGP 
infrastructure is probably even more important than BGP for network 
operators/clients/traffic, it may be worthwhile to read the BGP error handling 
operational requirements ( 
https://tools.ietf.org/html/draft-ietf-grow-ops-reqs-for-bgp-error-handling-07 
). Discussion with the people involved may also help. FYI, as for me, the 
points I learnt are:
- when we detect an error on the receiving side, there is a bug. In general, it's 
difficult to know whether the bug is on the sending side or the receiving side. Clearly, 
for the receiving side, the first reaction is to blame the sending side. For each error, 
it's useful to consider both options (i.e. error may be on my/receiving side). And even 
if the error is on the sending side, the receiver may run the same implementation (so 
"sender is too buggy to live" may apply to your own implementation i.e. the 
receiving side).
- when we detect an error, it's useful to consider all possible causes and 
consequences before deciding to make things worse (e.g. killing a 
session/transit node/prefix). In particular, there may not be a single error 
but many (SIDs, source routers (especially if the error is on the receiving 
side)). So it's useful to consider that the decision may be multiplied 10 or 
100 times.

Clearly there are differences between IGP & BGP: usually more redundancy in BGP 
(more signaling paths (redundant RR) and more AS exit points), more prefixes hence 
the cost of dropping one prefix is relatively less important, two-way point to 
point