Re: [OSPF] Solicit feedbacks on draft-dong-ospf-maxage-flush-problem-statement

Dongjie (Jimmy) Wed, 10 Aug 2016 21:25:33 -0700

Hi Les,

The current draft is a problem statement, so IMO what the WG needs to 
consider is whether this is a vulnerability of the OSPF protocol, and whether 
it can have a negative impact on the network. If the problem is acknowledged, 
IMO it is worth documenting.


The "ROI" you mentioned applies to the evaluation of the proposed solutions. I 
totally agree that in the timer bug case, recognizing and ignoring the 
received abnormal MaxAge LSAs cannot stop the misbehaving router from 
generating further MaxAge LSAs, as it is a systematic problem that can only be 
fixed after the operator identifies that router. This is similar to the 
systematic corruption of the IS-IS remaining lifetime. It is also why the 
draft mentions two kinds of potential solutions: the mitigation mechanism 
keeps the network from being severely impacted by the problem, while for 
systematic problems, problem localization is needed to identify the 
misbehaving router and then solve the problem.

Best regards,
Jie

From: OSPF [mailto:[email protected]] On Behalf Of Les Ginsberg (ginsberg)
Sent: Monday, August 08, 2016 2:14 AM
To: Dongjie (Jimmy) <[email protected]>; [email protected]
Cc: Zhangxudong (zhangxudong, VRP) <[email protected]>; 
[email protected]
Subject: Re: [OSPF] Solicit feedbacks on 
draft-dong-ospf-maxage-flush-problem-statement

Jie -

Thinking about the following some more:

<snip>
What remains is the possibility that an implementation has some bug and 
unintentionally modifies the age to something other than what it should be due 
to the actual elapsed time since LSA generation. I suppose a mechanism 
equivalent to what the IS-IS draft defined i.e. setting the age to "new" (0 in 
OSPF case) when first receiving a non-self-generated LSA could be useful to 
prevent negative impacts of such an implementation bug. Is this what you intend?

[Jie]: More specifically, the problem could be caused by either "setting the LS 
age field incorrectly due to an implementation bug" or "the system timer running 
so fast that the LS age reaches MaxAge much earlier than on other routers". 
Another less likely case is that the LS age field is corrupted before the LSA 
is assembled into an OSPF packet.
<end snip>

The benefits are extremely limited. If a router prematurely ages an LSA due to 
a timer bug, ignoring the received LSA age on reception isn't going to prevent 
premature purging by the router which has the bug. So the effect of ignoring 
the received LSA age prior to reaching MAXAGE will be short lived. You are then 
left with the possibility that an implementation corrupts the LSA age BEFORE 
calculating checksum/crypto authentication - but its local timeout logic is 
unaffected. This has very limited value. Whether the WG considers this worth 
pursuing is something you need to ask. For myself, I don't see much ROI here.
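
The timer-bug argument above can be put in rough numbers. The following is 
only a sketch under stated assumptions: MaxAge (3600 s) and LSRefreshTime 
(1800 s) are the RFC 2328 constants, while `clock_factor` and both function 
names are hypothetical, and the LSA is assumed to be installed right after 
origination so that the next refresh arrives about 1800 s later.

```python
MAX_AGE = 3600          # seconds, MaxAge in RFC 2328
LS_REFRESH_TIME = 1800  # seconds, LSRefreshTime in RFC 2328


def seconds_until_maxage(installed_age: int, clock_factor: float) -> float:
    """Real seconds before a router whose timer runs `clock_factor` times
    too fast ages an LSA from `installed_age` up to MaxAge."""
    return (MAX_AGE - installed_age) / clock_factor


def purges_before_refresh(installed_age: int, clock_factor: float) -> bool:
    """True if the router reaches MaxAge before the originator's next
    periodic refresh (assumed to arrive LS_REFRESH_TIME seconds later)."""
    return seconds_until_maxage(installed_age, clock_factor) < LS_REFRESH_TIME


# Even if the received age is ignored and the LSA is installed with age 0,
# a router whose timer runs 4x fast still reaches MaxAge after 900 s.
print(seconds_until_maxage(0, 4.0))   # 900.0
print(purges_before_refresh(0, 4.0))  # True
print(purges_before_refresh(0, 1.0))  # False
```

Under these assumptions, resetting the age on reception only delays the 
premature flush by the faulty router; it does not prevent it.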

  Les



From: Dongjie (Jimmy) [mailto:[email protected]]
Sent: Monday, August 01, 2016 9:43 PM
To: Les Ginsberg (ginsberg); [email protected]<mailto:[email protected]>
Cc: Zhangxudong (zhangxudong, VRP); 
[email protected]<mailto:[email protected]>
Subject: RE: [OSPF] Solicit feedbacks on 
draft-dong-ospf-maxage-flush-problem-statement

Hi Les,

Please see my replies with [Jie2]:

From: Les Ginsberg (ginsberg) [mailto:[email protected]]
Sent: Monday, August 01, 2016 9:57 PM
To: Dongjie (Jimmy); [email protected]<mailto:[email protected]>
Cc: Zhangxudong (zhangxudong, VRP); 
[email protected]<mailto:[email protected]>
Subject: RE: [OSPF] Solicit feedbacks on 
draft-dong-ospf-maxage-flush-problem-statement

Jie -

From: Dongjie (Jimmy) [mailto:[email protected]]
Sent: Monday, August 01, 2016 1:44 AM
To: Les Ginsberg (ginsberg); [email protected]<mailto:[email protected]>
Cc: Zhangxudong (zhangxudong, VRP); 
[email protected]<mailto:[email protected]>
Subject: RE: [OSPF] Solicit feedbacks on 
draft-dong-ospf-maxage-flush-problem-statement

Hi Les,

Please see inline with [Jie]:

From: Les Ginsberg (ginsberg) [mailto:[email protected]]
Sent: Monday, August 01, 2016 3:09 PM
To: Dongjie (Jimmy); [email protected]<mailto:[email protected]>
Cc: Zhangxudong (zhangxudong, VRP); 
[email protected]<mailto:[email protected]>
Subject: RE: [OSPF] Solicit feedbacks on 
draft-dong-ospf-maxage-flush-problem-statement

Jie -

Fully agree that IS-IS and OSPF differ in this regard.

https://www.ietf.org/id/draft-ietf-isis-remaining-lifetime-01.txt addresses 
problems where corruption of the remaining lifetime occurs either during 
transmission/reception or due to some DoS attack. This isn't a concern with 
OSPF (hope you agree).

[Jie]: Yes, for OSPF the corruption during packet transmission can be detected.

What remains is the possibility that an implementation has some bug and 
unintentionally modifies the age to something other than what it should be due 
to the actual elapsed time since LSA generation. I suppose a mechanism 
equivalent to what the IS-IS draft defined i.e. setting the age to "new" (0 in 
OSPF case) when first receiving a non-self-generated LSA could be useful to 
prevent negative impacts of such an implementation bug. Is this what you intend?

[Jie]: More specifically, the problem could be caused by either "setting the LS 
age field incorrectly due to an implementation bug" or "the system timer running 
so fast that the LS age reaches MaxAge much earlier than on other routers". 
Another less likely case is that the LS age field is corrupted before the LSA 
is assembled into an OSPF packet.

[Jie]: Regarding the solution space, IMO we need to consider both cases: "LS 
age reaches MaxAge" and "LS age close to MaxAge". For IS-IS, RFC 6232 and RFC 
6233 provide solutions for the detection and identification of corrupted IS-IS 
purges, while OSPF does not have similar mechanisms.

[Les:] It is incorrect to say that RFC 6232 makes it possible to detect a 
corrupt purge. What it does do is to provide an indication as to which IS 
initiated a purge. I don't know how OSPF would address this issue, but for 
OSPFv2 at least any solution would likely not be backwards compatible. For this 
reason I suggest that you not try to address this issue in the same draft.

[Jie2]: Agreed. RFC 6232 provides a mechanism to track down the misbehaving 
routers so that the operator can fix the problem; the detection can be based on 
the rules in RFC 6233 or on some other anomalies. Indeed, for OSPFv2 legacy 
LSAs it is difficult to introduce a mechanism similar to RFC 6232, while it 
can be easier for the OSPFv2/v3 Extended LSAs. So it depends on how backward 
compatible the solution should be. I agree with you that the solution for 
Problem Localization in OSPF needs to be provided in a separate document.

Solutions to LS age corruption can be done in a backwards compatible way, but 
they MUST NOT result in discarding purges which pass authentication; doing so 
places you at risk of having inconsistent LSDBs in the network.

[Jie2]: Exactly. The received MaxAge LSAs cannot simply be discarded; the 
decision must be made carefully, probably based on some additional information. 
The authors have discussed some possible solutions internally, and will prepare 
some material for further open discussion.
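
A minimal sketch of the receive-side mitigation discussed in this thread: 
the age of a non-self-originated LSA is reset to 0 on first reception, while 
MaxAge LSAs are still processed as purges and never dropped. This is an 
illustration only, not a mechanism defined by the draft; the `Lsa` class, the 
`install_age` function and the router IDs are hypothetical.

```python
from dataclasses import dataclass

MAX_AGE = 3600  # seconds, MaxAge in RFC 2328


@dataclass
class Lsa:
    adv_router: str  # Advertising Router of the LSA
    ls_id: str
    seq: int
    age: int         # LS age as received, in seconds


def install_age(lsa: Lsa, my_router_id: str) -> int:
    """Return the LS age to store in the local LSDB for a received LSA."""
    if lsa.adv_router == my_router_id:
        # Self-originated LSA: normal RFC 2328 handling, not modelled here.
        return lsa.age
    if lsa.age >= MAX_AGE:
        # An authenticated purge must still be processed; dropping it would
        # risk inconsistent LSDBs across the network.
        return MAX_AGE
    # Hypothetical mitigation: ignore the received age and treat the LSA as new.
    return 0


nearly_expired = Lsa("1.1.1.1", "1.1.1.1", seq=5, age=3590)
purge = Lsa("1.1.1.1", "1.1.1.1", seq=5, age=3600)
print(install_age(nearly_expired, "2.2.2.2"))  # 0
print(install_age(purge, "2.2.2.2"))           # 3600
```

The sketch covers the "LS age close to MaxAge" case only; deciding what to do 
with a suspicious MaxAge LSA would need the additional information mentioned 
above.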

As written, the draft makes claims that are at least misleading - and I believe 
actually incorrect. In Section 6 you say:

"The LS age field may be altered as a result of
   packet corruption, such modification cannot be detected by LSA
   checksum nor OSPF packet cryptographic authentication."

This isn't correct.

[Jie] Thanks for pointing this out. This sentence needs to be revised to mention 
"LSA corruption" rather than "packet corruption".

What would be helpful - at least to me - is to move from a generic problem 
statement to the specific problem you want to solve and the proposed solution. 
This also requires you to more clearly state the cases where there is an actual 
vulnerability. It would be a lot easier to support the draft if this were done.

[Jie] Thanks for your suggestion. Yes we can update this draft with more 
specific problem statements as I mentioned above.

[Jie] As for the proposed solutions, the current draft specifies the 
requirements on the potential solutions, from which we envision that different 
solutions may be needed for "Impact Mitigation" and "Problem Localization". The 
solution for "Impact Mitigation" can be the easier one, and we can start 
discussing potential solutions for it now, while the solution for "Problem 
Localization" may need more consideration.

[Les:] A discussion of the requirements is useful and necessary, but IMO until 
you propose a solution there isn't enough substance for the document to become 
a WG document.

[Jie2] Yes, the current draft focuses on the problem statement and the 
requirements. The goal is first to get the MaxAge flush problem acknowledged 
and to reach consensus on the requirements. Then the plan is to specify the 
solutions in separate documents. Your valuable suggestions will be considered, 
and further contributions are welcome.

Best regards,
Jie

    Les

Best regards,
Jie

   Les


From: Dongjie (Jimmy) [mailto:[email protected]]
Sent: Sunday, July 31, 2016 11:48 PM
To: Les Ginsberg (ginsberg); [email protected]<mailto:[email protected]>
Cc: Zhangxudong (zhangxudong, VRP); 
[email protected]<mailto:[email protected]>
Subject: RE: [OSPF] Solicit feedbacks on 
draft-dong-ospf-maxage-flush-problem-statement

Hi Les,

Thanks for your comments.

OSPF packet-level checksum and authentication can only protect the assembled 
LSU packet for one hop on the wire; they cannot detect any change to an LSA 
made by the routers themselves. This is because OSPF packets are re-assembled 
at each hop, which is slightly different from IS-IS. So the problem for OSPF 
is mainly due to problems inside the router, for example protocol 
implementations, system timers, or some hardware problem. Actually this 
problem has been seen in several production networks.
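
To make the distinction concrete, here is a simplified sketch. It assumes a 
plain Fletcher-16 sum as a stand-in for the LS checksum (which, per RFC 2328, 
covers the LSA except the LS age field) and HMAC-SHA-256 as a stand-in for 
OSPF packet authentication; the key, the LSA bytes and all function names are 
hypothetical, and the "packet" is just the bare LSA.

```python
import hashlib
import hmac
import struct


def fletcher16(data: bytes) -> int:
    """Plain Fletcher-16; a stand-in for the real LS checksum calculation."""
    c0 = c1 = 0
    for byte in data:
        c0 = (c0 + byte) % 255
        c1 = (c1 + c0) % 255
    return (c1 << 8) | c0


def lsa_checksum(lsa: bytes) -> int:
    # The LS checksum skips the first two bytes of the LSA (the LS age).
    return fletcher16(lsa[2:])


def with_age(lsa: bytes, age: int) -> bytes:
    """Return a copy of the LSA with its LS age field replaced."""
    return struct.pack("!H", age) + lsa[2:]


def packet_digest(key: bytes, packet: bytes) -> bytes:
    return hmac.new(key, packet, hashlib.sha256).digest()


key = b"hypothetical-shared-key"
lsa = with_age(bytes(range(20)), 10)   # made-up 20-byte LSA, LS age 10
digest_on_wire = packet_digest(key, lsa)

# Case 1: the age is corrupted in transit. The digest no longer matches.
corrupted = with_age(lsa, 3600)
detected_in_transit = not hmac.compare_digest(
    packet_digest(key, corrupted), digest_on_wire)

# Case 2: a router alters the age internally, then assembles and signs a
# new packet. The next hop sees a valid digest and a valid LS checksum.
reflooded = with_age(lsa, 3600)
new_digest = packet_digest(key, reflooded)
next_hop_accepts = hmac.compare_digest(packet_digest(key, reflooded), new_digest)
checksum_unchanged = lsa_checksum(reflooded) == lsa_checksum(lsa)

print(detected_in_transit, next_hop_accepts, checksum_unchanged)
```

In this model, in-transit corruption is caught by the packet-level digest, 
whereas an age changed inside a router passes both checks at the next hop.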

We can improve the description in the draft to make this clear.

Best regards,
Jie

From: Les Ginsberg (ginsberg) [mailto:[email protected]]
Sent: Monday, August 01, 2016 1:30 PM
To: Dongjie (Jimmy); [email protected]<mailto:[email protected]>
Cc: Zhangxudong (zhangxudong, VRP); 
[email protected]<mailto:[email protected]>
Subject: RE: [OSPF] Solicit feedbacks on 
draft-dong-ospf-maxage-flush-problem-statement

Jie -

The draft says (Section 2):

"Since cryptographic authentication is executed at the OSPF packet
   level, it can only protect the assembled LSU packet for one hop and
   does not provide any additional protection for the corruption of LS
   age field."

But as authentication is calculated at the OSPF packet level, any change to the 
LS age field for an individual LSA contained within the OSPF packet (e.g. by 
some packet corruption in transmission) would cause authentication to fail when 
the packet is received. So the statement you make is not correct. I therefore 
am struggling to understand what problem you believe is not addressed by 
existing authentication techniques.

   Les



From: OSPF [mailto:[email protected]] On Behalf Of Dongjie (Jimmy)
Sent: Sunday, July 31, 2016 8:15 PM
To: [email protected]<mailto:[email protected]>
Cc: Zhangxudong (zhangxudong, VRP); 
[email protected]<mailto:[email protected]>
Subject: [OSPF] Solicit feedbacks on 
draft-dong-ospf-maxage-flush-problem-statement

Hi all,

draft-dong-ospf-maxage-flush-problem-statement describes the problems caused by 
the corruption of the LS Age field, and summarizes the requirements on 
potential solutions. This draft received good comments during the presentation 
at the IETF meeting in B.A.

The authors would like to solicit further feedback from the mailing list, on 
both the problem statement and the solution requirements. Based on the 
feedback, we will update the problem statement draft, and work together to 
build suitable solutions.

The URL of the draft is:
https://tools.ietf.org/html/draft-dong-ospf-maxage-flush-problem-statement-00

Comments and feedback are welcome.

Best regards,
Jie

_______________________________________________
OSPF mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/ospf
