Re: [Pce] Questions about PCE Stateful Synchronisation

Adrian Farrel Thu, 22 Oct 2015 09:11:06 -0700

Just to pitch in and agree a bit more.
 
Since we're running PCEP over TCP, the loss of a message appears to be due to 
an implementation malfunction. As Dhruv says, this puts PCEP in the same 
position as BGP. It's true that implementation glitches can happen (with some 
people's code ;-) but consider how BGP spots a route that has not been 
advertised and what it does about it.
 
Now, if we are really concerned, we could consider some form of redundancy 
check across all LSPs between a PCE and PCC. For example, the PCE could send a 
synch summary request, the PCC could hash across all LSPs (using some set of 
fields) and send the result in a synch summary response, and the PCE could use 
this to check whether the PCC's set of LSPs matches those it knows about. If 
the hashes do not match, the PCE would need to do a complete resynch because 
neither it nor the PCC can know which LSPs are out of synch.
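
A minimal sketch of that summary check, to make the idea concrete. Everything here is an assumption for illustration: the field names (plsp_id, name, oper_state), the choice of SHA-256, and the helper names are hypothetical, and no such summary message exists in the drafts under discussion.

```python
import hashlib

# Hypothetical set of per-LSP fields that both peers agree to hash over.
SUMMARY_FIELDS = ("plsp_id", "name", "oper_state")


def lsp_db_digest(lsps, fields=SUMMARY_FIELDS):
    """Return an order-independent digest over the chosen fields of all LSPs."""
    # Build one canonical record per LSP, then sort so that both peers
    # feed the hash in the same order regardless of storage order.
    records = sorted("|".join(str(lsp[f]) for f in fields) for lsp in lsps)
    h = hashlib.sha256()
    for rec in records:
        h.update(rec.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def needs_full_resync(pce_view, pcc_digest):
    """PCE side: compare the PCC's summary with a digest of the local view."""
    return lsp_db_digest(pce_view) != pcc_digest


pcc_lsps = [
    {"plsp_id": 1, "name": "to-pe2", "oper_state": "up"},
    {"plsp_id": 2, "name": "to-pe3", "oper_state": "down"},
]
pce_lsps = list(reversed(pcc_lsps))   # same content, different order
pce_stale = pcc_lsps[:1]              # the PCE never learned about LSP 2
```

With these inputs, needs_full_resync(pce_lsps, lsp_db_digest(pcc_lsps)) is False and needs_full_resync(pce_stale, lsp_db_digest(pcc_lsps)) is True. A mismatch says only that the two views differ, not where, which is why a complete resynch would follow.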
 
However, before doing this, we might like to consider how often this sort of 
out of synch condition might occur.
 
Ciao,
Adrian
 
 
 
From: Pce [mailto:[email protected]] On Behalf Of Robert Varga
Sent: 22 October 2015 12:20
To: Dhruv Dhody; Olivier Dugeon; Dhruv Dhody
Cc: [email protected]
Subject: Re: [Pce] Questions about PCE Stateful Synchronisation
 
Hello,


I completely agree with Dhruv.

The design goal for the stateful extension was to minimize control plane 
overhead in the usual steady-state case, so that the resources of the PCE 
(which typically talks to many head-end nodes) are used towards resolving 
failures. This concern was always the first one raised by both the PCE and PCC 
implementers I talked to.

The resynchronization mechanism proposed is meant to be used as a background 
maintenance task, which 'scrubs' the TED and takes a back seat to recovering 
the network when LSPs churn. This cannot be done if the PCCs re-publish the 
same steady state over and over -- for a network of 1M LSPs with a 1 hour 
refresh, that amounts to ~278 PCRpt/s just to maintain steady state. The PCE 
implementers I have talked to have found this mechanism sufficient to maintain 
enough consistency to keep operating even when such bugs are not caught by QA.
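
As a quick check of that figure, here is a small sketch. It assumes one PCRpt per LSP per refresh interval, spread evenly over the interval; the function name is made up for illustration.

```python
def steady_state_report_rate(num_lsps, refresh_seconds):
    """Average PCRpt messages per second needed just to refresh every LSP once
    per interval (assumes one report per LSP, evenly spread)."""
    return num_lsps / refresh_seconds


rate = steady_state_report_rate(1_000_000, 3600)
print(f"{rate:.0f} PCRpt/s")  # prints "278 PCRpt/s"
```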

Anyway, the specification does not prohibit this behavior on the PCC's part, 
and if a need arises for the PCE to request this functionality, it can be 
negotiated as a separate extension at session startup.

Bye,
Robert

On 10/22/2015 06:58 AM, Dhruv Dhody wrote:
Hi Olivier,
 
IMHO PCEP is much closer to BGP (when compared to an IGP), and in that respect 
the PCE-triggered full re-sync is similar to BGP’s route-refresh mechanism. So 
I guess we are re-using a similar concept, which scales well too :)
 
As you also mention, such an issue might be seen on a long-running session, so 
this re-sync, if needed, would only happen after a long time. 
IMHO sending PCRpt repeatedly, say every 3600 seconds, for all LSPs would be 
costly too. 
 
I will leave it to the authors of the base draft to answer your other question. 
In the implementations that I am aware of, that is a matter of local policy, 
usually set to report all LSPs.  
 
Regards,
Dhruv
 
From: Pce [mailto:[email protected]] On Behalf Of Olivier Dugeon
Sent: 21 October 2015 19:43
To: Dhruv Dhody
Cc: [email protected]
Subject: Re: [Pce] Questions about PCE Stateful Synchronisation
 
Hello Dhruv,

Thanks for pointing me to this section, but I had read the draft before asking 
my questions. For me, it does not completely cover this issue, and my answer is 
Yes and No.

Yes, because the mechanism allows such re-synchronization without closing the 
PCEP session, but at the price of a PCUpd / PCRpt exchange where a single 
PCRpt message would be sufficient. It solves the problem of orphan LSPs. In 
this case, I agree that when the PCE asks to update an orphan LSP it gets an 
error (I suppose so, as it is not specified in the draft).

And No, because it does not completely solve the issue if the PCE is not aware 
of an LSP. The PCE cannot request an update about an LSP it does not know about. 

So, the only working solution for the PCE is to regularly ask all PCCs to send 
their complete LSP-DB to be sure that it is always synchronized, but this is a 
very costly mechanism.

In addition, I have some doubt that this mechanism scales well. For large 
networks, say more than 1000 PE routers, and so more than 1000 PCEP sessions, 
with a full mesh of TE tunnels (I know it is an extreme case), this drastically 
increases the exchanges between the PCE and PCCs and adds extra CPU processing 
for the PCE. The latter must either parse 1,000,000 tunnels in its LSP-DB and 
ask PCCs to refresh them, or ask all PCCs to update all their tunnels. In the 
first case the PCE spends its time sending PCUpd messages and waiting for the 
corresponding PCRpt answers, and in the second case it spends its time parsing 
1000 tunnels each time it asks a PCC for a full refresh.
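
The orders of magnitude above can be reproduced with a short sketch, assuming a full mesh of unidirectional tunnels with one tunnel from every PE to every other PE (the function name is hypothetical):

```python
def full_mesh_tunnels(num_pe):
    """Return (tunnels headed by each PCC, total tunnels) for a full mesh of
    unidirectional TE tunnels between num_pe routers."""
    per_pcc = num_pe - 1
    return per_pcc, num_pe * per_pcc


per_pcc, total = full_mesh_tunnels(1000)
print(per_pcc, total)  # prints "999 999000"
```

That is roughly 1000 tunnels per PCC and 1,000,000 tunnels in the PCE's database.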

Periodic flooding of link state in IGP protocols has long been proven to scale 
well, even in very large networks. Why not re-use a similar concept?

And, last but not least, what happens with LSP tunnels that are configured on 
routers that are not PCEP-enabled? Are they simply ignored?

Regards,

Olivier
On 21/10/2015 15:01, Dhruv Dhody wrote:
Hi Olivier, 
 
I wanted to bring "PCE triggered re-synchronization" 
[https://tools.ietf.org/html/draft-ietf-pce-stateful-sync-optimizations-03#section-6]
 to your notice. This can be used by a PCE to periodically re-synchronize the 
database without bringing down the PCEP session. 
 
Will this not cover the issue you have in mind? 
 
Regards,
Dhruv
 
 
On Wed, Oct 21, 2015 at 3:29 PM, Olivier Dugeon <[email protected]> 
wrote:
Dear authors of draft-ietf-pce-stateful and 
draft-ietf-pce-stateful-sync-optimizations,

I know that we are in the last mile before publishing the PCE Stateful draft 
collection as RFCs, but regarding the chairs' review, I have a general 
question about synchronisation. Even though the I-Ds try to avoid it, I'm 
afraid there will be various cases where de-synchronisation between PCCs and 
PCEs cannot be avoided. In particular, when there is a problem that is not a 
real failure (a bug, memory saturation, or whatever malfunction could occur on 
the PCE or PCC side), a PCE could miss a PCRpt message from a PCC or, 
conversely, a PCC could fail to send a PCRpt message to a PCE. I'm also afraid 
that, after a long lifetime (say, several weeks or months), some orphan LSPs 
will appear in the PCE's LSP database with no way to detect and remove them.

To get back to a fully synchronised state, it is then necessary to restart the 
PCEP session properly, i.e. to force a re-synchronisation. But to do that, you 
first need to discover the problem. That's another topic.

So, my question is: why did you not use a mechanism similar to that of routing 
protocols, i.e. OSPF, IS-IS or BGP, to periodically send LSP state from the PCC 
to the PCE? Using an 'out of date' indication would allow the PCE to remove 
'out of date' LSPs from its LSP-DB, as OSPF does when it flushes an LSA whose 
age reaches 3600 from its TED.

It would be sufficient to add a new statement in draft-ietf-pce-stateful 
(e.g., in section 9.1.  Control Function and Policy) stating that:
 - the PCC MUST send a PCRpt message on a regular basis, before MAX_AGE expires.
 - the PCE MUST ignore LSPs that have not been refreshed for a period of time 
greater than MAX_AGE.
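
The two rules above can be sketched on the PCE side as follows. This is only an illustration under stated assumptions: the LspDb class, its method names, and the (pcc_id, lsp_id) key are hypothetical, MAX_AGE is taken as 3600 seconds by analogy with OSPF, and stale entries are removed rather than merely ignored.

```python
import time

MAX_AGE = 3600.0  # seconds; assumed value, by analogy with OSPF


class LspDb:
    """Toy PCE-side LSP database that ages out entries not refreshed in time."""

    def __init__(self, max_age=MAX_AGE, clock=time.monotonic):
        self._max_age = max_age
        self._clock = clock          # injectable so the ageing can be tested
        self._last_report = {}       # (pcc_id, lsp_id) -> time of last PCRpt

    def on_pcrpt(self, pcc_id, lsp_id):
        """Record (or refresh) an LSP when a PCRpt for it is received."""
        self._last_report[(pcc_id, lsp_id)] = self._clock()

    def purge_stale(self):
        """Remove and return every LSP whose last report is older than max_age."""
        now = self._clock()
        stale = [key for key, seen in self._last_report.items()
                 if now - seen > self._max_age]
        for key in stale:
            del self._last_report[key]
        return stale

    def __contains__(self, key):
        return key in self._last_report


# Usage with a fake clock: LSP 1 is refreshed, LSP 2 is never reported again.
now = [0.0]
db = LspDb(clock=lambda: now[0])
db.on_pcrpt("pcc1", 1)
db.on_pcrpt("pcc1", 2)
now[0] = 3000.0
db.on_pcrpt("pcc1", 1)
now[0] = 3601.0
stale = db.purge_stale()   # LSP 2 has aged out, LSP 1 is kept
```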

Then, two cases are possible:
 a) MAX_AGE is fixed in the RFC, e.g. to 3600 seconds as in OSPF (seems 
reasonable)
 b) MAX_AGE is negotiated/exchanged during PCEP session establishment or when a 
PCRpt message is sent

Option (a) is quite simple but not flexible; it has the great advantage of not 
introducing a new PCEP Object. Option (b) needs a new PCEP Object definition, 
but provides greater flexibility.

If we agree on the statement above, I think that option (a) is sufficient and 
just needs additional text in the current draft, whereas if we want to support 
option (b), I could work on a new draft.

Regards,

Olivier
-- 

Olivier Dugeon
Orange Expert, Future Networks
Open Source Referent
Orange/IMT/OLN/WTC/IEE/OPEN

fixe : +33 2 96 05 28 80
mobile : +33 6 82 90 37 85
[email protected] <mailto:[email protected]>

_______________________________________________
Pce mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/pce
 
 



