Stephane,

I think the case you describe is "complicated".

There are two possible arrangements of PCEs in what you describe.
1. Both PCEs can control the same PCC
2. The PCEs are "stacked" in a hierarchical way.

Then you are interested in what happens when the PCC goes down.

In case 2, the child PCE acts as a "buffer" for requests and commands from the parent PCE that can't be immediately delivered to the PCC. I don't think any changes in behaviour are needed.

In case 1 my question is about how PCE1 knows that the PCC is down. Presumably the session went down. Now, PCE2 could be in active or passive standby mode.

1a. In active standby mode, PCE2 also has a session with the PCC so also knows that the PCC is down. (It is an operational choice whether one is dominant.) PCE1 and PCE2 do not need to coordinate on the status of the PCC, even though they (obviously) do coordinate on the status of the LSPs. I don't think any changes in behaviour are needed.

1b. In passive standby mode, PCE1 has a session to the PCC, but PCE2 would only open a session when PCE1 fails or when the PCC contacts it. The PCEs will coordinate (synchronise) their LSP-DBs, so the challenge here is when there is a second order failure - the PCC is down *and* PCE2 needs to take over as primary. In other words, when PCE2 takes over, it knows the state of the LSP but it doesn't know whether that state is stale because of the PCC failure. However, I think the first thing it does is try to connect to the PCC: if that fails it should assume that the state of the LSP is unknown; when PCE2 does manage to connect to the PCC it should verify the LSP state for all LSPs.

Note that there may also be some synchronisation windows. For example, PCE1 has sent a PCInitiate to the PCC and the PCC has failed.
PCE1 has no way to know whether the LSP was set up before the failure and it was only the report that was not received (actually, the PCE could discover this from inspection of the rest of the network, but normally this is not done). In this case, PCE1 cannot synch to PCE2 accurately using "normal" ideas of "LSP exists". But (of course) it could synch *exactly* what it knows: viz. I tried to set up this LSP and I don't know (yet) whether it was successful. (This window exists for all operations.)

Of course, there are always windows in the synchronisation procedure. For example, PCE1 could issue a command to the PCC and fail before it manages to synch to PCE2.

Some people might say that this is simply a DB synchronisation problem. While synchronising final state is essential, there may be other things (such as intended state, message exchanges, etc.) that could also be considered for synchronisation. In fact, a totally stable synchronisation requires synch'ing intent to issue a command.

This is all a well-known and solved problem (for DB synch and for redundant protocol components). The question for the WG is the trade-off between edge cases and cost/complexity. However, I don't think there is a change to PCE-PCC protocol.

</rambling>

Adrian

From: [email protected] [mailto:[email protected]]
Sent: 23 June 2017 08:09
To: Dhruv Dhody; Alexander Vainshtein; [email protected]
Cc: Marina Fizgeer; Michael Gorokhovsky; [email protected]; [email protected]; Alexander Ferdman
Subject: RE: [Pce] Is there any activity related to PCE graceful restart?

Hi Dhruv,

If the PCE keeps a "stale state" from the PCC, in the framework of state-sync, we need a way to indicate to other PCEs that the state is stale, so if the master PCE is another PCE, it will not try to update the state for this LSP until the PCC is back online.
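
The stale-state marking described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `LspRecord`, `LspDb`, `stale`, `set_pcc_reachable` and `may_update` are hypothetical and are not taken from PCEP or from any state-sync document. The sketch only shows the intended behaviour: a PCE that learns the PCC is down flags the LSPs of that PCC, and a master PCE refuses to update a flagged LSP until the PCC is back.

```python
from dataclasses import dataclass


@dataclass
class LspRecord:
    name: str
    pcc: str             # PCC that owns (reports) this LSP
    stale: bool = False  # hypothetical flag shared through state-sync


class LspDb:
    """Toy LSP-DB; an assumption for illustration, not a real PCE API."""

    def __init__(self):
        self.records = {}

    def add(self, record):
        self.records[record.name] = record

    def set_pcc_reachable(self, pcc, reachable):
        """Mark all LSPs of a PCC stale (or fresh); return their names."""
        changed = []
        for record in self.records.values():
            if record.pcc == pcc:
                record.stale = not reachable
                changed.append(record.name)
        return changed

    def may_update(self, name):
        """A master PCE should not update an LSP whose state is stale."""
        return not self.records[name].stale


if __name__ == "__main__":
    db = LspDb()
    db.add(LspRecord("LSP1", "PCC1"))
    db.add(LspRecord("LSP2", "PCC2"))
    db.set_pcc_reachable("PCC1", False)   # session to PCC1 went down
    print(db.may_update("LSP1"), db.may_update("LSP2"))
```

Whether such a flag belongs in the protocol or stays a local matter of each PCE is exactly the complexity-versus-gain question raised in this thread; the sketch takes no position on that.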
Brgds,

From: Dhruv Dhody [mailto:[email protected]]
Sent: Thursday, June 22, 2017 11:23
To: LITKOWSKI Stephane OBS/OINIS; Alexander Vainshtein; [email protected]
Cc: Marina Fizgeer; Michael Gorokhovsky; [email protected]; [email protected]; Alexander Ferdman
Subject: RE: [Pce] Is there any activity related to PCE graceful restart?

Hi Stephane, Sasha,

I agree with your overall assessment. Finer point inline.

From: Pce [mailto:[email protected]] On Behalf Of [email protected]
Sent: 22 June 2017 13:44
To: Alexander Vainshtein <[email protected]>; [email protected]
Cc: Marina Fizgeer <[email protected]>; Michael Gorokhovsky <[email protected]>; [email protected]; [email protected]; Alexander Ferdman <[email protected]>
Subject: Re: [Pce] Is there any activity related to PCE graceful restart?

Hi Sasha,

As Dhruv mentioned, restarting a PCE is not a big deal; we already have the mechanism defined to handle this without traffic disruption.
Your email also mentions PCC control plane restart, which is a bit more tricky IMO.

From a PCC point of view, I think you request the PCC to keep the dataplane intact when the PCC process or RSVP process or IS-IS is going down (during failure, restart or upgrade). For the PCC process, I think this could be addressed by a purely local mechanism.

[[Dhruv Dhody]] I agree. IMO the PCC process can rebuild its state from a local mechanism, say by data plane (SR label stack), by learning from TE manager, RSVP-TE process etc. The session establishment with PCE should be triggered once the state is built, so that state synchronization (and optimizations) work well.
And in case this is not possible, the traffic can continue to flow on the old path until a new path is signaled after restart. Is there a case where the traffic gets impacted and a protocol mechanism can solve it?

Now I see a case where the PCE needs to keep the state from a PCC when the PCC restarts =>

[[Dhruv Dhody]] Also, when to delete the state learned from the PCC at the stateful PCE is a local matter of the PCE. The state timeout is specified from the PCC point of view; at the PCE the behavior is up to the implementation. For disjoint cases, good to keep the state longer at PCE :)

Regards,
Dhruv

my favorite disjointness use case or any other use case where LSPs on other PCCs depend on the LSP of the PCC which is restarting. Let's say that PCC1 owns LSP1, PCC2 owns LSP2. LSP1 and LSP2 depend on each other. If the PCE loses the state of LSP1 because PCC1 restarts, it may reroute LSP2 on a path that does not fulfill the dependency of the two LSPs anymore while at the same time LSP1 was kept intact by PCC1 from a forwarding plane point of view. Is it a critical issue? During a transient period (the PCC restart), some LSPs may not fulfil their constraint anymore. Does it justify extensions to the protocol? I do not have a strong opinion on that: it's always a question of complexity to introduce vs gain.

Adrian, Dhruv, did I miss something? Am I wrong?

Brgds,
Stephane

From: Pce [mailto:[email protected]] On Behalf Of Alexander Vainshtein
Sent: Monday, June 19, 2017 14:48
To: [email protected]
Cc: Michael Gorokhovsky; [email protected]; [email protected]; Marina Fizgeer; Alexander Ferdman
Subject: Re: [Pce] Is there any activity related to PCE graceful restart?

Adrian,

Lots of thanks for a prompt response.

However, our primary interest is the control plane (including PCC) restart in a network element with separated control and forwarding planes.
Specifically, my colleagues and I are trying to understand how to make such a restart non-traffic-affecting while:
- The network uses Segment Routing for setting up paths computed by the PCE. This means that these paths are only known to their respective head-end nodes. This situation is different from the scenario where RSVP-TE is used to signal these paths, since they cannot be re-learned from the neighbors as part of the RSVP-TE graceful restart procedures
- The protocols used for distributing SR-related information (i.e., IGP and BGP SR extensions) are GR-capable, and GR for them is enabled in the network
- The PCE is an active stateful PCE, i.e., it instructs the head-end node which paths should be set up without any requests coming from the nodes.

Hopefully this clarifies the context for our question.

It may well be that the requirement for non-traffic-affecting control plane restart can be addressed without any changes to the existing protocols. Alternatively, it is possible that some minor changes (like making the PCE aware of separation between the control and forwarding planes, negotiation of GR capabilities and grace periods etc.) are required.

Any inputs would be highly appreciated.

Regards, and lots of thanks in advance,
Sasha

Office: +972-39266302
Cell: +972-549266302
Email: [email protected]

From: Adrian Farrel [mailto:[email protected]]
Sent: Sunday, June 18, 2017 8:34 PM
To: Alexander Vainshtein <[email protected]>; [email protected]; [email protected]; [email protected]
Cc: Marina Fizgeer <[email protected]>; Michael Gorokhovsky <[email protected]>; [email protected]; Alexander Ferdman <[email protected]>
Subject: RE: [Pce] Is there any activity related to PCE graceful restart?

Sasha,

What are you hoping to achieve?
That a restarting PCE can retain its TED and LSP-DB?
That a restarting PCE can synch state with the network?
That a restarting PCE with outstanding (unanswered) messages in either direction does not need to resend them?
That a restarting PCE can resend outstanding (unanswered) messages without problems caused by duplication?

I think you may want to read around the definition of the request-id. Although 5440 doesn't make it explicit, a lot comes from how you process the request-id. That "a lot" arose from consideration of parallel sessions and distilled to not needing to write about restart.

Cheers,
Adrian (resurrecting old memories, possibly not entirely accurately)

From: Pce [mailto:[email protected]] On Behalf Of Alexander Vainshtein
Sent: 18 June 2017 16:43
To: [email protected]; [email protected]; [email protected]
Cc: Marina Fizgeer; Michael Gorokhovsky; [email protected]; Alexander Ferdman
Subject: Re: [Pce] Is there any activity related to PCE graceful restart?

Re-sending with the correct WG mailing list address.

Regards,
Sasha

Office: +972-39266302
Cell: +972-549266302
Email: [email protected]

From: Alexander Vainshtein
Sent: Sunday, June 18, 2017 6:41 PM
To: '[email protected]' <[email protected]>; '[email protected]' <[email protected]>; '[email protected]' <[email protected]>
Cc: '[email protected]' <[email protected]>; Michael Gorokhovsky <[email protected]>; Marina Fizgeer <[email protected]>; Alexander Ferdman <[email protected]>
Subject: Is there any activity related to PCE graceful restart?

Hi all,

My colleagues and I tried to find any work in the PCE WG related to PCEP graceful restart. So far, we did not succeed.

This could mean one of the following:
- Our search did not go deep enough. In this case pointers to any specific documents would be highly appreciated
- Such work does not exist because (for some reason) it is not required. This looks problematic to me, especially if we deal with a stateful active PCE and the path computed by the PCE was implemented using Segment Routing (so that only the head end of the computed path is aware of the path).
However, I could have missed something obvious, and any clarifications would be highly appreciated
- Such work is required but, so far, nobody has taken care of it. The implications are obvious:-(.

Your feedback would be highly appreciated.

Regards, and lots of thanks in advance,
Sasha

Office: +972-39266302
Cell: +972-549266302
Email: [email protected]

___________________________________________________________________________
This e-mail message is intended for the recipient only and contains information which is CONFIDENTIAL and which may be proprietary to ECI Telecom. If you have received this transmission in error, please inform us by e-mail, phone or fax, and then delete the original and all copies thereof.
___________________________________________________________________________

_________________________________________________________________________________________________________________________
This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified. Thank you.
_______________________________________________
Pce mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/pce
