Hi Dmytro,

Anything that is not clear in the existing RFCs can still be clarified in the operational draft (that is the purpose of that draft):
https://www.ietf.org/archive/id/draft-ietf-pce-operational-01.html
or in:
https://www.ietf.org/archive/id/draft-many-pce-stateful-amendment-02.html
Or we can even start a completely new draft if some of those topics are "too big". Even if there is no conclusion about which implementation is correct, such a draft can at least document the possible approaches, so that future implementations will not need to figure them out through interop testing (as in your case).

For any Cisco PCC specific issues or clarifications, you can also reach out to me privately, and I can try to explain or forward your question to somebody who can help.

Regards,
Samuel

From: Dmytro Shypovalov <dmy...@vegvisir.ie>
Sent: Tuesday, July 1, 2025 4:42 PM
To: Samuel Sidor (ssidor) <ssi...@cisco.com>
Cc: olivier.dug...@orange.com; pce@ietf.org
Subject: Re: [Pce] Re: PCEP delegation and dual-PCE redundancy - too vendor-specific?

Hi Samuel,

I had a few other problems with Cisco PCC, such as sync working in a weird way (Cisco sends all LSPs in down state, then the sync-complete report, and only then updates the LSP status), 1 extra byte added to the symbolic path name TLV, and some mess with LSP-ID, but through trial and error I managed to solve those. On the other hand, LSP delegation works better on Cisco PCC.

Juniper PCC has a different set of problems (and so, I'm sure, do Nokia, Huawei, IP Infusion etc.). My main concern, however, is not that vendor A is better than vendor B, but the poor quality of the PCEP RFCs, which are very vague and inconsistent and do not cover many scenarios, so each vendor implements its own version of PCEP that is not very compatible with the others.

Regards,
Dmytro

On Tue, 1 Jul 2025 at 10:17, Samuel Sidor (ssidor) <ssi...@cisco.com> wrote:

Hi Dmytro,

I'm glad to see that integration with Cisco PCC was smoother for you.

You can also check the configuration "segment-routing traffic-eng pcc redundancy [pcc-centric | pce-centric]". The PCC-centric model is the default on Cisco PCCs, and it follows the logic you already described (PCC driving delegation for PCE-initiated LSPs).
If you want behavior that is more aligned with Juniper, you can try switching to the PCE-centric model, where the PCC relies on the PCEs to reclaim or drop delegation for PCE-initiated LSPs. As you have already figured out, though, it requires more complex logic on the PCE (or synchronization between PCEs, e.g. via state-sync, https://datatracker.ietf.org/doc/draft-ietf-pce-state-sync/) and/or tweaking with timers.

Regards,
Samuel

From: Dmytro Shypovalov <dmy...@vegvisir.ie>
Sent: Tuesday, July 1, 2025 10:39 AM
To: olivier.dug...@orange.com
Cc: pce@ietf.org
Subject: [Pce] Re: PCEP delegation and dual-PCE redundancy - too vendor-specific?

Hi Olivier,

Thanks for your feedback. I did more reading and lab testing. I also tried to test the approach I described in my first email, and it did not work very well. RFC8231 allows different LSPs to be delegated to different PCEs, and while Cisco PCC usually elects a primary PCE and delegates all LSPs to it, Juniper PCC delegates each LSP to whichever PCE created it and keeps it delegated there.

Then I took a different approach: I implemented delegation and de-delegation on both PCEs, working per LSP (so there is no primary/secondary; it all depends on where the PCC chooses to delegate). An obvious problem here is what happens if both PCEs try to send PCInitiate for a new LSP simultaneously: the PCC would then complain that the symbolic path name is in use and drop the session. This scenario is not covered by the RFCs, so I solved it by implementing a configurable "init-delay" timer. The idea is that one of the PCEs (let's say the secondary) has a 5-10 second delay when initiating a new LSP. If, within this time, the other PCE (without the delay timer) has already initiated the LSP, we receive a PCReport, so we know the LSP status and there is no need to send PCInitiate.
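
The init-delay idea described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the actual implementation: the class `InitDelayPce`, its methods, and the explicit `now` argument are hypothetical names, and no real PCEP messages are sent. The caller is assumed to send PCInitiate for whatever `due()` returns.

```python
from dataclasses import dataclass, field


@dataclass
class InitDelayPce:
    """Hypothetical PCE-side bookkeeping for the init-delay timer."""
    init_delay: float                             # seconds; 0 on the PCE without delay
    pending: dict = field(default_factory=dict)   # symbolic path name -> due time
    reported: set = field(default_factory=set)    # names already seen in a PCReport

    def want_lsp(self, name, now):
        """Schedule a PCInitiate unless the PCC has already reported this LSP."""
        if name not in self.reported:
            self.pending[name] = now + self.init_delay

    def on_report(self, name):
        """A PCReport means the LSP exists, so cancel our pending PCInitiate."""
        self.reported.add(name)
        self.pending.pop(name, None)

    def due(self, now):
        """Return the names whose delay has expired; the caller sends PCInitiate."""
        ready = sorted(n for n, t in self.pending.items() if t <= now)
        for n in ready:
            del self.pending[n]
        return ready


# Both PCEs want the same new LSP at time 0.
primary = InitDelayPce(init_delay=0.0)
secondary = InitDelayPce(init_delay=5.0)
primary.want_lsp("lsp-a", now=0.0)
secondary.want_lsp("lsp-a", now=0.0)

first = primary.due(now=0.0)        # ["lsp-a"]: the primary initiates at once
early = secondary.due(now=0.0)      # []: the secondary is still waiting
secondary.on_report("lsp-a")        # PCReport arrives from the PCC
late = secondary.due(now=10.0)      # []: nothing left to initiate
```

Passing `now` explicitly instead of reading a clock keeps the sketch deterministic; a real PCE would presumably drive this from its own timer loop.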
Another problem is that Juniper PCC does not redelegate. Instead, if the delegated PCE dies, Juniper sends a PCReport with the R flag and all fields in the LSP identifier set to 0 (similar to the special PCReport sent when sync is complete). I implemented a trigger: if such a report is received (and the LSP is still alive), we send an immediate PCInitiate for this LSP, disregarding the init-delay timer.

So far tests look good with both Cisco and Juniper, but I will look into this more and try to introduce different scenarios to break the setup. I haven't implemented PCC-initiated LSPs yet, but I think there will be fewer problems with race conditions during delegation like the ones I described above. I also need to try more PCC implementations to see whether someone has implemented it in yet another way.

If only there was an organisation that developed standards that all vendors would follow, instead of each vendor implementing custom non-compatible solutions...

Regards,
Dmytro

On Mon, 30 Jun 2025 at 18:10, <olivier.dug...@orange.com> wrote:

Hello Dmytro,

I ran into the same problem during my various tests. I noted that it is purely a PCC problem and that the behavior remains the same whatever the PCE.

IMHO, the correct behavior is the Cisco one, i.e. when the primary PCE fails and the PCC detects that the PCEP session has gone down, it starts to re-delegate to the secondary PCE all of its LSPs that have been initiated by the primary PCE or updated by it (i.e. the LSPs configured by the PCC but delegated to the primary PCE).

Regarding your questions:

1/ Some routers (see below) implement a priority parameter for the PCE session. Thus, the PCC will delegate tunnels to the PCE with the highest priority. But I agree that it is vendor dependent. On the PCE side, the PCE simply fills its LSP_DB with the reported LSPs, with the delegate bit set to 1 or 0. But this does not indicate whether the PCE is the primary or the secondary. You could e.g. split your network into 2 independent parts, where each PCE is both primary for its part of the network and secondary for the other one. Again, it is a matter of vendor implementation to specify whether a PCE is a primary or a secondary one. To my knowledge, vendors prefer to offer redundancy within a single PCE (e.g. a cluster of PCEs).

2/ I agree with this expected behavior. Again, the PCE will not be able to determine that it is now the primary. But, upon re-delegation, it will check that its LSP_DB is in sync with the LSPs reported by the PCCs and make the appropriate adjustments. Normally there are none if the PCCs did not remove LSPs when they detected that the PCEP session with the primary PCE went down.

3/ This is the normal behavior. The secondary PCE stores in its LSP_DB the LSPs initiated by the primary PCE, without delegation. When the PCCs report those LSPs to the secondary PCE with delegation, the secondary PCE must perform reconciliation (see point above).

Note: I also know that on Juniper, when the PCEP session goes down, the router starts by removing all initiated LSPs. This behavior can be tuned with the `lsp-cleanup-timer` parameter. By default, it was set to 0 (old JunOS versions) and is now set to 60 sec. Thus, adjusting this timer to a higher value, e.g. 3600 sec, perhaps leaves more time for the router to re-delegate the LSPs to the secondary PCE as expected. There is also a `delegation-priority` parameter, which could influence the choice between primary and secondary PCE and thus the re-delegation (I have not tried it so far).

Another way to improve this kind of high availability is to set up a Virtual IP address between both PCE instances. The primary PCE is the one that holds the Virtual IP address.

Independently of the high availability scenario, this is also a problem that occurs when the PCE reboots, i.e. all PCEP sessions go down and, depending on the configuration, some PCCs will remove their LSPs while others keep them up for long enough for the PCE (or a new one) to start up again.

Best Regards
Olivier

On 24/06/2025 at 16:38, Dmytro Shypovalov wrote:

Dear PCE working group,

I am working on a PCE for SR-TE and am currently trying to implement dual-PCE redundancy. I prefer to follow the RFCs and not implement any proprietary inter-PCE protocols unless absolutely necessary. PCEP has so far not been a very standard-friendly protocol in my opinion, because every vendor implements things differently and there is not much interoperability. I managed to make the single-PCE scenario somewhat work, although different vendors have different logic for handling state sync and LSP-ID state.

So my question is how dual-PCE redundancy is supposed to work; let's say, for now, for PCE-initiated LSPs only. RFC8231 section 5.7 (https://datatracker.ietf.org/doc/html/rfc8231#section-5.7) describes the procedures, but again vendors implement it differently.

Cisco delegates the LSP to the PCE that initiated it and reports it to the other PCE (without the delegate flag). If this PCE fails, Cisco redelegates to the second PCE, and the LSP remains delegated to it forever.

Juniper also delegates to the primary PCE and reports to the other without the delegate flag, but if the primary PCE fails, Juniper does not redelegate; instead, it waits a bit and then removes the LSP. My understanding is that it expects the secondary PCE to re-initiate the LSP at this point.

I haven't tested Nokia, Huawei and others, but I wouldn't be surprised if they also have different ways of handling this.

So my question is: how to properly implement dual-PCE redundancy? Is there a standard way, or does it rely on vendor-proprietary mechanisms?

I am thinking of the following logic:

1. When the session comes up, wait until state sync (message with all zeros) + some timer; then
   * If we received PCReports with the delegate flag (and some of those match our LSPs) -> assume the PCC elected us as primary PCE; send PCUpdate for these delegated LSPs, send PCInit for our LSPs not seen in PCReports
   * If we received PCReports matching our LSPs but without the delegate flag -> assume the PCC elected us as secondary PCE; do not send PCUpdate or PCInit
   * If we did not receive any PCReports matching our LSPs - [this is the tricky part - we can't send PCInit because if 2 PCEs send it simultaneously, the PCC will complain] - maybe start a random timer and try PCInit after it, so the other PCE will receive reports with our LSPs?
2. When the PCC re-delegates LSPs to us -> assume we are now elected as primary PCE; send PCUpdate for delegated LSPs and PCInit for LSPs not seen in PCReports
3. When the PCC removes LSPs but they are still active on the PCE side -> assume Juniper-style redundancy; send PCInit for those LSPs

I think this will work based on my observations so far, but I want to ask the PCE experts just in case: what is the IETF way, and are there any implementations that interoperate with different vendors?

Maybe the more intelligent approach would be implementing draft-ietf-pce-state-sync-09, but from reading it I'm not sure that it will work in all scenarios, including Juniper PCC. So it could add more problems with split brain while not solving the original issue.

Please let me know what you think.
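
The first step of the logic above (deciding what to do once state sync plus the extra timer has elapsed) might be sketched in Python like this. It only illustrates the proposal in this message and is not behavior defined by any RFC. `Report`, `actions_after_sync` and the shape of the returned dictionary are hypothetical, and LSPs are assumed to be matched by symbolic path name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical, simplified view of a PCReport received during sync."""
    name: str        # symbolic path name
    delegated: bool  # delegate flag in the report


def actions_after_sync(our_lsps, reports):
    """Decide what to send after state sync, following step 1 above."""
    ours = set(our_lsps)
    # Keep only the reports that match LSPs this PCE wants to exist.
    seen = {r.name: r.delegated for r in reports if r.name in ours}

    if any(seen.values()):
        # Some of our LSPs are delegated to us: assume we are primary.
        return {
            "role": "primary",
            "update": sorted(n for n, d in seen.items() if d),
            "initiate": sorted(ours - set(seen)),
            "backoff": False,
        }
    if seen:
        # Our LSPs exist but are not delegated to us: stay passive.
        return {"role": "secondary", "update": [], "initiate": [],
                "backoff": False}
    # None of our LSPs were reported: wait (e.g. a random timer) before
    # trying PCInit, so that two PCEs do not initiate simultaneously.
    return {"role": "unknown", "update": [], "initiate": [], "backoff": True}


result = actions_after_sync(
    ["lsp-a", "lsp-b", "lsp-c"],
    [Report("lsp-a", True), Report("lsp-b", False), Report("other", True)],
)
# result["role"] is "primary"; PCUpdate goes to lsp-a, PCInit to lsp-c.
```

Steps 2 and 3 (re-delegation and LSP removal by the PCC) are event-driven and are not covered by this sketch.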

Regards,
Dmytro

_______________________________________________
Pce mailing list -- pce@ietf.org
To unsubscribe send an email to pce-le...@ietf.org

--
Olivier Dugeon
Senior research engineer in QoS and network control
Orange/INNOV/NET/WNI/IPN/iTeQ
mobile : +33 6 82 90 37 85
olivier.dug...@orange.com