Hi Dmytro,

Anything that is not clear in the existing RFCs can still be clarified in the 
operational draft (that is the purpose of that draft):
https://www.ietf.org/archive/id/draft-ietf-pce-operational-01.html
or in:
https://www.ietf.org/archive/id/draft-many-pce-stateful-amendment-02.html

Or we can even start a completely new draft if some of those topics are "too 
big".

Even if there is no conclusion about which implementation is correct, such a 
draft can at least document the possible approaches, so future implementations 
will not need to figure them out through interop testing (as in your case).

For any Cisco PCC specific issues/clarifications, you can also reach out to me 
privately and I can try to explain or forward your question to somebody who can 
help with that.

Regards,
Samuel

From: Dmytro Shypovalov <dmy...@vegvisir.ie>
Sent: Tuesday, July 1, 2025 4:42 PM
To: Samuel Sidor (ssidor) <ssi...@cisco.com>
Cc: olivier.dug...@orange.com; pce@ietf.org
Subject: Re: [Pce] Re: PCEP delegation and dual-PCE redundancy - too 
vendor-specific?

Hi Samuel,

I had a few other problems with Cisco PCC, such as state sync working in an 
unexpected way (Cisco sends all LSPs in down state, then the sync-complete 
report, and only then updates the LSP status), 1 extra byte added to the 
symbolic path name TLV, and some confusion with LSP-ID, but through trial and 
error I managed to solve those. On the other hand, LSP delegation works better 
on Cisco PCC.

Juniper PCC has a different set of problems (and so, I'm sure, do Nokia, 
Huawei, IP Infusion etc.). My main concern is not that vendor A is better than 
vendor B, but the poor quality of the PCEP RFCs: they are vague and 
inconsistent and don't cover many scenarios, so each vendor implements its own 
version of PCEP that is not very compatible with the others.

Regards,
Dmytro

On Tue, 1 Jul 2025 at 10:17, Samuel Sidor (ssidor) <ssi...@cisco.com> wrote:
Hi Dmytro,

I'm glad to see that integration with Cisco PCC was smoother for you.

You can also check the configuration "segment-routing traffic-eng pcc 
redundancy [pcc-centric | pce-centric]". The PCC-centric model is the default 
on Cisco PCCs and follows the logic you already described (the PCC drives 
delegation for PCE-initiated LSPs).

If you want behavior that is more aligned with Juniper, you can try switching 
to the PCE-centric model, where the PCC relies on the PCEs to reclaim/drop 
delegation for PCE-initiated LSPs. As you have already figured out, though, it 
requires more complex logic on the PCE (or synchronization between PCEs, e.g. 
via state-sync, https://datatracker.ietf.org/doc/draft-ietf-pce-state-sync/) 
and/or tuning of timers.

Regards,
Samuel

From: Dmytro Shypovalov <dmy...@vegvisir.ie>
Sent: Tuesday, July 1, 2025 10:39 AM
To: olivier.dug...@orange.com
Cc: pce@ietf.org
Subject: [Pce] Re: PCEP delegation and dual-PCE redundancy - too 
vendor-specific?

Hi Olivier,

Thanks for your feedback.

I did more reading and lab testing. I also tried the approach I described in my 
first email, and it didn't work very well. RFC 8231 allows different LSPs to be 
delegated to different PCEs: while Cisco PCC usually elects a primary PCE and 
delegates all LSPs to it, Juniper PCC delegates each LSP to whichever PCE 
created it and keeps it delegated there.

Then I took a different approach: I implemented delegation and de-delegation on 
both PCEs so that it works per LSP (there is no primary/secondary; everything 
depends on where the PCC chooses to delegate). An obvious problem here is what 
happens if both PCEs try to send PCInitiate for a new LSP simultaneously: the 
PCC would complain that the symbolic path name is in use and drop the session. 
This scenario is not covered by the RFCs, so I solved it by implementing a 
configurable "init-delay" timer. The idea is that one of the PCEs (let's say 
the secondary) waits 5-10 seconds before initiating a new LSP. If within this 
time the other PCE (without the delay timer) has already initiated the LSP, we 
receive a PCReport, so we know the LSP status and there is no need to send 
PCInitiate.
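
A minimal Python sketch of the init-delay idea described above. It is only an 
illustration, not my actual implementation: the class and method names 
(InitDelayScheduler, want_lsp, on_report, due) are hypothetical, LSPs are 
keyed by symbolic path name only, and time is passed in explicitly instead of 
using real timers.

```python
from dataclasses import dataclass, field


@dataclass
class InitDelayScheduler:
    """Decides when this PCE may send PCInitiate for an LSP it wants to create.

    Hypothetical helper: it tracks only symbolic path names and timestamps.
    """
    init_delay: float  # seconds; 0 on the PCE without delay, e.g. 5-10 on the other
    pending: dict = field(default_factory=dict)  # name -> earliest send time
    reported: set = field(default_factory=set)   # names already seen in a PCReport

    def want_lsp(self, name: str, now: float) -> None:
        """Register an LSP we intend to initiate, unless the PCC already has it."""
        if name not in self.reported:
            self.pending.setdefault(name, now + self.init_delay)

    def on_report(self, name: str) -> None:
        """A PCReport for this name arrived: the other PCE initiated it first."""
        self.reported.add(name)
        self.pending.pop(name, None)

    def due(self, now: float) -> list:
        """Return names whose delay has expired; the caller sends PCInitiate."""
        ready = sorted(n for n, t in self.pending.items() if t <= now)
        for n in ready:
            del self.pending[n]
        return ready


# Usage: the "secondary" PCE waits 5 seconds before initiating.
secondary = InitDelayScheduler(init_delay=5.0)
secondary.want_lsp("lsp-a", now=0.0)
secondary.want_lsp("lsp-b", now=0.0)
secondary.on_report("lsp-a")       # the other PCE already created lsp-a
to_send = secondary.due(now=5.0)   # only lsp-b still needs PCInitiate
```

The sketch does not handle reports that remove an LSP; a real PCE would also 
have to take the name out of the reported set in that case.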

Another problem is that Juniper PCC does not redelegate. Instead, if the 
delegated PCE dies, Juniper sends a PCReport with the R flag and all fields set 
to 0 in the LSP identifier (similar to the special PCReport sent when sync is 
complete). I implemented a trigger: if such a report is received (and the LSP 
is still alive), we send PCInitiate for this LSP immediately, disregarding the 
init-delay timer.
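
The trigger can be sketched roughly as follows. This is a simplified Python 
illustration under assumptions: the Report fields are hypothetical names for 
the decoded PCReport, and "still alive" is interpreted here as "the PCE still 
wants this LSP to exist".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Simplified view of a received PCReport (hypothetical field names)."""
    name: str           # symbolic path name
    remove: bool        # R flag of the LSP object
    lsp_ids_zero: bool  # True if all fields in the LSP identifier are 0


def needs_immediate_initiate(report: Report, wanted: set) -> bool:
    """True if the PCC dropped an LSP that this PCE still wants to exist.

    When this returns True, the caller bypasses the init-delay timer and
    sends PCInitiate right away.
    """
    return report.remove and report.lsp_ids_zero and report.name in wanted
```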

So far the tests look good with both Cisco and Juniper, but I will look into 
this more, trying different scenarios to break the setup. I haven't implemented 
PCC-initiated LSPs yet, but I think there will be fewer problems with race 
conditions during delegation like the ones I described above. I also need to 
try more PCC implementations to see if someone has implemented it in yet 
another way.


If only there was an organisation that developed standards that all vendors 
would follow, instead of each vendor implementing custom non-compatible 
solutions...


Regards,
Dmytro


On Mon, 30 Jun 2025 at 18:10, <olivier.dug...@orange.com> wrote:

Hello Dmytro,

I ran into the same problem during my various tests. I noted that it is purely 
a PCC problem and that the behavior remains the same whatever the PCE.

IMHO, the correct behavior is the Cisco one: when the primary PCE fails and the 
PCC detects that the PCEP session has gone down, it starts re-delegating to the 
secondary PCE all the LSPs that were initiated or updated by the primary PCE 
(the latter being the LSPs configured on the PCC but delegated to the primary 
PCE).

Regarding your questions:

1/ Some routers (see below) implement a priority parameter for the PCE session, 
so the PCC delegates tunnels to the PCE with the highest priority. But I agree 
that this is vendor-dependent. On the PCE side, the PCE simply fills its LSP_DB 
with the reported LSPs, with the delegate bit set to 1 or 0. This does not 
indicate whether the PCE is the primary or the secondary. You could, for 
example, split your network into 2 independent parts, where each PCE is primary 
for its own part of the network and secondary for the other one. Again, it is a 
matter of vendor implementation to specify whether a PCE is a primary or a 
secondary one. To my knowledge, vendors prefer to offer redundancy within a 
single PCE (e.g. a cluster of PCEs).

2/ I agree with this expected behavior. Again, the PCE will not be able to 
determine that it is now the primary. But upon re-delegation it will check that 
its LSP_DB is in sync with the LSPs reported by the PCCs and make the 
appropriate adjustments. Normally there are none to make, provided the PCCs did 
not remove LSPs when they detected that the PCEP session with the primary PCE 
went down.

3/ This is the normal behavior. The secondary PCE stores in its LSP_DB the LSPs 
initiated by the primary PCE, without delegation. When the PCCs report those 
LSPs to the secondary PCE with delegation, the secondary PCE must perform 
reconciliation (see the point above).
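
As an illustration only, the reconciliation mentioned in points 2/ and 3/ 
could look like the Python sketch below. It assumes a very simplified LSP_DB 
that maps a symbolic path name to the delegate bit; the function name and the 
shape of the returned changes are hypothetical.

```python
def reconcile(lsp_db: dict, reported: dict) -> dict:
    """Align a PCE's LSP_DB with the LSPs a PCC reports after re-delegation.

    Both arguments map symbolic path name -> delegate bit (True/False).
    The LSP_DB is updated in place and the differences are returned so the
    PCE can decide which adjustments to make.
    """
    changes = {
        # LSPs reported by the PCC that the PCE did not know about
        "added": sorted(set(reported) - set(lsp_db)),
        # LSPs in the LSP_DB that the PCC no longer reports
        "removed": sorted(set(lsp_db) - set(reported)),
        # LSPs whose delegate bit is now set but was not set before
        "now_delegated": sorted(
            n for n, d in reported.items() if d and not lsp_db.get(n, False)
        ),
    }
    lsp_db.clear()
    lsp_db.update(reported)
    return changes
```

If the PCC kept all its LSPs when the session with the primary PCE went down, 
"added" and "removed" are empty and only "now_delegated" is populated.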

Note: I also know that on Juniper, when the PCEP session goes down, the router 
starts by removing all initiated LSPs. This behavior can be tuned with the 
`lsp-cleanup-timer` parameter. By default it was set to 0 (old JunOS versions) 
and is now set to 60 sec. Adjusting this timer to a higher value, e.g. 3600 
sec, may give the router more time to re-delegate the LSPs to the secondary PCE 
as expected. There is also a `delegation-priority` parameter which could 
influence the choice between primary and secondary PCE and thus the 
re-delegation (I have not tried it yet).

Another way to improve this kind of high availability is to set up a Virtual IP 
address shared between both PCE instances. The primary PCE is the one that 
holds the Virtual IP address.

Independently of the high-availability scenario, the same problem occurs when 
the PCE reboots: all PCEP sessions go down and, depending on the configuration, 
some PCCs will remove their LSPs while others keep them up long enough for the 
PCE (or a new one) to start up again.

Best Regards

Olivier
On 24/06/2025 at 16:38, Dmytro Shypovalov wrote:
Dear PCE working group,

I am working on a PCE for SR-TE and currently trying to implement dual-PCE 
redundancy. I prefer to follow the RFC and not implement any proprietary 
inter-PCE protocols unless absolutely necessary.

PCEP has so far not been a very standard-friendly protocol in my opinion, 
because every vendor implements things differently and there is not much 
interoperability. I managed to make the single-PCE scenario more or less work, 
although different vendors need different logic for handling state sync and 
LSP-ID state.

So my question is how dual-PCE redundancy is supposed to work; let's say for 
now for PCE-initiated LSPs only. RFC 8231, section 5.7 
(https://datatracker.ietf.org/doc/html/rfc8231#section-5.7) describes the 
procedures, but again vendors implement it differently.

Cisco delegates the LSP to the PCE that initiated it and reports it to the 
other PCE (without the delegate flag). If this PCE fails, Cisco redelegates to 
the second PCE and the LSP remains delegated to it forever.
Juniper also delegates to the primary PCE and reports to the other without the 
delegate flag, but if the primary PCE fails, Juniper does not redelegate; 
instead it waits a bit and then removes the LSP. My understanding is that it 
expects the secondary PCE to re-initiate the LSP at this point.

I haven't tested Nokia, Huawei and others but wouldn't be surprised if they 
also have different ways of handling this.

So my question is: how to properly implement dual-PCE redundancy? Is there a 
standard way, or does it rely on vendor-proprietary mechanisms?

I think about the following logic:


  1.  When the session comes up, wait until state sync completes (the message 
with all zeros) plus some timer; then

     *   If we received PCReports with the delegate flag (and some of those 
match our LSPs) -> assume the PCC elected us as primary PCE; send PCUpdate for 
these delegated LSPs and send PCInit for our LSPs not seen in PCReports

     *   If we received PCReports matching our LSPs but without the delegate 
flag -> assume the PCC elected us as secondary PCE; do not send PCUpdate or 
PCInit

     *   If we did not receive any PCReports matching our LSPs - [this is the 
tricky part - we can't send PCInit because if 2 PCEs send it simultaneously, 
the PCC will complain] - maybe start a random timer and try PCInit after it, so 
the other PCE will receive reports with our LSPs?

  2.  When the PCC re-delegates LSPs to us -> assume we are now elected as 
primary PCE; send PCUpdate for the delegated LSPs and PCInit for LSPs not seen 
in PCReports
  3.  When the PCC removes LSPs but they are still active on the PCE side -> 
assume Juniper-style redundancy; send PCInit for those LSPs
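
To make step 1 of the logic above more concrete, here is a rough Python sketch 
of the decision taken once state sync has completed. The names (Role, 
classify) are hypothetical, LSPs are identified by symbolic path name only, 
and as an assumption PCUpdate is sent only for delegated LSPs that match our 
own.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"      # PCC delegated at least one of our LSPs to us
    SECONDARY = "secondary"  # PCC reported our LSPs, but without delegation
    UNKNOWN = "unknown"      # none of our LSPs was reported


def classify(our_lsps: set, reports: dict) -> tuple:
    """Decide what to do after state sync.

    our_lsps: symbolic names of the LSPs this PCE wants to exist.
    reports:  symbolic name -> delegate flag, as received during state sync.
    Returns (role, names to PCUpdate, names to PCInit).
    """
    matching = {n: d for n, d in reports.items() if n in our_lsps}
    delegated = sorted(n for n, d in matching.items() if d)
    if delegated:
        missing = sorted(our_lsps - set(reports))
        return Role.PRIMARY, delegated, missing
    if matching:
        return Role.SECONDARY, [], []
    # Tricky case: the caller should start a random timer before any PCInit.
    return Role.UNKNOWN, [], []
```

For example, with our LSPs {"a", "b", "c"} and reports {"a": True, "b": False}, 
the function returns the primary role, PCUpdate for "a" and PCInit for "c".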

I think this will work based on my observations so far but I want to ask PCE 
experts just in case - what is the IETF way and are there any implementations 
that interoperate with different vendors? Maybe the more intelligent approach 
would be implementing draft-ietf-pce-state-sync-09 but from reading it I'm not 
sure that will work in all scenarios including Juniper PCC. So it can add more 
problems with split brain while not solving the original issue.

Please let me know what you think.


Regards,
Dmytro


_______________________________________________

Pce mailing list -- pce@ietf.org

To unsubscribe send an email to pce-le...@ietf.org
--
Olivier Dugeon
Senior research engineer in QoS and network control

Orange/INNOV/NET/WNI/IPN/iTeQ


mobile : +33 6 82 90 37 85
olivier.dug...@orange.com


