Hi Lars,
We have tried to split our work on this I-D into security and
everything else. We have not handled security yet and are about to
do some work with a Cisco security expert soon. But in this revision
(v14) we have tried to cover all of the other points.
Please find below a summary of your Discusses and Comments with a
note on what we have changed. If you are running wdiff, please
compare with the v12 revision.
Thanks,
Adrian
Section 6.3., paragraph 0:
6.3. Keepalive Message
DISCUSS: I have a few suggestions on improving the keepalive
mechanism.
You have raised this as a Discuss, but you phrase it as
"suggestions." Please clarify whether for you it is imperative that
the Keepalive process is updated as you suggest, or whether these
may be treated as suggestions.
I should have phrased this better. My discuss is on the currently-
specified keepalive mechanism, which keeps generating keepalives
absent any indication that the peer is currently receiving them and
in a way that is inefficient compared to a traditional dummy
request-response exchange that triggers when the deadtimer expires.
(TCP guarantees delivery, so sending keepalives at a rate higher
than the deadtimer doesn't add value - they can't get lost.)
In order to make this discuss actionable, I've suggested an
alternative design, but I'd be fine with any other alternative that
doesn't suffer from these issues. (Another such example would be to
use TCP-level keepalives, which although not part of the spec are
implemented by all platforms I know.)
I'm wondering if our different views stem from different views on
what this mechanism is supposed to be good for. In my mind,
keepalives are meant to let an end-point eventually discover
non-responding connections. Is PCEP using keepalives for purposes
that require more timely responses, such as failover? If so, TCP
may simply be the wrong underlying protocol for PCEP, and SCTP may
be a better match, given its design goal of supporting fail-over.
(1) Both the PCC and PCE are allowed to send keepalives according
to their timers. This wastes bandwidth. The reception of a
keepalive request from the other end should restart the keepalive
timer on the receiving end - the reception is an indication that
the peer is alive. This means that keepalives will usually only be
sent from one end (the one with the shorter keepalive timer) and
responded to by the other end.
Please note that Keepalive messages are not responded to. They are
sent to the receiver according to the frequency specified by the
receiver. Thus the Keepalive may be unidirectional, or may be very
unbalanced according to the requirements of the peers.
Thanks for the clarification. I misunderstood this when reading
the spec.
I still believe that this keep-alive design has undesirable
features. A
more traditional empty request-response exchange doesn't suffer from
these drawbacks.
We decided this point needed more explanation of keepalives, and so
we have reworked the text considerably to give us...
4.2.1. Initialization Phase
The initialization phase consists of two successive steps (described
in a schematic form in Figure 1):
1) Establishment of a TCP connection (3-way handshake) between the
PCC and the PCE.
2) Establishment of a PCEP session over the TCP connection.
Once the TCP connection is established, the PCC and the PCE (also
referred to as "PCEP peers") initiate PCEP session establishment
during which various session parameters are negotiated. These
parameters are carried within Open messages and include the Keepalive
timer, the Deadtimer and potentially other detailed capabilities and
policy rules that specify the conditions under which path computation
requests may be sent to the PCE. If the PCEP session establishment
phase fails because the PCEP peers disagree on the session parameters
or one of the PCEP peers does not answer after the expiration of the
establishment timer, the TCP connection is immediately closed.
Successive retries are permitted but an implementation should make
use of an exponential back-off session establishment retry procedure.
Keepalive messages are used to acknowledge Open messages, and to
maintain the session once the PCEP session has been successfully
established.
Only one PCEP session can exist between a pair of PCEP peers at any
one time. Only one TCP connection on the PCEP port can exist between
a pair of PCEP peers at any one time.
Details about the Open message and the Keepalive message can be found
in Section 6.2 and Section 6.3 respectively.
+-+-+ +-+-+
|PCC| |PCE|
+-+-+ +-+-+
| |
| Open msg |
|-------- |
| \ Open msg |
| \ ---------|
| \/ |
| /\ |
| / -------->|
| / |
|<------ Keepalive|
| --------|
|Keepalive / |
|-------- / |
| \/ |
| /\ |
|<------ ---------->|
| |
Figure 1: PCEP Initialization phase (initiated by a PCC)
(Note that once the PCEP session is established, the exchange of
Keepalive messages is optional.)
4.2.2. Session Keepalive
Once a session has been established, a PCE or PCC may want to know
that its PCEP peer is still available for use.
It can rely on TCP for this information, but it is possible that the
remote PCEP function has failed without disturbing the TCP
connection. It is also possible to rely on the mechanisms built into
the TCP implementations, but these might not provide sufficiently
timely notifications of failures. Lastly, a PCC could wait until it
has a path computation request to send and use its failed
transmission or the failure to receive a response as evidence that
the session has failed, but this is clearly inefficient.
In order to handle this situation, PCEP includes a keepalive
mechanism based on a Keepalive timer, a Dead timer, and a Keepalive
message.
Each end of a PCEP session runs a Keepalive timer. It restarts the
timer every time it sends a message on the session. When the timer
expires, it sends a Keepalive message. Other traffic may serve as
Keepalive (see Section 6.3).
The ends of the PCEP session also run Dead timers, and they restart
them whenever a message is received on the session. If one end of
the session receives no message before the Dead timer expires, it
declares the session dead.
Note that this means that the Keepalive message is not responded to
and does not form part of a two-way keepalive handshake as used in
some protocols. Also note that the mechanism is designed to reduce
to a minimum the amount of keepalive traffic on the session.
The keepalive traffic on the session may be unbalanced according to
the requirements of the session ends. Each end of the session can
specify (on an Open message) the Keepalive timer that it will use
(i.e., how often it will transmit a Keepalive message if there is no
other traffic) and a Dead timer that it recommends its peer to use
(i.e., how long the peer should wait before declaring the session
dead if it receives no traffic). The session ends may use different
Keepalive timer values.
The minimum value of the Keepalive timer is 1 second, and it is
specified in units of 1 second. The recommended default value is 30
seconds. The timer may be disabled by setting it to zero.
The recommended default for the Dead timer is 4 times the value of
the Keepalive timer used by the remote peer. This means that there
is never any risk of congesting TCP with excessive Keepalive
messages.
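As an illustration of the two-timer mechanism described above (this
is purely a sketch for discussion, not text for the draft; the class
and method names are our own invention), one end of a session could
be modelled like this:

```python
import time

class PcepSessionEnd:
    """Rough sketch of one end of a PCEP session running the
    Keepalive and Dead timers described above. Illustrative only."""

    def __init__(self, keepalive=30, deadtimer=None, clock=time.monotonic):
        # Keepalive is in seconds (minimum 1; 0 disables it); the
        # recommended Dead timer is 4 times the peer's Keepalive.
        self.keepalive = keepalive
        self.deadtimer = deadtimer if deadtimer is not None else 4 * keepalive
        self.clock = clock
        self.last_sent = clock()
        self.last_received = clock()

    def on_send(self):
        # Any transmitted message restarts the Keepalive timer.
        self.last_sent = self.clock()

    def on_receive(self):
        # Any received message restarts the Dead timer.
        self.last_received = self.clock()

    def tick(self, send_keepalive):
        now = self.clock()
        if self.deadtimer and now - self.last_received >= self.deadtimer:
            # No traffic at all within the Dead timer: session dead.
            raise ConnectionError("peer declared dead")
        if self.keepalive and now - self.last_sent >= self.keepalive:
            # Idle for a full Keepalive period: send a Keepalive,
            # which itself counts as traffic and restarts the timer.
            send_keepalive()
            self.on_send()
```

Note how the Keepalive is never responded to: receiving any message
restarts only the receiver's Dead timer, so the traffic can be
completely unbalanced between the two ends.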
(2) As long as a keepalive has not been responded to, a PCEP
speaker MUST NOT send another one. TCP is a reliable protocol and
will deliver the outstanding keepalive when it can. It is not lost,
and there is no need to resend it. All that sending more keepalives
does when there is no response is fill up the socket send buffer.
The filling up of the socket send buffer (which might happen
through repeated sends of a Keepalive message that cannot be
delivered) seems a little improbable. The PCEP messages are four
bytes long and will carry the normal TCP/IP headers. It is true
that an implementation that opts to not receive Keepalives (or one
that must send them far more often than it must receive them) runs
the risk of not noticing a failed TCP connection and continuing to
send Keepalives until TCP itself eventually reports the problem. It
seems to us that this is an extreme fringe condition that can be
protected against by proper configuration of the protocol where the
issue is believed to be real.
This depends on the keepalive frequency and the send buffer in use,
as well as the duration of a connectivity disruption. I agree that
this can be engineered away for most rational cases, but why do so
when the issue can be completely eliminated through a more
traditional keepalive scheme?
We hope that the explanation of the Dead timer in the new text
(above) explains how this is covered.
Section 7.3., paragraph 10:
Keepalive (8 bits): maximum period of time (in seconds) between two
consecutive PCEP messages sent by the sender of this message. The
minimum value for the Keepalive is 1 second. When set to 0, once
the session is established, no further Keepalive messages are sent
to the remote peer. A RECOMMENDED value for the keepalive frequency
is 30 seconds.
DISCUSS: 1 second is extremely short. If there is a requirement
that PCEP detect failures on such short timescales and should fail
over in some way, TCP is the wrong underlying transport protocol.
If that's the motivation, PCEP should use SCTP, which was
specifically designed for this case. Otherwise, I'd suggest 30
seconds as a reasonable minimum to use, and something larger as a
recommended default.
The timer range is designed to *allow* an operator to handle a
special case deployment where a very short timer is needed, but
*recommends* a default value of 30 seconds. Further, it allows the
operator to set a much larger value.
I don't see this as any issue at all and no reason to change the
transport protocol.
I understand that 1 is the minimum and 30 the default. Under what
conditions would 1 second be a reasonable interval? If failover on
short timescales is the goal, SCTP provides mechanisms for that.
I hope that the suggested text above answers why we want a time-out.
I.e. not for failover.
I have discussed with the editors whether we should make the minimum
timer value 10 seconds. They are not so keen. They point out:
- Radically smaller keepalive timers are recommended by default in
several protocols (e.g., 5ms in RFC3209).
- The Dead Timer is recommended to be 4 times the keepalive timer.
However, they say that if you still feel strongly about this and you
can show a reason why a keepalive of 1 second would be damaging to
the network or the PCE/PCC they will look again at increasing the
minimum legal keepalive timer value.
Section 9.1., paragraph 1:
PCEP uses a well-known TCP port. IANA is requested to assign a port
number from the "System" sub-registry of the "Port Numbers"
registry.
DISCUSS: Why is a system port (0-1023) required, wouldn't a
registered port (1024-49151) suffice?
OK. This has changed to request a Registered Port.
" PCEP will use a registered TCP port to be assigned by IANA"
And similar changes in various places.
Section 10.1., paragraph 1:
It is RECOMMENDED to use TCP-MD5 [RFC1321] signature option to
provide for the authenticity and integrity of PCEP messages. This
will allow protecting against PCE or PCC impersonation and also
against message content falsification.
DISCUSS: Given all the issues with the continued use of TCP-MD5,
I'm not convinced that we really want to recommend its use for a
new protocol. Wouldn't draft-ietf-tcpm-tcp-auth-opt be the
preferred alternative? Or TLS, since confidentiality is of key
importance according to Section 10.2. (Also, nit, TCP-MD5 protects
TCP segments and not PCEP messages.)
As I say, the security issues are still pending and will be
addressed in a future revision.
Section 4.2.1., paragraph 4:
Successive retries are permitted but an implementation should make
use of an exponential back-off session establishment retry
procedure.
s/should make/SHOULD make/
Section 4 provides an architectural overview and not normative
protocol
definition. The use of RFC2119 language would be inappropriate.
OK, but in that case please describe this somewhere in the normative
part - I couldn't find it there.
Right.
This has been added to Section 6.2.
"Successive retries are permitted but an implementation SHOULD make
use of an exponential back-off session establishment retry
procedure."
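As a sketch of what that SHOULD might look like in an
implementation (the base delay, cap, and attempt count here are
arbitrary assumptions, not values from the draft):

```python
def retry_delays(base=1.0, cap=60.0, attempts=6):
    """Yield the waits (in seconds) between successive PCEP session
    establishment retries, doubling each time up to a cap. The
    parameter values are illustrative, not mandated by the draft."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= 2
        # A real implementation might also add random jitter here so
        # that many PCCs do not retry in lock-step after an outage.
```

With these assumed defaults, a PCC would wait 1, 2, 4, 8, ...
seconds between attempts until the cap is reached.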
Section 4.2.2., paragraph 4:
Once the PCC has selected a PCE, it sends the PCE a path
computation request (PCReq message) that contains a variety of
objects that specify the set of constraints and attributes for the
path to be computed.
Can a PCC send a second path computation request over the same TCP
connection to a PCE when the answer to an earlier one is still
outstanding? Can multiple TCP connections exist between the same
PCC and PCE?
Yes, multiple PCEP requests may be outstanding from the same PCC at
the same time.
No, multiple 'parallel' TCP connections must not be used, and a
specific error code exists to accompany the rejection of the second
connection.
OK. Could you explicitly say so in the document? I gathered that
was implicitly what the connection handling was supposed to result
in; spelling it out would be good IMO.
Added to 4.2.1
Only one TCP connection on the PCEP port can exist between a pair of
PCEP peers at any one time.
Added 4.2.3
Multiple path computation requests may be outstanding from one PCC to
a PCE at any time.
Section 4.2.4., paragraph 1:
There are several circumstances in which a PCE may want to notify a
PCC of a specific event. For example, suppose that the PCE suddenly
gets overloaded, potentially leading to unacceptable response
times.
Can such notifications occur at any time, i.e., while another
message is being sent? If so, how are they framed within the TCP
byte stream?
I meant if they can happen while another message is being sent, and
if the notification is interleaved into the TCP byte stream
(requiring some application-level framing) or if the application
will queue it until an ongoing transmission has ended. I gather
it's the latter that is supposed to happen. Could you make this
explicit in the document, i.e., that PCEP messages are transmitted
over TCP in a sequential order without the possibility for
interleaving?
Added to Section 4.2, after the list of messages...
Each PCEP message is regarded as a single transmission unit and
parts of messages MUST NOT be interleaved. So, for example, a PCC
sending a PCReq and wishing to close the session must complete
sending the request message before starting to send a Close
message.
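To illustrate the effect of that rule (purely a sketch; the header
layout follows the PCEP common header of version/flags, 8-bit
message type, and 16-bit length, but the helper names are our own
invention):

```python
import struct

def pcep_frame(msg_type: int, body: bytes = b"") -> bytes:
    """Build one complete PCEP message: common header (Ver=1 in the
    top 3 bits, then flags, an 8-bit message type, and a 16-bit
    total length including the 4-byte header), then the body."""
    length = 4 + len(body)
    return struct.pack("!BBH", 1 << 5, msg_type, length) + body

def send_message(sock, msg_type: int, body: bytes = b"") -> None:
    # The whole frame is handed to the socket in one call, so a
    # Close queued after a PCReq can never be interleaved into the
    # middle of the request on the TCP byte stream.
    sock.sendall(pcep_frame(msg_type, body))
```

The receiver can then rely on the length field to delimit complete,
non-interleaved messages in the stream.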
Section 7.3., paragraph 13:
A sends an Open message to B with Keepalive=10 seconds and
Deadtimer=30 seconds. This means that A sends Keepalive
messages (or
ay other PCEP message) to B every 10 seconds and B can declare the
PCEP session with A down if no PCEP message has been received
from A
within any 30 second period.
[Editors: Please note s/ay/any/]
Fixed
It'd be nice if the example followed the recommended values/formulas
above and used Keepalive=30 and DeadTimer=4*Keepalive (or whatever
the defaults will be after addressing my comments above.)
Yes, this is a good point.
It should be easy to change this to 30 and 120 seconds.
This was actually changed to 10 and 40 seconds. This shows setting
the Keepalive (rather than using the default) and setting the
Deadtimer to 4*Keepalive.
Section 7.3., paragraph 14:
SID (PCEP session-ID - 8 bits): unsigned PCEP session number that
identifies the current session. The SID MUST be incremented each
time a new PCEP session is established and is used for logging and
troubleshooting purposes. There is one SID number in each
direction.
What's the start value? Is it incremented for each connection to
any PCEP peer or only for connections to the same PCEP peer? Does
it matter when SID rolls over? The document doesn't discuss what
the SID is used for at all.
I asked similar questions during WG last call.
The answers are:
Start where you like, the value is not important for the protocol.
The requirement is that the SID is 'sufficiently different' to
avoid confusion between instances of sessions to the same peer.
Thus, "incremented" is more like implementation advice than a
strict definition. In particular, incremented by 255 would be fine
:-)
However, the usage (for logging and troubleshooting) might suggest
that incrementing by one is a helpful way of looking at things.
SID roll-over is not particularly a problem.
Implementations could use a single source of SIDs across all peers,
or one source for each peer. The former might constrain the
implementation to only 255 concurrent sessions. The latter
potentially requires more state.
Thanks for the clarification. It might be useful if a bit of this
explanation was added to the document. Also, please have it say
that the SID SHALL only be used for logging and troubleshooting, in
order to avoid having implementors start using it creatively.
This was addressed to handle Magnus's comments. You should find all
of the necessary text in Section 7.3.
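A per-peer allocator along the lines discussed above might look
like this (a hypothetical sketch; the start value and the
increment-by-one policy are arbitrary implementation choices, since
the draft only requires successive values to be 'sufficiently
different'):

```python
class SidAllocator:
    """Per-peer source of the 8-bit PCEP session ID (SID), used
    only for logging and troubleshooting."""

    def __init__(self, start=0):
        self.next_sid = start % 256

    def new_session(self) -> int:
        sid = self.next_sid
        # Roll-over from 255 back to 0 is harmless for this usage.
        self.next_sid = (self.next_sid + 1) % 256
        return sid
```

A single shared allocator would instead cap an implementation at
255 concurrent sessions, as noted above.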
Section 7.4.1., paragraph 14:
Request-ID-number (32 bits). The Request-ID-number value combined
with the source IP address of the PCC and the PCE address uniquely
identify the path computation request context. The
Request-ID-number MUST be incremented each time a new request is
sent to the PCE. The value 0x00000000 is considered as invalid. If
no path computation reply is received from the PCE, and the PCC
wishes to resend its request, the same Request-ID-number MUST be
used. Conversely, different Request-ID-numbers MUST be used for
different requests sent to a PCE. The same Request-ID-number MAY be
used for path computation requests sent to different PCEs. The path
computation reply is unambiguously identified by the IP source
address of the replying PCE.
It's redundant to identify requests by source and destination IP
address, given that those are constant for requests going over the
same TCP connection. Likewise, replies are implicitly identified by
the TCP connection they arrive over.
OK. The text could have said that the requests are uniquely
identified by the combination of TCP connection and request ID.
Since, as you point out, TCP connection is isomorphic to
source/dest IP addresses, the text is accurate and not redundant.
OK
We understood "OK" to mean "no change needed."
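The Request-ID-number rules quoted above might be handled per PCE
session roughly as follows (an illustrative sketch, not from the
draft; the class and attribute names are our own):

```python
class RequestIdAllocator:
    """Sketch of per-PCE Request-ID-number handling: 0x00000000 is
    invalid, each new request gets a fresh ID, and a resend of an
    unanswered request reuses the original ID."""

    def __init__(self):
        self.next_id = 1          # never hand out the invalid value 0
        self.pending = {}         # Request-ID-number -> request object

    def send(self, request) -> int:
        rid = self.next_id
        # Wrap from 0xFFFFFFFF back to 1, skipping the invalid 0.
        self.next_id = self.next_id % 0xFFFFFFFF + 1
        self.pending[rid] = request
        return rid

    def resend(self, rid):
        # A resend MUST reuse the same Request-ID-number.
        return rid, self.pending[rid]

    def on_reply(self, rid):
        # The reply is matched by ID within this TCP connection.
        return self.pending.pop(rid)
```

Because the ID is scoped to the TCP connection, the same value may
safely be in use toward a different PCE at the same time.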
Section 8.3., paragraph 1:
PCEP includes a keepalive mechanism to check the liveliness of a
PCEP peer and a notification procedure allowing a PCE to advertise
its congestion state to a PCC.
s/congestion/overload/ for consistency, here and throughout the
document
Yes. Good catch.
OK
Done in 8.3 and in some key places in the document.
Do you feel that this needs to be done universally?
Section 9.1., paragraph 1:
PCEP uses a well-known TCP port. IANA is requested to assign a port
number from the "System" sub-registry of the "Port Numbers"
registry.
Does "uses a well-known TCP port" mean that messages from the PCC
to the PCE must come from that registered source port, or can they
come from any port? (The former implies that only a single PCEP
connection can exist between a PCC and a PCE. It also weakens
security a bit, because an attacker doesn't need to guess the
source port anymore.)
Yes.
"Yes" as in "source port MUST be the PCE port"? (As must the
destination port, obviously.) If so, you should make this explicit
in the document, because the default behavior of operating systems
is to dynamically assign a random high port number as a source
port, unless an app specifically requests otherwise.
In Section 5, text added:
All PCEP messages MUST be sent using the registered TCP port for
the source and destination TCP port.
Only one connection may exist between a PCC and a PCE at any time.
No need for more than one has been identified. It might be claimed
that two PCC processes might exist on a single host/router, but no
usage scenario has been found. Further, the text currently bans a
second simultaneous connection between peers.
This came up earlier in this email and has been made explicit.
Security is weaker and stronger.
Weaker as you point out.
Stronger because of the use of a system port as previously
described.
Disagree that a system port is a security mechanism, as per above.
OK. We have moved to a Registered Port, anyway.
Section 10.3.1., paragraph 2:
o A PCE should avoid promiscuous TCP listens for PCEP TCP
connection establishment. It should use only listens that are
specific to authorized PCCs.
Authorized by what? TCP has no feature to restrict listens based on
credentials.
We will address this as part of the security work.
Section 10.3.1., paragraph 4:
o The use of access-list on the PCE so as to restrict access to
authorized PCCs.
Is redundant with the first bullet (or I don't understand what it
means).
We will address this as part of the security work.
Appendix A., paragraph 46:
If the system receives an Open message from the PCEP peer before
the expiration of the OpenWait timer, the system first examines all
of its sessions that are in the OpenWait or KeepWait state. If
another session with the same PCEP peer already exists (same IP
address), then the system performs the following collision
resolution procedure:
The goal of this procedure seems to be to guarantee that there is
only a single active PCEP connection between two peers, but it's
cumbersome. It'd be much easier to require a peer to not initiate a
connection to a peer it already has one established with, and to
require it to immediately close a new TCP connection coming from a
peer it has an active PCEP connection with. This handles everything
at the TCP layer without needing to involve the PCEP state machine.
I agree that this non-normative appendix seems to include overkill
for an unlikely scenario.
The editors believe that this situation may be more than unlikely,
especially when PCEP sessions are triggered by automatic
procedures. They point out that this procedure has been implemented
multiple times without any problems, so we are keeping it in.
Hope this answers everything apart from the security issues.
Cheers,
Adrian