Hi Lars,
We have tried to split our work on this I-D into security and
everything else. We have not handled security yet and are about to
do some work with a Cisco security expert soon. But in this revision
(v14) we have tried to cover all of the other points.
Please find below a summary of your Discusses and Comments with a
note on what we have changed. If you are running wdiff, please
compare with the v12 revision.
Thanks,
Adrian
Section 6.3., paragraph 0:
6.3. Keepalive Message
DISCUSS: I have a few suggestions on improving the keepalive
mechanism.
You have raised this as a Discuss, but you phrase it as
"suggestions." Please clarify whether for you it is imperative that
the Keepalive process is updated as you suggest, or whether these
may be treated as suggestions.
I should have phrased this better. My discuss is on the currently-
specified keepalive mechanism, which keeps generating keepalives
absent any indication that the peer is currently receiving them and
in a way that is inefficient compared to a traditional dummy
request-response exchange that triggers when the deadtimer expires.
(TCP guarantees delivery, so sending keepalives at a rate higher
than the deadtimer doesn't add value - they can't get lost.)
In order to make this discuss actionable, I've suggested an
alternative design, but I'd be fine with any other alternative that
doesn't suffer from these issues. (Another such example would be to
use TCP-level keepalives, which although not part of the spec are
implemented by all platforms I know.)
I'm wondering if our different views stem from different views on
what this mechanism is supposed to be good for. In my mind,
keepalives are meant to let an end-point eventually discover
non-responding connections. Is PCEP using keepalives for purposes
that require more timely responses, such as failover? If so, TCP
may simply be the wrong underlying protocol for PCEP, and SCTP may
be a better match, given its design goal of supporting fail-over.
(1) Both the PCC and PCE are allowed to send keepalives according
to their timers. This wastes bandwidth. The reception of a
keepalive request from the other end should restart the keepalive
timer on the receiving end - the reception is an indication that
the peer is alive. This means that keepalives will usually only be
sent from one end (the one with the shorter keepalive timer) and
responded to by the other end.
Please note that Keepalive messages are not responded to. They are
sent to the receiver according to the frequency specified by the
receiver. Thus the Keepalive may be unidirectional, or may be very
unbalanced according to the requirements of the peers.
Thanks for the clarification. I misunderstood this when reading
the spec.
I still believe that this keep-alive design has undesirable
features. A
more traditional empty request-response exchange doesn't suffer from
these drawbacks.
We decided this point needed more explanation of keepalives, and so
we have reworked the text considerably to give us...
4.2.1. Initialization Phase
The initialization phase consists of two successive steps (described
in a schematic form in Figure 1):
1) Establishment of a TCP connection (3-way handshake) between the
PCC and the PCE.
2) Establishment of a PCEP session over the TCP connection.
Once the TCP connection is established, the PCC and the PCE (also
referred to as "PCEP peers") initiate PCEP session establishment
during which various session parameters are negotiated. These
parameters are carried within Open messages and include the Keepalive
timer, the Deadtimer and potentially other detailed capabilities and
policy rules that specify the conditions under which path computation
requests may be sent to the PCE. If the PCEP session establishment
phase fails because the PCEP peers disagree on the session parameters
or one of the PCEP peers does not answer after the expiration of the
establishment timer, the TCP connection is immediately closed.
Successive retries are permitted but an implementation should make
use of an exponential back-off session establishment retry procedure.
Keepalive messages are used to acknowledge Open messages, and to
maintain the session once the PCEP session has been successfully
established.
Only one PCEP session can exist between a pair of PCEP peers at any
one time. Only one TCP connection on the PCEP port can exist between
a pair of PCEP peers at any one time.
Details about the Open message and the Keepalive message can be found
in Section 6.2 and Section 6.3 respectively.
+-+-+ +-+-+
|PCC| |PCE|
+-+-+ +-+-+
| |
| Open msg |
|-------- |
| \ Open msg |
| \ ---------|
| \/ |
| /\ |
| / -------->|
| / |
|<------ Keepalive|
| --------|
|Keepalive / |
|-------- / |
| \/ |
| /\ |
|<------ ---------->|
| |
Figure 1: PCEP Initialization phase (initiated by a PCC)
(Note that once the PCEP session is established, the exchange of
Keepalive messages is optional.)
4.2.2. Session Keepalive
Once a session has been established, a PCE or PCC may want to know
that its PCEP peer is still available for use.
It can rely on TCP for this information, but it is possible that the
remote PCEP function has failed without disturbing the TCP
connection. It is also possible to rely on the mechanisms built into
the TCP implementations, but these might not provide sufficiently
timely notifications of failures. Lastly, a PCC could wait until it
has a path computation request to send and use its failed
transmission or the failure to receive a response as evidence that
the session has failed, but this is clearly inefficient.
In order to handle this situation, PCEP includes a keepalive
mechanism based on a Keepalive timer, a Dead timer, and a Keepalive
message.
Each end of a PCEP session runs a Keepalive timer. It restarts the
timer every time it sends a message on the session. When the timer
expires, it sends a Keepalive message. Other traffic may serve as
Keepalive (see Section 6.3).
The ends of the PCEP session also run Dead timers, and they restart
them whenever a message is received on the session. If one end of
the session receives no message before the Dead timer expires, it
declares the session dead.
Note that this means that the Keepalive message is not responded to
and does not form part of a two-way keepalive handshake as used in
some protocols. Also note that the mechanism is designed to reduce
to a minimum the amount of keepalive traffic on the session.
The keepalive traffic on the session may be unbalanced according to
the requirements of the session ends. Each end of the session can
specify (on an Open message) the Keepalive timer that it will use
(i.e., how often it will transmit a Keepalive message if there is no
other traffic) and a Dead timer that it recommends its peer to use
(i.e., how long the peer should wait before declaring the session
dead if it receives no traffic). The session ends may use different
Keepalive timer values.
The minimum value of the Keepalive timer is 1 second, and it is
specified in units of 1 second. The recommended default value is 30
seconds. The timer may be disabled by setting it to zero.
The recommended default for the Dead timer is 4 times the value of
the Keepalive timer used by the remote peer. This means that there
is never any risk of congesting TCP with excessive Keepalive
messages.
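As an illustration of the two-timer mechanism described above (this
is purely a sketch for discussion, not text for the draft; the class
and method names are our own invention), one end of a session could
be modelled like this:

```python
import time

class PcepSessionEnd:
    """Rough sketch of one end of a PCEP session running the
    Keepalive and Dead timers described above. Illustrative only."""

    def __init__(self, keepalive=30, deadtimer=None, clock=time.monotonic):
        # Keepalive is in seconds (minimum 1; 0 disables it); the
        # recommended Dead timer is 4 times the peer's Keepalive.
        self.keepalive = keepalive
        self.deadtimer = deadtimer if deadtimer is not None else 4 * keepalive
        self.clock = clock
        self.last_sent = clock()
        self.last_received = clock()

    def on_send(self):
        # Any transmitted message restarts the Keepalive timer.
        self.last_sent = self.clock()

    def on_receive(self):
        # Any received message restarts the Dead timer.
        self.last_received = self.clock()

    def tick(self, send_keepalive):
        now = self.clock()
        if self.deadtimer and now - self.last_received >= self.deadtimer:
            # No traffic at all within the Dead timer: session dead.
            raise ConnectionError("peer declared dead")
        if self.keepalive and now - self.last_sent >= self.keepalive:
            # Idle for a full Keepalive period: send a Keepalive,
            # which itself counts as traffic and restarts the timer.
            send_keepalive()
            self.on_send()
```

Note how the Keepalive is never responded to: receiving any message
restarts only the receiver's Dead timer, so the traffic can be
completely unbalanced between the two ends.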
(2) As long as a keepalive has not been responded to, a PCEP
speaker MUST NOT send another one. TCP is a reliable protocol and
will deliver the outstanding keepalive when it can. It is not lost,
and there is no need to resend it. All that sending more keepalives
does when there is no response is fill up the socket send buffer.
The filling up of the socket send buffer (which might happen
through repeated sends of a Keepalive message that cannot be
delivered) seems a little improbable. The PCEP messages are four
bytes long and will carry the normal TCP/IP headers. It is true
that an implementation that opts to not receive Keepalives (or one
that must send them far more often than it must receive them) runs
the risk of not noticing a failed TCP connection and continuing to
send Keepalives until TCP itself eventually reports the problem. It
seems to us that this is an extreme fringe condition that can be
protected against by proper configuration of the protocol where the
issue is believed to be real.
This depends on the keepalive frequency and the send buffer in use,
as well as the duration of a connectivity disruption. I agree that
this can be engineered away for most rational cases, but why do so
when the issue can be completely eliminated through a more
traditional keepalive scheme?
We hope that the explanation of the Dead timer in the new text
(above) explains how this is covered.
Section 7.3., paragraph 10:
Keepalive (8 bits): maximum period of time (in seconds) between two
consecutive PCEP messages sent by the sender of this message. The
minimum value for the Keepalive is 1 second. When set to 0, once
the session is established, no further Keepalive messages are sent
to the remote peer. A RECOMMENDED value for the keepalive frequency
is 30 seconds.
DISCUSS: 1 second is extremely short. If there is a requirement
that PCEP detect failures on such short timescales and should fail
over in some way, TCP is the wrong underlying transport protocol.
If that's the motivation, PCEP should use SCTP, which was
specifically designed for this case. Otherwise, I'd suggest 30
seconds as a reasonable minimum to use, and something larger as a
recommended default.
The timer range is designed to *allow* an operator to handle a
special case deployment where a very short timer is needed, but
*recommends* a default value of 30 seconds. Further, it allows the
operator to set a much larger value.
I don't see this as any issue at all and no reason to change the
transport protocol.
I understand that 1 is the minimum and 30 the default. Under what
conditions would 1 second be a reasonable interval? If failover on
short timescales is the goal, SCTP provides mechanisms for that.
I hope that the suggested text above answers why we want a time-out.
I.e. not for failover.
I have discussed with the editors whether we should make the minimum
timer value 10 seconds. They are not so keen. They point out:
- Radically smaller keepalive timers are recommended by default in
several protocols (e.g., 5ms in RFC3209).
- The Dead Timer is recommended to be 4 times the keepalive timer.
However, they say that if you still feel strongly about this and you
can show a reason why a keepalive of 1 second would be damaging to
the network or the PCE/PCC they will look again at increasing the
minimum legal keepalive timer value.
Section 9.1., paragraph 1:
PCEP uses a well-known TCP port. IANA is requested to assign a port
number from the "System" sub-registry of the "Port Numbers"
registry.
DISCUSS: Why is a system port (0-1023) required, wouldn't a
registered port (1024-49151) suffice?
OK. This has changed to request a Registered Port.
" PCEP will use a registered TCP port to be assigned by IANA"
And similar changes in various places.
Section 10.1., paragraph 1:
It is RECOMMENDED to use TCP-MD5 [RFC1321] signature option to
provide for the authenticity and integrity of PCEP messages. This
will allow protecting against PCE or PCC impersonation and also
against message content falsification.
DISCUSS: Given all the issues with the continued use of TCP-MD5,
I'm not convinced that we really want to recommend its use for a
new protocol. Wouldn't draft-ietf-tcpm-tcp-auth-opt be the
preferred alternative? Or TLS, since confidentiality is of key
importance according to Section 10.2. (Also, nit, TCP-MD5 protects
TCP segments and not PCEP messages.)
As I say, the security issues are still pending and will be
addressed in a future revision.
Section 4.2.1., paragraph 4:
Successive retries are permitted but an implementation should make
use of an exponential back-off session establishment retry
procedure.
s/should make/SHOULD make/
Section 4 provides an architectural overview and not normative
protocol
definition. The use of RFC2119 language would be inappropriate.
OK, but in that case please describe this somewhere in the normative
part - I couldn't find it there.
Right.
This has been added to Section 6.2.
"Successive retries are permitted but an implementation SHOULD make
use of an exponential back-off session establishment retry
procedure."
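As a sketch of what that SHOULD might look like in an
implementation (the base delay, cap, and attempt count here are
arbitrary assumptions, not values from the draft):

```python
def retry_delays(base=1.0, cap=60.0, attempts=6):
    """Yield the waits (in seconds) between successive PCEP session
    establishment retries, doubling each time up to a cap. The
    parameter values are illustrative, not mandated by the draft."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= 2
        # A real implementation might also add random jitter here so
        # that many PCCs do not retry in lock-step after an outage.
```

With these assumed defaults, a PCC would wait 1, 2, 4, 8, ...
seconds between attempts until the cap is reached.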
Section 4.2.2., paragraph 4:
Once the PCC has selected a PCE, it sends the PCE a path
computation request (PCReq message) that contains a variety of
objects that specify the set of constraints and attributes for the
path to be computed.
Can a PCC send a second path computation request over the same TCP
connection to a PCE when the answer to an earlier one is still
outstanding? Can multiple TCP connections exist between the same
PCC and PCE?
Yes, multiple PCEP requests may be outstanding from the same PCC at
the same time.
No, multiple 'parallel' TCP connections must not be used, and a
specific error code exists to accompany the rejection of the second
connection.
OK. Could you explicitly say so in the document? I gathered that
was implicitly what the connection handling was supposed to result
in; spelling it out would be good IMO.
Added to 4.2.1
Only one TCP connection on the PCEP port can exist between a pair of
PCEP peers at any one time.
Added 4.2.3
Multiple path computation requests may be outstanding from one PCC to
a PCE at any time.
Section 4.2.4., paragraph 1:
There are several circumstances in which a PCE may want to notify a
PCC of a specific event. For example, suppose that the PCE suddenly
gets overloaded, potentially leading to unacceptable response
times.
Can such notifications occur at any time, i.e., while another
message is being sent? If so, how are they framed within the TCP
byte stream?
I meant if they can happen while another message is being sent, and
if the notification is interleaved into the TCP byte stream
(requiring some application-level framing) or if the application
will queue it until an ongoing transmission has ended. I gather
it's the latter that is supposed to happen. Could you make this
explicit in the document, i.e., that PCEP messages are transmitted
over TCP in a sequential order without the possibility for
interleaving?
Added to Section 4.2, after the list of messages...
Each PCEP message is regarded as a single transmission unit and
parts of messages MUST NOT be interleaved. So, for example, a PCC
sending a PCReq and wishing to close the session must complete
sending the request message before starting to send a Close
message.
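To illustrate the effect of that rule (purely a sketch; the header
layout follows the PCEP common header of version/flags, 8-bit
message type, and 16-bit length, but the helper names are our own
invention):

```python
import struct

def pcep_frame(msg_type: int, body: bytes = b"") -> bytes:
    """Build one complete PCEP message: common header (Ver=1 in the
    top 3 bits, then flags, an 8-bit message type, and a 16-bit
    total length including the 4-byte header), then the body."""
    length = 4 + len(body)
    return struct.pack("!BBH", 1 << 5, msg_type, length) + body

def send_message(sock, msg_type: int, body: bytes = b"") -> None:
    # The whole frame is handed to the socket in one call, so a
    # Close queued after a PCReq can never be interleaved into the
    # middle of the request on the TCP byte stream.
    sock.sendall(pcep_frame(msg_type, body))
```

The receiver can then rely on the length field to delimit complete,
non-interleaved messages in the stream.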
Section 7.3., paragraph 13:
A sends an Open message to B with Keepalive=10 seconds and
Deadtimer=30 seconds. This means that A sends Keepalive
messages (or
ay other PCEP message) to B every 10 seconds and B can declare the
PCEP session with A down if no PCEP message has been received
from A
within any 30 second period.
[Editors: Please note s/ay/any/]
Fixed
It'd be nice if the example followed the recommended values/formulas
above and used Keepalive=30 and DeadTimer=4*Keepalive (or whatever
the defaults will be after addressing my comments above.)
Yes, this is a good point.
It should be easy to change this to 30 and 120 seconds.
This was actually changed to 10 and 40 seconds. This shows setting
the Keepalive (rather than using the default) and setting the
Deadtimer to 4*Keepalive.
Section 7.3., paragraph 14:
SID (PCEP session-ID - 8 bits): unsigned PCEP session number that
identifies the current session. The SID MUST be incremented each
time a new PCEP session is established and is used for logging and
troubleshooting purposes. There is one SID number in each
direction.
What's the start value? Is it incremented for each connection to
any PCEP peer or only for connections to the same PCEP peer? Does
it matter when SID rolls over? The document doesn't discuss what
the SID is used for at all.
I asked similar questions during WG last call.
The answers are:
Start where you like, the value is not important for the protocol.
The requirement is that the SID is 'sufficiently different' to
avoid confusion between instances of sessions to the same peer.
Thus, "incremented" is more like implementation advice than a
strict definition. In particular, incremented by 255 would be fine
:-)
However, the usage (for logging and troubleshooting) might suggest
that incrementing by one is a helpful way of looking at things.
SID roll-over is not particularly a problem.
Implementations could use a single source of SIDs across all peers,
or one source for each peer. The former might constrain the
implementation to only 255 concurrent sessions. The latter
potentially requires more state.
Thanks for the clarification. It might be useful if a bit of this
explanation was added to the document. Also, please have it say
that the SID SHALL only be used for logging and troubleshooting, in
order to avoid having implementors start using it creatively.
This was addressed to handle Magnus's comments. You should find all
of the necessary text in Section 7.3.
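A per-peer allocator along the lines discussed above might look
like this (a hypothetical sketch; the start value and the
increment-by-one policy are arbitrary implementation choices, since
the draft only requires successive values to be 'sufficiently
different'):

```python
class SidAllocator:
    """Per-peer source of the 8-bit PCEP session ID (SID), used
    only for logging and troubleshooting."""

    def __init__(self, start=0):
        self.next_sid = start % 256

    def new_session(self) -> int:
        sid = self.next_sid
        # Roll-over from 255 back to 0 is harmless for this usage.
        self.next_sid = (self.next_sid + 1) % 256
        return sid
```

A single shared allocator would instead cap an implementation at
255 concurrent sessions, as noted above.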
Section 7.4.1., paragraph 14:
Request-ID-number (32 bits). The Request-ID-number value combined
with the source IP address of the PCC and the PCE address uniquely
identify the path computation request context. The
Request-ID-number MUST be incremented each time a new request is
sent to the PCE. The value 0x00000000 is considered as invalid. If
no path computation reply is received from the PCE, and the PCC
wishes to resend its request, the same Request-ID-number MUST be
used. Conversely, different Request-ID-numbers MUST be used for
different requests sent to a PCE. The same Request-ID-number MAY be
used for path computation requests sent to different PCEs. The path
computation reply is unambiguously identified by the IP source
address of the replying PCE.
It's redundant to identify requests by source and destination IP
address, given that those are constant for requests going over the
same TCP connection. Likewise, replies are implicitly identified by
the TCP connection they arrive over.
OK. The text could have said that the requests are uniquely
identified by the combination of TCP connection and request ID.
Since, as you point out, TCP connection is isomorphic to
source/dest IP addresses, the text is accurate and not redundant.
OK
We understood "OK" to mean "no change needed."
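The Request-ID-number rules quoted above might be handled per PCE
session roughly as follows (an illustrative sketch, not from the
draft; the class and attribute names are our own):

```python
class RequestIdAllocator:
    """Sketch of per-PCE Request-ID-number handling: 0x00000000 is
    invalid, each new request gets a fresh ID, and a resend of an
    unanswered request reuses the original ID."""

    def __init__(self):
        self.next_id = 1          # never hand out the invalid value 0
        self.pending = {}         # Request-ID-number -> request object

    def send(self, request) -> int:
        rid = self.next_id
        # Wrap from 0xFFFFFFFF back to 1, skipping the invalid 0.
        self.next_id = self.next_id % 0xFFFFFFFF + 1
        self.pending[rid] = request
        return rid

    def resend(self, rid):
        # A resend MUST reuse the same Request-ID-number.
        return rid, self.pending[rid]

    def on_reply(self, rid):
        # The reply is matched by ID within this TCP connection.
        return self.pending.pop(rid)
```

Because the ID is scoped to the TCP connection, the same value may
safely be in use toward a different PCE at the same time.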
Section 8.3., paragraph 1:
PCEP includes a keepalive mechanism to check the liveliness of a
PCEP peer and a notification procedure allowing a PCE to advertise
its congestion state to a PCC.
s/congestion/overload/ for consistency, here and throughout the
document
Yes. Good catch.
OK
Done in 8.3 and in some key places in the document.
Do you feel that this needs to be done universally?
Section 9.1., paragraph 1:
PCEP uses a well-known TCP port. IANA is requested to assign a port
number from the "System" sub-registry of the "Port Numbers"
registry.
Does "uses a well-known TCP port" mean that messages from the PCC
to the PCE must come from that registered source port, or can they
come from any port? (The former implies that only a single PCEP
connection can exist between a PCC and a PCE. It also weakens
security a bit, because an attacker doesn't need to guess the
source port anymore.)
Yes.
"Yes" as in "source port MUST be the PCE port"? (As must the
destination port, obviously.) If so, you should make this explicit
in the document, because the default behavior of operating systems
is to dynamically assign a random high port number as a source
port, unless an app specifically requests otherwise.
In Section 5, text added:
All PCEP messages MUST be sent using the registered TCP port for
the source and destination TCP port.
Only one connection may exist between a PCC and a PCE at any time.
No need for more than one has been identified. It might be claimed
that two PCC processes might exist on a single host/router, but no
usage scenario has been found. Further, the text currently bans a
second simultaneous connection between peers.
This came up earlier in this email and has been made explicit.
Security is weaker and stronger.
Weaker as you point out.
Stronger because of the use of a system port as previously
described.
Disagree that a system port is a security mechanism, as per above.
OK. We have moved to a Registered Port, anyway.
Section 10.3.1., paragraph 2:
o A PCE should avoid promiscuous TCP listens for PCEP TCP
connection establishment. It should use only listens that are
specific to authorized PCCs.
Authorized by what? TCP has no feature to restrict listens based on
credentials.
We will address this as part of the security work.
Section 10.3.1., paragraph 4:
o The use of access-list on the PCE so as to restrict access to
authorized PCCs.
Is redundant with the first bullet (or I don't understand what it
means).
We will address this as part of the security work.
Appendix A., paragraph 46:
If the system receives an Open message from the PCEP peer before
the expiration of the OpenWait timer, the system first examines all
of its sessions that are in the OpenWait or KeepWait state. If
another session with the same PCEP peer already exists (same IP
address), then the system performs the following collision
resolution procedure:
The goal of this procedure seems to be to guarantee that there is
only a single active PCEP connection between two peers, but it's
cumbersome. It'd be much easier to require a peer to not initiate a
connection to a peer it already has one established with, and to
require it to immediately close a new TCP connection coming from a
peer it has an active PCEP connection with. This handles everything
at the TCP layer without needing to involve the PCEP state machine.
I agree that this non-normative appendix seems to include overkill
for an unlikely scenario.
The editors believe that this situation may be more than unlikely,
especially when PCEP sessions are triggered by automatic
procedures. They point out that this procedure has been implemented
multiple times without any problems, so we are keeping it in.
Hope this answers everything apart from the security issues.
Cheers,
Adrian