I did review draft-ietf-ipsecme-failure-detection before the WG
meeting, and some of the comments I have here already have tickets,
so there is no need to add those a second time:
----------------------------------------------------------------------
Comments to draft-ietf-ipsecme-failure-detection:
Section 1:
"However, in many cases the rebooted peer is a
VPN gateway that protects only servers, "
What is that supposed to mean?
Section 2:
"Those "at least several minutes" are a time during part of
which both peers are active, but IPsec cannot be used."
Not true! It is the time during which one of the peers is active
and the other one is rebooting, and the rebooting device might even
come back up before the time runs out, as described in the next few
paragraphs. I suggest removing the whole sentence.
Section 2:
"[RFC5996] does not mandate any time limits, but it is
possible that the peer will start liveness checks even before
the other end is sending INVALID_SPI notification, as it
detected that the other end is not sending any packets anymore
while it is still rebooting or recovering from the situation."
I think "but it is possible that the peer will start ..." is
wrong, more like "good implementation will start ...".
If implementation supports black hole detection there is no
point of doing that with long timeouts, as I said in our
implementation that specific timeout is 10 seconds (i.e.
around 20 times RTT which means with normal TCP etc traffic it
never triggers, but will trigger very quickly after other end
goes silent).
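To illustrate the kind of heuristic I mean, here is a rough sketch
(only a sketch with my own names and constants; the 0.5 s RTT
estimate and the multiplier of 20 are illustrative assumptions, not
text from the draft):

    # Rough sketch of black hole detection: if we have sent protected
    # traffic but seen nothing back for roughly 20 x RTT, start a
    # liveness check instead of waiting for a multi-minute timeout.
    import time

    class BlackHoleDetector:
        def __init__(self, rtt_estimate=0.5, multiplier=20):
            # ~10 seconds with a 0.5 s RTT estimate
            self.timeout = rtt_estimate * multiplier
            self.last_inbound = time.monotonic()
            self.outbound_since_inbound = False

        def packet_received(self):
            self.last_inbound = time.monotonic()
            self.outbound_since_inbound = False

        def packet_sent(self):
            self.outbound_since_inbound = True

        def should_start_liveness_check(self):
            # Only suspect a black hole if we are actually sending
            # traffic and getting nothing back.
            return (self.outbound_since_inbound and
                    time.monotonic() - self.last_inbound > self.timeout)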
Section 3:
I still think the protocol would be much easier to implement
if we limited the QCD token taker role to the initiator and the
token maker role to the responder.
There is no point in making the protocol very generic, as
implementations are not going to implement features before
there is a real use scenario for them. This means that even if the
document describes how it can be done, it does not help, as
implementations will not support it. If someone finds a real use
scenario where the responder needs to be the token taker, then
writing a new specification for that is much faster than getting
the implementations modified.
I have not yet seen a use scenario for that where QCD would help
(meaning there are other already standardized IKEv2 mechanisms
which are faster and more efficient and are already implemented).
Section 4.2:
"The QCD_TOKEN notification is related to the IKE SA and MUST
follow the AUTH payload and precede the Configuration payload
and all payloads related to the child SA."
RFC5996 removed payload ordering restrictions, so why are we
adding them back here? I suggest removing the whole paragraph.
Section 5.2:
I would remove this whole section.
Section 7:
I would remove this whole section. It was good to have it there,
but I do not think we need it anymore. At least Section 7.4 is
still completely wrong and is already covered by Section 2.
Section 8:
"Before establishing a new IKE SA using Session Resumption, a
client should ascertain that the gateway has indeed failed.
This could be done using either a liveness check (as in RFC
5996) or using the QCD tokens described in this document."
How do you use QCD tokens to ascertain that the gateway has
indeed failed? If you receive a QCD token then you know that the
other end has lost its state, but to receive the QCD token the
active operation you do is to send a liveness check. I think this
sentence requires a rewrite.
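In other words, the only active thing the token taker can do is
send the liveness check and then look at the unprotected reply; a
rough sketch of that logic (the names and structure are mine, not
the draft's):

    # Sketch only: the token taker reacts to an unprotected reply
    # carrying a QCD_TOKEN that matches the token it stored when the
    # IKE SA was created.
    from dataclasses import dataclass

    @dataclass
    class IkeSa:
        spi_i: bytes
        spi_r: bytes
        stored_qcd_token: bytes
        alive: bool = True

    def handle_unprotected_reply(sa: IkeSa, notifications: dict) -> bool:
        """Return True if the SA should be torn down (and, for a
        remote-access client, Session Resumption started)."""
        token = notifications.get("QCD_TOKEN")
        if token is not None and token == sa.stored_qcd_token:
            # The gateway has lost its state; delete the SA right away
            # instead of waiting for repeated liveness-check timeouts.
            sa.alive = False
            return True
        return False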
Section 8:
The example is wrong. The exchange

    HDR, {} -->
                  <-- HDR, N(QCD_TOKEN)

should be

    HDR, SK{} -->
                  <-- HDR, N(INVALID_IKE_SPI),
                           N(QCD_TOKEN)
Section 9.1:
"Implementing the "token maker" side of QCD makes sense for
IKE implementation where protected connections originate from
the peer, such as inter-domain VPNs and remote access
gateways. Implementing the "token taker" side of QCD makes
sense for IKE implementations where protected connections
originate, such as inter-domain VPNs and remote access
clients."
So token maker and token taker are both used "where protected
connections originate"? What is the difference? This text
requires clarification.
Section 9.1:
"To clarify the this discussion:"
^^^^^^^^
Section 9.1:
"o For inter-domain VPN gateway it makes sense to implement
both roles, because it can't be known in advance where the
traffic originates."
I do not really see that. For inter-domain VPN gateways there
are two possibilities: symmetric or asymmetric initiation.
In the asymmetric situation only one end can initiate
connections (for example because it is behind a NAT or similar,
or because the HQ VPN server is always configured to be the
responder). In that case the inter-domain VPN case is similar
to the remote-access client / gateway case, i.e. the
"initiator end of the inter-domain VPN" is the same as the
"remote-access client" and the "responder end of the inter-domain
VPN gateway" is the same as the "remote-access server".
For symmetric situations, where either end can initiate
connections, there are better and faster ways to handle things,
as I have already described earlier.
Section 10.1:
"Specifically, if one taker does not properly secure the QCD
tokens and an attacker gains access to them, this attacker
MUST NOT be able to guess other tokens generated by the same
maker."
This is a bit misleading, as it is trivial for an attacker to get
a large number of tokens. It just needs to send one faked IKE
packet to the token maker with random IKE SPIs to get a valid
token for that IKE SPI pair.
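To make this concrete, here is a minimal sketch of a stateless
token maker along the lines of Section 5.1 (the exact construction
below is my assumption, not the draft's text). Because the token
depends only on the maker's secret and the SPI pair, the maker
will hand out a "valid" token for whatever SPIs the attacker puts
in the spoofed packet:

    # Minimal sketch, not the draft's exact construction: the token is
    # derived only from a per-maker secret and the IKE SPI pair, so any
    # spoofed packet with random SPIs yields a token the maker would
    # later accept for that same SPI pair.
    import hmac, hashlib, os

    MAKER_SECRET = os.urandom(32)   # per-gateway long-term secret

    def make_qcd_token(spi_i: bytes, spi_r: bytes) -> bytes:
        return hmac.new(MAKER_SECRET, spi_i + spi_r,
                        hashlib.sha256).digest()

    # An "attacker" collecting tokens for arbitrary SPI pairs:
    token = make_qcd_token(os.urandom(8), os.urandom(8))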
Section 10.3:
"An attacker may try to attack QCD if the generation algorithm
described in Section 5.1 is used."
I do not think there is that big a difference between 5.1 and
5.2 here. Section 5.2 will limit the dictionary to one IP
address, but as the dictionary is already impossibly large, that
does not matter. I would suggest removing the reference to 5.1 in
the first sentence.
Section 10.4:
This also needs a comment that the demuxing done by the load
balancer switch MUST stay stable, i.e. it can never change.
In particular, it cannot change even when one device goes
off-line. Also, there MUST NOT be any way to bypass the load
balancer by other means (including tunneling packets in some other
tunneling protocol, adding routing headers, etc.). I would add an
even stronger warning that this setup is extremely dangerous.
Luckily, Section 10.2 already forbids this:
"This document does not specify how a load sharing
configuration of IPsec gateways would work, but in order to
support this specification, all members MUST be able to tell
whether a particular IKE SA is active anywhere in the cluster.
One way to do it is to synchronize a list of active IKE SPIs
among all the cluster members."
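To illustrate that quoted requirement, the check amounts to
something like the following on every cluster member (only a
sketch with my own names):

    # Sketch: before answering an unknown-SPI packet with a QCD_TOKEN,
    # a cluster member must consult the synchronized list of IKE SPIs
    # that are active anywhere in the cluster.
    ACTIVE_IKE_SPIS = set()   # kept in sync across all cluster members

    def may_send_qcd_token(spi_i: bytes, spi_r: bytes) -> bool:
        # Sending a QCD_TOKEN for an SA that is still alive on another
        # member would let a misrouted or tunneled packet tear that
        # SA down.
        return (spi_i, spi_r) not in ACTIVE_IKE_SPIS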
--
[email protected]
_______________________________________________
IPsec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/ipsec