Hi,
while performing stress tests, we ran into a problem with the way cookies
are handled in RFC7296.
The problem is that if network conditions are bad (a high probability of
packets being delayed, reordered or lost), some SAs fail to be established,
ending with an AUTHENTICATION_FAILED error. Consider the following scenario:

    Initiator                          Responder

 1. HDR, SAi, KEi, Ni -->
 2.                                    --> HDR, SAi, KEi, Ni
                                       (the server has a large number of
                                       half-open SAs, so it responds with
                                       a cookie generated using cookie
                                       secret K1)
 2. (the message is delayed
    in the network) ...                <-- HDR, N(COOKIE1)
 3. (the client retransmits
    its original message)
    HDR, SAi, KEi, Ni -->
 4.                                    --> HDR, SAi, KEi, Ni
                                       (the server has a large number of
                                       half-open SAs, so it responds with
                                       a cookie, but the cookie generation
                                       secret has already changed to K2,
                                       which results in a new cookie for
                                       the same request, COOKIE2)
 5.                                    <-- HDR, N(COOKIE2)
 6. HDR, N(COOKIE2) <--
 7. (the client receives the message
    and retransmits its request
    with COOKIE2)
    HDR, N(COOKIE2), SAi, KEi, Ni -->
 8.                                    --> HDR, N(COOKIE2), SAi, KEi, Ni
                                       (the server verifies COOKIE2; it is
                                       OK, since K2 is still the current
                                       secret)
 9. (the message is delayed
    in the network) ...                <-- HDR, SAr, KEr, Nr
10. HDR, N(COOKIE1) <--
11. (the delayed first message from the server, carrying COOKIE1,
    eventually reaches the client. Since the client doesn't know that
    COOKIE1 is stale, it decides that it is fresher than COOKIE2,
    because it received this message later, so the client replaces
    the cookie and resends its initial request with COOKIE1)
    HDR, N(COOKIE1), SAi, KEi, Ni --> (this message gets lost)
12. HDR, SAr, KEr, Nr <--
    (the delayed server's response eventually reaches the client. At
    this point the client thinks that IKE_SA_INIT is completed and
    starts IKE_AUTH)

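
To make steps 2 and 4 above concrete, here is a minimal sketch of why the
same request yields two different cookies once the secret rotates. It follows
the construction suggested in RFC7296 section 2.6,
Cookie = <VersionIDofSecret> | Hash(Ni | IPi | SPIi | <secret>). The choice of
SHA-256, the one-byte version ID, the function names and all sample values are
assumptions made for illustration only, not taken from any implementation.

```python
import hashlib
import hmac


def make_cookie(version_id: int, secret: bytes,
                ni: bytes, ip_i: bytes, spi_i: bytes) -> bytes:
    """Cookie = <VersionIDofSecret> | Hash(Ni | IPi | SPIi | <secret>)."""
    digest = hashlib.sha256(ni + ip_i + spi_i + secret).digest()
    return bytes([version_id]) + digest


def verify_cookie(cookie: bytes, known_secrets: dict,
                  ni: bytes, ip_i: bytes, spi_i: bytes) -> bool:
    """Recompute the cookie using the secret named by its version byte."""
    if not cookie:
        return False
    secret = known_secrets.get(cookie[0])
    if secret is None:
        return False  # the secret for this version has been retired
    expected = make_cookie(cookie[0], secret, ni, ip_i, spi_i)
    return hmac.compare_digest(cookie, expected)


# Hypothetical values describing one and the same IKE_SA_INIT request.
NI = bytes(32)                 # initiator nonce (placeholder)
IP_I = bytes([192, 0, 2, 1])   # initiator address (documentation range)
SPI_I = bytes(range(8))        # initiator SPI (placeholder)

K1 = b"cookie-secret-K1"       # secret in use at step 2
K2 = b"cookie-secret-K2"       # secret in use at step 4

cookie1 = make_cookie(1, K1, NI, IP_I, SPI_I)
cookie2 = make_cookie(2, K2, NI, IP_I, SPI_I)

print("cookies differ:", cookie1 != cookie2)
print("COOKIE2 verifies under K2:",
      verify_cookie(cookie2, {2: K2}, NI, IP_I, SPI_I))
```

Nothing in the request changed between the two calls; only the secret did, so
the client ends up holding two valid-looking cookies for one request.
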
What is interesting in the above diagram: both the client and the server
eventually complete IKE_SA_INIT, but they disagree on what the IKE_SA_INIT
message from the initiator to the responder contains.
The client thinks that the server has responded to its most recently sent
message, HDR, N(COOKIE1), SAi, KEi, Ni, while the server never received that
message and in fact responded to HDR, N(COOKIE2), SAi, KEi, Ni.
Since the initiator's AUTH payload covers its IKE_SA_INIT request, the two
sides feed different inputs into the AUTH payload calculation, and
authentication fails.
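
The mismatch can be sketched as follows. For shared-key authentication,
RFC7296 section 2.15 defines
AUTH = prf(prf(Shared Secret, "Key Pad for IKEv2"), <InitiatorSignedOctets>),
where InitiatorSignedOctets = RealMessage1 | NonceRData | MACedIDForI. The
sketch below assumes HMAC-SHA256 as the prf and uses placeholder byte strings
instead of real message encodings; every name and value in it is hypothetical.

```python
import hashlib
import hmac


def prf(key: bytes, data: bytes) -> bytes:
    """Assumed prf: HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()


def initiator_auth(shared_secret: bytes, real_message1: bytes,
                   nonce_r: bytes, maced_id_i: bytes) -> bytes:
    """AUTH over RealMessage1 | NonceRData | MACedIDForI."""
    signed_octets = real_message1 + nonce_r + maced_id_i
    return prf(prf(shared_secret, b"Key Pad for IKEv2"), signed_octets)


# Placeholders, not real IKEv2 encodings.
PSK = b"example-shared-secret"
NONCE_R = b"Nr-placeholder"
MACED_ID_I = b"MACedIDForI-placeholder"

# What the client sent last (step 11) and what the server processed (step 8).
msg_client_view = b"HDR|N(COOKIE1)|SAi|KEi|Ni"
msg_server_view = b"HDR|N(COOKIE2)|SAi|KEi|Ni"

auth_sent_by_client = initiator_auth(PSK, msg_client_view, NONCE_R, MACED_ID_I)
auth_expected_by_server = initiator_auth(PSK, msg_server_view, NONCE_R, MACED_ID_I)

print("AUTH values match:",
      hmac.compare_digest(auth_sent_by_client, auth_expected_by_server))
```

The shared secret, the responder nonce and the identity are the same on both
sides; the only differing input is the cookie carried in the first message.
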
Although this scenario may look artificial, we did observe a noticeable
number of these errors during real stress tests (up to 5% of SAs failed with
this error in bad network conditions).
What's particularly unfortunate about this:
1. Bad network conditions may be the result of a DDoS attack, which may also
cause the cookie logic to be triggered on the server.
So the two preconditions, a bad network and a server under attack, are
coupled.
2. The most disappointing thing for me is that despite the bad network
conditions, the peers did manage to complete the initial IKE exchanges,
only to get an "authentication failed" result.
3. For customers this looks surprising: they have valid credentials, but
from time to time they receive "authentication failed" diagnostics without
any clue as to why this happens.
The root of the problem is that the IKE_SA_INIT request may be retransmitted
with different content (different cookies), and the peers have no means to be
sure that they have sent and received identical messages. These possibly
different messages are later used in the AUTH payload calculation.
I believe that the proper solution would be to exclude the cookie from the
AUTH payload calculation.
The cookie is verified by the responder using the cookie generation secret,
and it is of no concern to the client (the client does not generate it, it
just echoes it back). However, this solution is obviously incompatible with
RFC7296, so it is not an option.
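
For illustration only, here is a rough sketch of what excluding the cookie
from the signed octets would achieve. It uses a toy message representation (a
list of named payloads), assumes HMAC-SHA256 as the prf, and ignores real
encoding details such as the Length field in the IKE header; all names and
values are hypothetical, and this is not a proposal for an actual wire format.

```python
import hashlib
import hmac

COOKIE_NAME = "N(COOKIE)"


def prf(key: bytes, data: bytes) -> bytes:
    """Assumed prf: HMAC-SHA256."""
    return hmac.new(key, data, hashlib.sha256).digest()


def encode(payloads, skip_cookie: bool) -> bytes:
    """Toy encoding of a message; optionally leaves out the cookie payload."""
    kept = [(name, body) for name, body in payloads
            if not (skip_cookie and name == COOKIE_NAME)]
    return b"".join(name.encode() + b"=" + body + b";" for name, body in kept)


def initiator_auth(psk: bytes, payloads, nonce_r: bytes,
                   maced_id_i: bytes, skip_cookie: bool) -> bytes:
    """AUTH over (toy) RealMessage1 | NonceRData | MACedIDForI."""
    signed_octets = encode(payloads, skip_cookie) + nonce_r + maced_id_i
    return prf(prf(psk, b"Key Pad for IKEv2"), signed_octets)


def request_with(cookie: bytes):
    """The same IKE_SA_INIT request, differing only in the echoed cookie."""
    return [("HDR", b"hdr"), (COOKIE_NAME, cookie),
            ("SAi", b"sa"), ("KEi", b"ke"), ("Ni", b"ni")]


PSK, NONCE_R, MACED_ID_I = b"example-secret", b"Nr", b"MACedIDForI"
client_view = request_with(b"COOKIE1")   # last message the client sent
server_view = request_with(b"COOKIE2")   # message the server processed

for skip in (False, True):
    a = initiator_auth(PSK, client_view, NONCE_R, MACED_ID_I, skip)
    b = initiator_auth(PSK, server_view, NONCE_R, MACED_ID_I, skip)
    print("cookie excluded:", skip, "AUTH values match:", a == b)
```

With the cookie included the two AUTH values differ; with it left out they
agree, because everything else in the request is identical.
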
Any opinions? Should this problem be addressed by the WG or ignored?
Regards,
Valery.