Michael Richardson writes:
> I think that this is the best process;  I think we need to think about the
> problem deeper.  It would be nice if it could be made to work; but I suspect
> that may be equivalent to the CAPTCHA problem.

I do think the problem is easier than that...

I think the solution is a combination of the protocol and the policy
rules. If we make some assumptions:

1) The gateway is NOT a constrained device. It is a device that can
store some data (a few tens of megabytes while under attack is
possible).

2) The client can be a constrained device, which does not have much
CPU power or memory. If it is a really constrained device, then it
usually isn't an interactive connection, so even if it needs to wait
for a while to get a connection, that is ok.

3) The client can also be a smartphone, i.e. a device which has quite
a lot of CPU power and/or memory, but does not want to use it, as that
would increase power usage so much that battery life would be
shortened.

4) The client can also be a desktop, i.e. a device with quite a lot of
CPU power and/or memory, and which is not afraid of using them :-)

5) Connections are usually quite long lived, i.e. devices make one
connection to the gateway and keep that connection up all the time, or
at least for a very long time. If they move, they use MOBIKE to keep
IKEv2 up and running even when they change IP addresses. If they go to
sleep for a long time or disconnect from the network, they will use
session resumption to come back.

6) The gateway can use IKEv2 redirect to distribute the attackers,
i.e. it could even use some cloud service which provides first-level
protection: first redirect the client during IKE_SA_INIT to a cloud
service with LOTS of computing power. The cloud service will then
authenticate the client but might not authenticate the server (for
example using one-way authentication). When the client is then
redirected back to the real server, it is given some kind of VIP-pass
cookie that allows it to get past the queue... Or, if the SGW is
willing to give authentication information to the cloud, the server
authentication can be done normally in the cloud service, a resumption
token can be given to the client, and the client can then be
redirected back to the original server. There are probably some open
issues in how all this is done and how to combine redirect, resumption
and the rest, but the protocol elements are there (except for the
VIP-pass cookie).

7) Botnets have a huge amount of CPU power and lots of memory, but
still a limited number of distinct IPv4 addresses or IPv6 prefixes (it
might be millions, but most likely it is around thousands or tens of
thousands of IP addresses).

So I think the assumptions above make it easier than the CAPTCHA
problem...

Also, the gateway can blacklist clients after failed attempts, i.e.
not accept new connections from the same IP address for some number of
seconds, or move them to the end of the queue.

This assumes that the blacklist, with one entry per IP address that
had a failed attempt, would take a few tens of megabytes of memory if
the attacker has about a million IP addresses/prefixes. Note that we
can also make the prefix length adjustable, i.e. assume attackers are
not too close (in terms of IP addresses) to the real users, and
instead of /64 we could use /56 or /48 for IPv6 addresses and /28 or
/24 for IPv4 addresses. This would reduce the size of the blacklist
table.
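
As a rough sketch of the prefix-keyed blacklist described above (in
Python; the class and function names, the default prefix lengths and
the hold time are my own illustrative assumptions, not part of any
protocol):

```python
import ipaddress
import time

# Assumed defaults; the text suggests /64, /56 or /48 for IPv6
# and /28 or /24 for IPv4.
V4_PREFIX = 24
V6_PREFIX = 56


def blacklist_key(addr, v4_prefix=V4_PREFIX, v6_prefix=V6_PREFIX):
    """Map a client address to the prefix used as the blacklist key."""
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False masks off the host bits instead of raising an error.
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


class Blacklist:
    """Failed-attempt counter per prefix, expiring after hold_seconds."""

    def __init__(self, hold_seconds=60.0):
        self.hold_seconds = hold_seconds
        self.entries = {}  # prefix -> (failure count, time of last failure)

    def record_failure(self, addr, now=None):
        now = time.monotonic() if now is None else now
        key = blacklist_key(addr)
        count, _ = self.entries.get(key, (0, now))
        self.entries[key] = (count + 1, now)

    def failures(self, addr, now=None):
        """Return the failure count, or 0 if absent or expired."""
        now = time.monotonic() if now is None else now
        key = blacklist_key(addr)
        entry = self.entries.get(key)
        if entry is None:
            return 0
        count, last = entry
        if now - last > self.hold_seconds:
            del self.entries[key]
            return 0
        return count
```

With shorter prefixes, more attacker addresses collapse into the same
entry, which is what keeps the table small; the cost is that real
users sharing a prefix with an attacker get pushed back too.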

I would expect the processing rules to be such that we collect new
connections for some amount of time, for example 100 ms, and then sort
the new connection attempts using special rules. We would move
everybody having a valid resumption ticket or VIP-pass to the front,
then move everybody with a blacklisted IP address to the end (ordered
by number of failed attempts), and finally sort the rest using some
kind of sorting order which would include whether they have a cookie
or not, how long a puzzle they have solved, etc.
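
The sorting rules above could be sketched as a sort key like this
(Python; the Request fields are hypothetical, and both the relative
weight of cookie versus puzzle length and the choice to let a valid
ticket win over a blacklist entry are my assumptions, since the rules
above do not pin those down):

```python
from dataclasses import dataclass


@dataclass
class Request:
    addr: str
    has_ticket: bool = False   # valid resumption ticket or VIP-pass
    failed_attempts: int = 0   # from the blacklist, 0 if not listed
    has_cookie: bool = False
    puzzle_bits: int = 0       # difficulty of the puzzle solved


def sort_key(req):
    """Smaller keys are processed first."""
    if req.has_ticket:
        # Front of the queue.
        return (0, 0, 0, 0)
    if req.failed_attempts > 0:
        # End of the queue, fewest failures first.
        return (2, req.failed_attempts, 0, 0)
    # Everybody else: cookie holders first, then harder puzzles first.
    return (1, 0, not req.has_cookie, -req.puzzle_bits)


def order_batch(batch):
    """Return the batch in processing order (stable for equal keys)."""
    return sorted(batch, key=sort_key)
```

Because sorted() is stable, requests with equal keys stay in arrival
order.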

Now, after we have the sorted list, we take the first connection from
the list, start processing it, after that take the next one, etc.
While we are processing the current queue, we collect the next batch
of requests. Then, after the interval has passed again (i.e. after
100 ms), we throw the old queue away (or keep the resumption and
VIP-pass clients still in the queue), create a new queue, and start
over.
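
A minimal sketch of that interval loop (Python; time is abstracted
away, so each list in batches stands for what arrived during one
interval, and capacity, i.e. how many requests the gateway gets
through per interval, is a hypothetical parameter of mine):

```python
def run_interval(pending, capacity, priority, is_vip):
    """Process one interval's queue.

    pending  -- requests to consider in this interval
    capacity -- how many requests can be processed per interval
    priority -- sort key function, smaller is served first
    is_vip   -- predicate for resumption / VIP-pass clients

    Returns (processed, carried_over).
    """
    queue = sorted(pending, key=priority)
    processed = queue[:capacity]
    leftover = queue[capacity:]
    # The old queue is thrown away, except for the VIP clients.
    carried_over = [r for r in leftover if is_vip(r)]
    return processed, carried_over


def simulate(batches, capacity, priority, is_vip):
    """Run consecutive intervals; return what was processed in each."""
    carried = []
    handled = []
    for batch in batches:
        processed, carried = run_interval(
            carried + batch, capacity, priority, is_vip
        )
        handled.append(processed)
    return handled
```

Clients dropped with the old queue are expected to retransmit, at
which point they land in a later batch and get sorted again.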

So I think the solution is something we can get working. It will be a
combination of different protocols we already have and some new
protocols like the puzzle, and it will also include a description of
how to combine all of those.
-- 
[email protected]

_______________________________________________
IPsec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/ipsec
