Re: [IPsec] [Lwip] draft-ietf-lwig-minimal-esp shepherd writeup

2021-03-21 Thread Paul Wouters

On Sun, 21 Mar 2021, Daniel Migault wrote:

(replying to some issues here, but also added a full review of the document)

Side note: I am a bit confused why this document is not a document
of the IPsecME WG. I know we talked about this before. Did we decide
against adoption at IPsecME? Can the authors, the IPsecME WG chairs, or
the responsible AD shed some light on the history here?

In general, this draft is very "wordy" because it is trying to steer
itself around a lot of problems, without making firm decisions. But
the point of an RFC is that it should make clear decisions that
implementers can adopt clearly. As such, I'm not in favour of this
draft. I believe I stated this before?


On Sat, Mar 20, 2021 at 5:12 AM Mohit Sethi M wrote:
  I am now preparing the shepherd writeup for draft-ietf-lwig-minimal-esp.
  I wanted to clarify and double check a few things:

  - If the SPI is not random and is chosen by some application specific
  method -> it can reveal the application using ESP.


It is correct that the use of non-random SPIs may have some privacy impact,
and one such impact is that in some cases an SPI may be used to track an
application. Note that our intention was to make it clear that when SPIs
are not randomly generated, there are privacy implications to consider, and
that a randomly generated SPI is preferred.


At the time I also mentioned one attack against IKE that was thwarted by
having 4 random bytes as the SPI. It remains dangerous to change this
property of ESP, and I recommended not doing that.

https://access.redhat.com/blogs/product-security/posts/sloth

But it seems that although my comments caused the draft to be modified,
it still allows non-random SPIs:

   However, for some constrained nodes, generating and handling 32 bit
   random SPI may consume too much resource, in which case SPI can be
   generated using predictable functions or end up in a using a subset
   of the possible values for SPI.  In fact, the SPI does not
   necessarily need to be randomly generated.  A node provisioned with
   keys by a third party - e.g. that does not generate them - and that
   uses a transform that does not needs random data may not have such
   random generators.  However, nonrandom SPI and restricting their
   possible values MAY lead to privacy and security concerns.  As a
   result, this alternative should be considered for devices that would
   be strongly impacted by the generation of a random SPI and after
   understanding the privacy and security impact of generating nonrandom
   SPI.

So I feel I raised a security issue, and the text just copied my concern
but still basically states implementations MAY do this. I believe this
is wrong.


Note that the draft defines one (common) way to generate the SPI value,
namely using a random generator. All other means fall into the category of
using deterministic functions. This does not necessarily mean that a fixed
set of predefined SPIs will be used; it includes, for example, the case
where only 2**16 or 2**24 values are candidates. The case where one device
has a very limited number of SPIs is quite extreme. In any case, one should
estimate how much more information the SPI leaks than the IP destination,
the use of IPsec, and the traffic pattern already do.


I'm not concerned about privacy. As you stated, it is usually pretty
clear what an IoT device is based on where it connects to. I am far
more concerned about security.


However, for some constrained nodes, generating and handling 32 bit random
SPI may consume too much resource, in which case SPI can be generated using
predictable functions or end up in a using a subset of the possible values
for SPI.


If such a device cannot generate 4 random bytes, how is it performing a
Diffie-Hellman key exchange? Or is it presumed that IKE is done
elsewhere? In that case, "elsewhere" can generate 4 random bytes.

What about IVs? If the device cannot generate 4 random bytes, how is it
going to generate the IVs required for ESP?


In fact, the SPI does not necessarily need to be randomly generated.


Yes, it does; see the above link on an attack against IKE where the
randomized SPI made offline attacks impossible and online attacks
impractical.


A node provisioned with keys by a third party - e.g. that does not generate 
them - and that uses a transform that does not need random data may not have 
such random generators.


There is a strong move to AEADs, and it would be foolish to limit IoT to
things like AES-CBC because of the IV generation.


  - When sequence numbers are time -> won't it reveal the time at which
  the packet was sent.


First the use of time is primarily driven to have an always increasing function, 
more than the value of the time itself. This could be used with a 

Re: [IPsec] [Lwip] draft-ietf-lwig-minimal-esp shepherd writeup

2021-03-21 Thread Daniel Migault
Hi Mohit,

Thanks for the review. Please find my responses inline. I have included
your comments as well as additional nits in [1]. As soon as we believe the
revised text addresses your concerns, a new version will be published.

Yours,
Daniel

[1]
https://github.com/mglt/draft-mglt-lwig-minimal-esp/commit/47f1351b1928ba687af18e75e253e98720448e8e

On Sat, Mar 20, 2021 at 5:12 AM Mohit Sethi M wrote:

> I am now preparing the shepherd writeup for draft-ietf-lwig-minimal-esp.
> I wanted to clarify and double check a few things:
>
> - If the SPI is not random and is chosen by some application specific
> method -> it can reveal the application using ESP.
>

It is correct that the use of non-random SPIs may have some privacy impact,
and one such impact is that in some cases an SPI may be used to track an
application. Note that our intention was to make it clear that when SPIs
are not randomly generated, there are privacy implications to consider, and
that a randomly generated SPI is preferred.

In general an application rarely selects the SPI value to be used. Instead,
the system is in charge of applying the security policies and selects the
SPI according to its implementation. Suppose a system is running X
applications and uses Y > X SPIs that are not randomly generated out of the
2**32 possible values. The X applications may be tunneled over one security
association. In that case, the traffic of a specific application X0 cannot
be distinguished from the traffic of the other applications. So in order to
identify one application by an SPI value, the security association needs to
be set up for that application specifically. This may happen in some cases
where the device is only running one application and uses a very limited
number of SPIs. In that case, some values may be over-represented in the
distribution of SPIs.

Note that the draft defines one (common) way to generate the SPI value,
namely using a random generator. All other means fall into the category of
using deterministic functions. This does not necessarily mean that a fixed
set of predefined SPIs will be used; it includes, for example, the case
where only 2**16 or 2**24 values are candidates. The case where one device
has a very limited number of SPIs is quite extreme. In any case, one should
estimate how much more information the SPI leaks than the IP destination,
the use of IPsec, and the traffic pattern already do.
Typically, fixed-size traffic sent to www.mytemperature.com every 5 minutes
is likely to reveal the presence of a temperature sensor, independently of
the SPI value.
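
To put rough numbers on the candidate sets mentioned above, here is a small
Python sketch. It is not from the draft, and the function name
spi_space_summary is made up for illustration. It assumes the SPI is picked
uniformly from the candidate set and only counts how many values an observer
would have to consider; it says nothing about what the SPI reveals relative
to the destination address or the traffic pattern.

```python
import math


def spi_space_summary(candidates: int) -> tuple[float, float]:
    """Return (bits, expected guesses) for a uniform pick among `candidates` values.

    Assumption: every candidate SPI is equally likely. A deterministic or
    skewed generator would give an observer an easier job than this.
    """
    bits = math.log2(candidates)
    # Guessing candidates one by one, without repeats, needs (N + 1) / 2
    # attempts on average before hitting a uniformly chosen value.
    expected_guesses = (candidates + 1) / 2
    return bits, expected_guesses


# Candidate set sizes: one random byte, the two subsets named above,
# and the full 32-bit space.
for n in (2**8, 2**16, 2**24, 2**32):
    bits, guesses = spi_space_summary(n)
    print(f"{n:>10} candidates: {bits:.0f} bits, about {guesses:,.0f} guesses on average")
```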

In conclusion, I am inclined to say that there are some cases where using a
non-randomly generated 32-bit SPI may reveal the presence of a given
application. However, for this to occur, other conditions need to be met. It
seems to me the document clearly states that privacy implications need to
be considered when these alternative methods are used. If anything appears
unclear, I am happy to clarify it.


>
> - I assume a resource-constrained device would not have many inbound
> connections. Would it make sense to generate a byte of randomness
> instead of entire 32-bit SPI? At least some APIs allow asking for a byte
> of randomness (randomByte()). This is assuming an upper limit on the
> number of inbound connections.
>


The text contrasts the 32-bit random SPI with other ways to generate the
SPI. The alternative you propose falls into the latter category. It seems
to me that the confusion may come from earlier discussions about the use of
a fixed small number of SPIs. This specific case has been generalized to
any subset of the 2**32 possible SPIs. I quote the text below from the
current draft, which I think should address your concern, but I am fine
with making it clearer.

"""
However, for some constrained nodes, generating and handling 32 bit random
SPI may consume too much resource, in which case SPI can be generated using
predictable functions or end up in a using a subset of the possible values
for SPI. In fact, the SPI does not necessarily need to be randomly
generated. A node provisioned with keys by a third party - e.g. that does
not generate them - and that uses a transform that does not need random
data may not have such random generators. However, non random SPI and
restricting their possible values MAY lead to privacy and security
concerns. As a result, this alternative should be considered for devices
that would be strongly impacted by the generation of a random SPI and after
understanding the privacy and security impact of generating non random SPI.
"""

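As an illustration of the two ends of that range, here is a minimal Python
sketch, not taken from the draft. It assumes Python's secrets module stands
in for whatever randomness source the device has; the function names and the
fixed prefix 0xC0FFEE00 are hypothetical. The first function draws a full
random SPI, the second mirrors the "one byte of randomness" suggestion by
combining a fixed prefix with one random byte. Both stay clear of the values
0 through 255, which RFC 4303 reserves.

```python
import secrets

SPI_RESERVED_MAX = 255  # RFC 4303: 0 is for local use only, 1-255 are reserved


def random_spi_32() -> int:
    """Draw an SPI uniformly from the non-reserved part of the 32-bit space."""
    # randbelow(n) returns a value in [0, n); shift it above the reserved range.
    return SPI_RESERVED_MAX + 1 + secrets.randbelow(2**32 - SPI_RESERVED_MAX - 1)


def random_spi_one_byte(prefix: int = 0xC0FFEE00) -> int:
    """Hypothetical reduced variant: fixed upper 24 bits, one random low byte.

    Only 256 SPI values are possible, so this caps the number of inbound
    security associations and makes the SPI largely predictable.
    """
    if prefix & 0xFF or not 0x100 <= prefix < 2**32:
        raise ValueError("prefix must be a non-reserved 32-bit value with a zero low byte")
    return prefix | secrets.randbits(8)


print(f"full random SPI: 0x{random_spi_32():08x}")
print(f"one-byte SPI:    0x{random_spi_one_byte():08x}")
```

Either variant still needs a check against SPIs already in use on the node
before the value is assigned to a new inbound security association.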

>
> - When sequence numbers are time -> won't it reveal the time at which
> the packet was sent.
>
> 
First the use of time is primarily driven to have an always increasing
function, more than the value of the time itself. This could be used with a
clock that is 2 years back in the past or in the