Please find my response to your comments. The current version of the file
integrates the language changes as well as changes to address the concerns
of this thread:



Section 2:
>>> It suggests a partial SPI match can be used, based on the assumption that
>>> the SPI number is known to have mostly zeros because the device only uses
>>> a hardcoded limited set (eg 257 to 260). While this is true for the
>>> outbound SPI, this may not be true for the inbound SPI, especially if the
>>> peer is not a "minimal ESP" device but a regular multipurpose OS. I think
>>> some clarification is needed for this minimum implementation optimization.
>> I probably missed the comment. I understand that a partial SPI match
>> would mean that only a subset of bytes will be considered, but I cannot
>> find text that suggests it. I also do not see any mention of special values
>> for the SPI. I also understand from the comment that the text is suggesting
>> some SPI values but I cannot find what text suggests it - at least for a
>> very small subset.
> This was based on:
>     The index may be based on
>     the full 32 bits of SPI or a subset of these bits.
> If you index it on a partial length, you need to have generated these as well 
> with reduced length or else this operation would not provide unique indexes.
> And you are not generating the peer's SPI, so those indexes need to use the
> full 32 bit too.
> Perhaps you mean to say that for indexing your own generated SPIs, you might 
> know you only need to look at a subset of bytes?
> Perhaps you can clarify the indexing sentence?
The SPI field (4 bytes) is split between the SPI value (3 bytes) and a
rekey index (1 byte).

Here is the text that I believe addresses your concern:
 Alternatively, some constrained devices will not implement IKEv2 or
Minimal IKEv2 and as such will not be able to manage a roll-over between
two distinct SAs. In addition, some of these constrained devices are also
likely to have a limited number of SAs - likely to be indexed over 3 bytes
only for example. One possible way to enable a rekey mechanism with these
devices is to use the SPI where for example the first 3 bytes designate
the SA while the remaining byte indicates a rekey index.

SPI numbers can be used to implement tracking the inbound SAs when rekeying
is taking place. When rekeying a SPI, the new SPI could use the SPI bytes
to indicate the rekeying index.
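
To make the 3 + 1 byte split concrete, here is a minimal Python sketch. It is
only an illustration under stated assumptions, not text from the draft: the
function names are hypothetical, the "first 3 bytes" are taken to be the most
significant bytes in network byte order, and the rekey index is assumed to
wrap around after 255. SA index 0 is rejected so that the resulting SPI stays
out of the 0-255 range that RFC 4303 reserves.

```python
SA_INDEX_BITS = 24     # assumption: upper 3 bytes of the SPI identify the SA
REKEY_INDEX_BITS = 8   # assumption: lowest byte carries the rekey index


def pack_spi(sa_index: int, rekey_index: int) -> int:
    """Build a 32-bit SPI from a 3-byte SA index and a 1-byte rekey index."""
    if not 1 <= sa_index < (1 << SA_INDEX_BITS):
        raise ValueError("SA index must fit in 3 bytes and be non-zero")
    if not 0 <= rekey_index < (1 << REKEY_INDEX_BITS):
        raise ValueError("rekey index must fit in 1 byte")
    return (sa_index << REKEY_INDEX_BITS) | rekey_index


def unpack_spi(spi: int) -> tuple[int, int]:
    """Split a 32-bit SPI into (SA index, rekey index)."""
    return spi >> REKEY_INDEX_BITS, spi & ((1 << REKEY_INDEX_BITS) - 1)


def next_rekey_spi(spi: int) -> int:
    """Return the SPI for the next key of the same SA (index wraps at 256)."""
    sa_index, rekey_index = unpack_spi(spi)
    return pack_spi(sa_index, (rekey_index + 1) % (1 << REKEY_INDEX_BITS))


def spi_to_wire(spi: int) -> bytes:
    """Encode the SPI as the 4 bytes that appear in the ESP header."""
    return spi.to_bytes(4, "big")
```

With this layout the inbound lookup can use only the SA index, while the rekey
index selects which key of that SA applies. Since the SPI is sent in clear,
the rekey index is visible to any observer.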

Section 2.1:
>>>        SPI that are not randomly generated over 32 bits may lead to
>>>        privacy and security concerns.
>>> The "may lead to security concerns" would be something that at the very
>>> least needs to be understood and specified in the Security Considerations
>>> section. If it is too difficult to determine the concerns, perhaps this
>>> optimization should be removed from the draft.
>>>        As a result, the use of alternative designs requires careful
>>>        security and privacy reviews.
>>> If it is known this proposal requires careful security reviews, were
>>> these done? If so, why not replace this warning of danger with the actual
>>> output of those reviews? If reviews were not done, it would imply this
>>> document hasn't fully worked out its Security Considerations.
>> The concerns are explicitly mentioned in this section with what needs to
>> be considered.
> This is mostly a language problem then. When a document states " the use
> of alternative designs requires careful security and privacy reviews",
> it implies the reader needs to do these themselves and these were not part
> of the document. I had given some language improvement suggestions
> separately that might cover this? If not, please clarify this issue.
The current text is:

 As a result, the use of alternative designs requires careful security and
privacy reviews.

>>>        SPI can typically be used to implement a key update
>>> What is a "key update" in this context? It seems this section is
>>> suggesting to use part of the SPI octet space to signal things to another
>>> part of the code on the device? If so, would that code part then clear
>>> out those overloaded SPI octets or would they go (unencrypted!) over the
>>> network for everyone to see?
>> The text says:
>> """
>> SPI can typically be used to implement a key update with the SPI
>> indicating the key is being used.
>> For example, a SPI might be encoded with the Security Association
>> Database (SAD) entry on a subset of bytes (for example 3 bytes), while
>> the remaining byte indicates the rekey index.
>> """
>> The last byte of the SPI indicates the key index and as the SPI is
>> unencrypted.
> I know what the text says, but that text is not clear. Please clarify or
> expand what "key update" is.

The text here seems clearer:
 Alternatively, some constrained devices will not implement IKEv2 or
Minimal IKEv2 and as such will not be able to manage a roll-over between
two distinct SAs. In addition, some of these constrained devices are also
likely to have a limited number of SAs - likely to be indexed over 3 bytes
only for example. One possible way to enable a rekey mechanism with these
devices is to use the SPI where for example the first 3 bytes designate
the SA while the remaining byte indicates a rekey index.

SPI numbers can be used to implement tracking the inbound SAs when rekeying
is taking place. When rekeying a SPI, the new SPI could use the SPI bytes
to indicate the rekeying index.

>>>        While the use of randomly generated SPIs may reduce the leakage or
>>>        privacy of security related information by ESP itself, these
>>>        information may also be leaked otherwise.
>>> This is not a strong argument. This sentence and the entire paragraph
>>> really seem to want to say something like "if you can see the network
>>> packets, the information leak would already be present by seeing the
>>> encrypted traffic, irrespective of whether the SPI is truly random or
>>> selected in a way that identifies the manufacturer"
>> The text says random SPI does not guarantee privacy sensitive information
>> is not leaked.
> That sentence (and the entire paragraph) don't really make any concrete
> good statement. I tried to paraphrase a better one above. If you think I
> misinterpreted your words, please provide a clearer text for the draft.

 I think the text has been removed.

>>>        The security of all data
>>>        protected under a given key decreases slightly with each message
>>> I do not know of a generic claim like this for ESP. Can a reference be
>>> provided?
>> It seems like a generic property for cryptographic keys. Quoting the
>> security consideration of RFC4106:
>> """
>> The other consideration is that, as with any encryption mode,
>> the security of all data protected under a given security
>> association decreases slightly with each message.
>> """
> Ok, it is a bit of an open door, and not really information that an
> implementer can do anything with, but consider this issue resolved.
>>> In general, rekeying is done to avoid decrypting previous traffic in case
>>> of a key compromise.
>>> Or perhaps you mean the limits of algorithms like AES_CBC (or 3DES) with
>>> respect to birthday and collision attacks? eg the commonly used maximum
>>> of 2^32-1 crypto operations (which is not the same as maximum packets)
>>> In these cases, the SN is only relevant for very high speed links, eg
>>> gbps and would never apply to an IoT device that requires minimal ESP.
>> There is also the case, where an IoT device is provisioned with a
>> lifetime key in which case the key will be used as long as the device is
>> alive. This addresses a comment from Nancy as far as I remember and I agree
>> with her.
> If you send 1 packet per second, and use the SN to represent seconds
> since SA established, you have 136 years for your constrained device.
> Anyway, this discussion does not matter as it was just my text trying to
> explain the previous item, which I said can be considered resolved.
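
The 136-year figure quoted above can be checked with a short calculation. The
sketch below is only an illustration: the function name is hypothetical, and
it assumes a 32-bit SN consumed at a constant packet rate with a year of
365.25 days.

```python
SN_BITS = 32                           # size of the ESP Sequence Number field
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: average year length


def sn_lifetime_years(packets_per_second: float, sn_bits: int = SN_BITS) -> float:
    """Years before an sn_bits-wide SN space is exhausted at a constant rate."""
    if packets_per_second <= 0:
        raise ValueError("packet rate must be positive")
    return (2 ** sn_bits) / packets_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    # One packet per second, as in the example discussed in this thread.
    print(f"{sn_lifetime_years(1.0):.1f} years")
```

At one packet per second the result is a little over 136 years. At 1000
packets per second it drops to under two months, which is where the rate of
the link starts to matter.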

>>> As noted in the TSVART review:
>>>        Also, for devices that spend significant time sleeping, the SN
>>>        would jump hugely on first waking. That shouldn't require any
>>>        larger window (unless a stale packet from prior to the sleep was
>>>        only released after a new packet on waking). But the receiver
>>>        would need to be able to somehow detect massive jumps in the high
>>>        order bits that are not communicated in the SN field.
>>> Perhaps the document can add more specific detail on how to use the
>>> commonly implemented time values into valid SNs that avoid ESN issues ?
>> The implementation itself will depend on the application - especially if
>> other mechanisms than those implemented by the incrementation counter are
>> needed. Maybe you can let me know what additional text you expect ?
> Think of an implementer that is not aware of what application will run on
> this constrained device with IPsec stack. How should they implement SN's?
> That is the advice that is missing in this document.

I added:
The RECOMMENDED way for a multipurpose ESP implementation is to increment a
counter for each packet sent.

>> Of course, one should only consider use of a clock to generate SNs if the
>> application will inherently ensure that no two packets with a given SA are
>> sent with the same time value.
>> Note however that standard receivers are generally configured with
>> incrementing counters and, if not appropriately configured, the use of a
>> significantly larger SN difference may result in the packet out of the
>> receiver's windows and that packet being discarded.
>> """
> While this is a slight hint to an implementer, it is not enough for them.
> Why not give them better advice?

This seems heavily dependent on the application. What stronger advice
would you have in mind?
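
As a starting point for that discussion, here is a minimal Python sketch of
the two SN generation strategies mentioned in this thread: a per-packet
counter and a clock-derived value. The class names, the one-second tick and
the injectable clock are assumptions made for illustration only; they are not
taken from the draft.

```python
import time

SN_MAX = 0xFFFFFFFF  # largest value of the 32-bit Sequence Number field


class CounterSN:
    """Increment a counter for each packet sent (the recommended default)."""

    def __init__(self) -> None:
        self._sn = 0

    def next(self) -> int:
        if self._sn >= SN_MAX:
            raise OverflowError("SN space exhausted, the SA must be replaced")
        self._sn += 1
        return self._sn


class ClockSN:
    """Derive the SN from elapsed seconds since the SA was set up.

    Only usable if the application never sends two packets on the same SA
    within one tick; this sketch refuses to emit a non-increasing SN.
    """

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._start = clock()
        self._last = 0

    def next(self) -> int:
        sn = int(self._clock() - self._start) + 1  # first packet gets SN 1
        if sn <= self._last:
            raise RuntimeError("two packets in the same clock tick")
        if sn > SN_MAX:
            raise OverflowError("SN space exhausted, the SA must be replaced")
        self._last = sn
        return sn
```

The counter needs one word of state per SA that has to survive sleep cycles,
while the clock variant needs a time source that keeps running while the
device sleeps. Which of the two costs less depends on the device.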

>>>        so the constrained device may not proceed to such checks
>>> The language issue here inverts the meaning. What is meant is "so the
>>> constrained device may omit such checks"
>> I agreed and changed it.
> Resolved.
>>>        TFC has not yet been widely adopted for standard ESP traffic.
>>> It is widely implemented (eg in Linux). I agree that using it seems rare.
>>> I am not convinced the reason for this is as is written. The issue I
>>> think more relates to deciding to what size to pad. The easiest is to use
>>> the MTU, but due to various encapsulation techniques (ESPinUDP, PPP-OE)
>>> it is not always clear what the MTU of the IPsec link is. And path MTU
>>> discovery with IPsec does not really work in practice.
> Note that my comment was regarding the "adopted". This can be interpreted
> as "implemented" or "used".
> Perhaps just change adopted to deployed? And change "not yet" as I presume
> you have no expectation for this to change in the future ?

I added your suggestion:

TFC has been widely implemented but it is not widely deployed for ESP

>>> But if the application/device tends to send packets between 1 and say
>>> 125 bytes, it could always pad to 125 to not leak any information by
>>> packet size. The question on when to do this or not really depends on the
>>> traffic being protected. And if this is the case, then it might be best
>>> to let the IKEv2 negotiation determine whether or not to use this - just
>>> like regular use of TFC.
>>> Regardless, TFC is optional and a minimum implementation can just omit
>>> it. Since this document would also be combined with efforts reducing
>>> sending bytes to preserve energy, it would make sense to avoid using TFC
>>> padding. Especially for sensors that for example just always send a one
>>> byte temperature value to begin with.
>>>        Such information could be used by the attacker in case a
>>>        vulnerability is disclosed on the specific device.
>>> I don't think "vulnerability" here is the issue. It could lead to
>>> exposing the size of the original packet being protected by IPsec, which
>>> could (or could not) leak information to an observer on the network.
>> What I meant is that traffic shaping can be used to identify the device,
>> which could be useful when such a device is known to be vulnerable.
>> The vulnerability comes after traffic shaping has been performed (see also
>> first section).
> Then just omit the "in case a vulnerability is disclosed". Just say the
> information could be used.
> Separate note, I personally find "traffic shaping" a confusing term as it
> seems to be used for different things.
I believe that providing an example of how the information can be used is
rather clarifying. The current text says:
Such information can be used, for example, by an attacker in case a
vulnerability is known for the specific device or application.
The term "traffic shaping" has been removed.

>>>        a minimal ESP implementation may not generate such dummy packet.
>>> I think what is meant is "MUST NOT generate".
>> MUST NOT requires RFC4303 to be updated and is probably too strong. I
>> have been asked to remove all normative text.
> Change "may not" to "may simply choose to never generate"
We have the text you proposed:

An implementation can omit ever generating and sending dummy packets.

>>> The Next Header Section is better named Dummy Packet. While it discusses
>>> the mandatory Next Header field, it really only states not to send Dummy
>>> Packets. But it almost reads as if the Next Header can be ignored or
>>> omitted.
>> This requires changing ESP and the text actually requires the Next Header
>> field to be read.
>> """
>> For interoperability, a minimal ESP implementation must discard dummy
>> packets without indicating an error.
>> """
> I am not asking you to rename a term from RFC 4303. I am asking you to
> rename the section, as the section is really about the Next Header field
> value 59 only. (I did so in the language changes I sent you separately)
I took your wording.
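
For reference, the receive-side behaviour discussed here (read the Next
Header field, silently discard when it is 59) can be sketched as follows. This
is a hedged illustration only: the function name is hypothetical, and the
input is assumed to be the ESP plaintext after decryption and integrity
verification, laid out as payload, padding, Pad Length, Next Header.

```python
NEXT_HEADER_NONE = 59  # "No Next Header" value, which marks a dummy packet


def handle_esp_plaintext(plaintext: bytes):
    """Return (next_header, payload), or None when the packet is a dummy.

    Assumes `plaintext` is the already decrypted and authenticated ESP
    content: payload || padding || pad_length (1 byte) || next_header (1 byte).
    """
    if len(plaintext) < 2:
        raise ValueError("ESP trailer is missing")
    pad_length = plaintext[-2]
    next_header = plaintext[-1]
    if next_header == NEXT_HEADER_NONE:
        return None  # dummy packet: discard without indicating an error
    if pad_length > len(plaintext) - 2:
        raise ValueError("Pad Length exceeds the packet size")
    payload = plaintext[: len(plaintext) - 2 - pad_length]
    return next_header, payload
```

The Next Header field is therefore always read, even by an implementation
that never generates dummy packets itself.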

>>>        4.  Avoid Padding by sending payload data which are aligned to
>>>            the cipher block length - 2 for the ESP trailer.
>>> Isn't this advice just moving the padding from the IPsec layer to the
>>> application layer? Eg the packet size or energy use would not be
>>> different if one implements this advice?
>> You are correct, sending padding versus application bytes does not change
>> power consumption. However, not sending padding bytes ensures you only
>> carry application information - and as such do not have an extra radio
>> frame to carry a padding.
> but if "only application data" includes "padded application data", then
> there would not be an extra radio frame? The block cipher size needs to be
> a certain size. If there is not enough application data, it is padded to
> make it so. If an application needs to send 14 bytes, I don't think it
> matters that it will add 2 dummy bytes at the application layer to "avoid
> padding" in the IPsec code. And what is sent over the wire is the same
> regardless the solution, the 16 byte block size ?
> Perhaps rephrase this to say the application should attempt to avoid
> sending a number of bytes that is just over the blocksize, but then again
> that advice is kind of odd too, since the application should just not send
> more data than needed. Eg it could use compression if that would be more
> energy efficient than sending more radio frames. And that advice is
> already true for everything the application does, irrespective of IPsec.
> So the question remains, what is this statement trying to accomplish for
> a minimum ESP implementer to do?
The intention is to recommend maximizing the information bits sent, when
padding represents no additional information. In that sense CBC is likely
not to be recommended.
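
To put numbers on the 14-byte example discussed above, here is a small Python
sketch of the padding arithmetic. The function names are hypothetical, and the
sketch assumes the cipher block size is either 1 (no block constraint) or a
multiple of 4, so that the 4-byte alignment required by ESP is covered by
taking the larger of the two values.

```python
ESP_TRAILER_LEN = 2  # Pad Length (1 byte) + Next Header (1 byte)


def esp_pad_len(payload_len: int, block_size: int) -> int:
    """Padding bytes ESP adds so payload + padding + trailer is aligned.

    Assumption: block_size is 1 or a multiple of 4, so max(block_size, 4)
    satisfies both the cipher block size and the 4-byte ESP alignment.
    """
    if payload_len < 0 or block_size < 1:
        raise ValueError("invalid length or block size")
    align = max(block_size, 4)
    return -(payload_len + ESP_TRAILER_LEN) % align


def esp_encrypted_len(payload_len: int, block_size: int) -> int:
    """Total bytes covered by encryption: payload, padding and trailer."""
    return payload_len + esp_pad_len(payload_len, block_size) + ESP_TRAILER_LEN
```

With a 16-byte block, a 14-byte payload needs no padding and fills exactly one
block, while a 15-byte payload needs 15 padding bytes and fills two. A 1-byte
payload also fills one 16-byte block, 13 bytes of which are padding.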

>>> Would it be useful to be able to signal a "minimum ESP" via IKEv2? I can
>>> imagine a simple Notify could be used to signal this. A peer receiving
>>> this could then ensure it is behaving in a "minimum ESP" compatible way
>>> even if it is a multi-purpose OS.
>> No. Minimal ESP is compatible with the standard ESP as mentioned in the
>> abstract / intro:
> Yes, but the other end might use padding or dummy packets and you could
> tell the other end to please not do that because it still takes energy to
> receive and process these.
This seems to change the scope of minimal ESP. It is probably good for an
extra minimal ESP, but as long as IKEv2 is supported, it might also support
these two features. That said, the removal of padding is probably
feasible using diet-esp.

>>> There is a bit excessive and inconsistent linking to RFC 4303 throughout
>>> the document. I think on first use of ESP the RFC can be referenced, but
>>> further the document can just talk about ESP without keeping links to
>>> RFC 4303.
>>> (I also thought there should not be any links in the abstract?)
>>> The document should maybe mention IPsec v3 is meant for "ESP". IPsec v3
>>> is a superset of IPsec v2. There is no compatibility issue because the
>>> "new" things in v3 are all negotiated via IKE.
>> The document is limited to ESP RFC4303.
>>> I don't understand "a form of partial sequence integrity", as integrity
>>> is a boolean - it passes or fails. I don't understand "partial"
>> "a form of partial sequence integrity" is in the abstract of RFC4303 which
>> defines ESP.
>> RFC4303:
>> """
>>    ESP is used to provide confidentiality, data origin authentication,
>>    connectionless integrity, an anti-replay service (a form of partial
>>    sequence integrity), and limited traffic flow confidentiality.
>>  """
> In that case I would recommend using "an anti-replay service" without the
> term in brackets.
>>> "it becomes crucial" is a bit weak. I would say it must be guaranteed
>>> that ESP on IoT remains interoperable with currently deployed ESP.
>>>        This may raise some privacy issues as an
>>>        observer is likely to be able to determine the constrained
>>>        devices of the network.
>>> This text might be better placed in a Privacy Considerations section.
>> Privacy concerns are exposed within the SPI section. Splitting the
>> section into multiple sections will make the document harder to read.
>>> The term "traffic shaping" is used in the document to refer to traffic
>>> being padded (padding or TFC). Perhaps my personal exposure to Linux has
>>> caused me to think of "traffic shaping" to mean to control the speed or
>>> flow of traffic, and not meaning "modifying traffic size".
>> Traffic shaping does not mean modifying traffic size. I see traffic
>> shaping being used to describe optimization as well as inferring activity
>> from the traffic.
> I still advise against using this term, or to define the term as you use
> it somewhere in the introduction section.
> Paul

