[IPsec] Re: Fwd: New Version Notification for draft-klassert-ipsecme-wespv2-00.txt
On Wed, Jun 05, 2024 at 04:44:25PM +0200, Boris Pismenny wrote:
> Hi,
>
> I have a few questions and a few comments on how to make this more hardware friendly, like PSP. Note that making it hardware friendly will likely improve highly tuned software performance too.
>
> **Questions**:
> (1) What is the purpose of matching the WESP2 next header and the ESP next header? won't it be better to remove/replace the ESP trailer's next header and padding?

WESPv2 acts as a wrapper around ESP, so we can't change anything in the ESP protocol itself. Also, this matching is only done if the peers agree to show parts of the inner headers by using a nonzero cryptoffset. We don't want to leak the next header if cryptoffset is zero.

> **Comments for hardware friendliness**:
> (1) Variable padding (especially per-packet) is bad for hardware. It should not change on per packet basis.

We placed this in the draft:

The chosen padding size SHOULD NOT change for a given Child SA. (Authors note: Should the padding size be negotiated?)

So maybe it can even be negotiated to meet the requirements of both peers.

> Is there any data on the benefit of aligning to 16B?

On some architectures, SIMD and AVX instructions perform better with proper alignment. The 16 bytes are mentioned because the ciphertext is already aligned on an 8 byte boundary, so you should reach at least 16 bytes if padding is used.

> From a hardware perspective the preferences regarding padding are: (a) remove it entirely; (b) use a constant amount of padding that is negotiated per-flow; or (c) add a field that explicitly indicates the length of padding in the packet.

We should have (b), we need (c), and we can discuss (a). Let's wait for other opinions.

> (2) Parsing is easier when optional/variable length fields appear at the end rather than the beginning. For example, if you could move ESP closer to the start of the header that would be good.

What do you mean by that? We could make the FID always present if that would help.

> (3) Fields that should be match-able by hardware should appear at the beginning. The FID, for example, seems like a field that should go to the beginning.

I've commented on this in the other mail.

> (4) All fields that can be constant per-flow should be pre-negotiated, such that it is possible to assume that all packets of a flow have the same flags. For example, the FID and padding length should be decided at negotiation and if they're unused then they should be prohibited for all packets of this flow.

Agree with that.

Steffen
___ IPsec mailing list -- ipsec@ietf.org To unsubscribe send an email to ipsec-le...@ietf.org
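The alignment argument in the mail above can be illustrated with a small calculation. This is only a sketch: the function name and the example header lengths are made up for illustration and are not taken from the draft. It shows how many padding bytes a sender would need so that whatever follows a header of a given length starts on a 16-byte boundary, and why that amount is constant as long as the header length is constant for a Child SA.

```python
def alignment_padding(header_len: int, alignment: int = 16) -> int:
    """Return the number of padding bytes needed so that the data following
    a header of header_len bytes starts on an alignment-byte boundary."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return (-header_len) % alignment


# Hypothetical header lengths, chosen only to show the pattern.
for header_len in (8, 12, 16, 24):
    print(header_len, alignment_padding(header_len))
```

If the header length is fixed per Child SA, the result is fixed as well, which matches option (b) from the list of hardware preferences.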
[IPsec] Re: Fwd: New Version Notification for draft-klassert-ipsecme-wespv2-00.txt
On Thu, Jun 06, 2024 at 01:54:30PM +, Doyle, Stephen wrote:
> > (3) Fields that should be match-able by hardware should appear at the beginning. The FID, for example, seems like a field that should go to the beginning.
>
> +1 for this.
>
> With the current WESPv2 header format, an intermediate device that wishes to perform ECMP using the FID will have difficulty finding the FID. For example, in the presence of padding, what is the calculation that locates the start of the FID field? The HdrLen gives the offset to the beginning of the "Rest of Payload Data (i.e. past the IV, if present ...)" but the size of the IV is algorithm dependent and won't be known to an intermediate device and so the intermediate device can't use the HdrLen field to find the end of the header and work backwards to the FID field. It would be much easier if either the FID was after the initial 4 bytes of fields. Or if for some reason the padding field needs to come before the FID field, have a field that specifies the padding length to simplify the packet parsing.

Intermediate devices should indeed be able to find the FID. I simply overlooked that this is not possible with the current format.

The FID is at the end because it is then directly followed by the ESP SPI, which is also something that intermediate devices might be interested in. If the FID is in front, there might be a gap between the two interesting fields if padding is present.

So yes, we need some padlen field, in particular because without it we also have trouble finding the start of the ESP header.

Would it still be beneficial for hardware to place the FID in front if we have a padlen field?

Steffen
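To make the padlen discussion concrete, here is a minimal parsing sketch. The header layout in it is entirely hypothetical and is NOT the WESPv2 format from the draft: it assumes one flags byte, one pad-length byte, two reserved bytes, an optional 4-byte FID, the padding, and then the 32-bit ESP SPI. The only point is that an explicit pad length lets an intermediate device find both the FID and the start of the ESP header without knowing the IV size.

```python
import struct

# Hypothetical flag bit; not defined by the draft.
FLAG_FID_PRESENT = 0x01


def locate_fields(packet: bytes):
    """Parse the hypothetical layout described above.

    Returns (fid, spi, esp_offset); fid is None if the FID flag is not set.
    """
    flags, pad_len = struct.unpack_from("!BB", packet, 0)
    offset = 4  # skip flags, pad length and two reserved bytes
    fid = None
    if flags & FLAG_FID_PRESENT:
        (fid,) = struct.unpack_from("!I", packet, offset)
        offset += 4
    offset += pad_len  # explicit pad length: no guessing needed
    (spi,) = struct.unpack_from("!I", packet, offset)
    return fid, spi, offset


# Example packet: FID present, 8 bytes of padding, then the ESP SPI.
example = (
    bytes([FLAG_FID_PRESENT, 8, 0, 0])
    + struct.pack("!I", 0xABCD)
    + bytes(8)
    + struct.pack("!I", 0x1234)
)
print(locate_fields(example))
```

In this made-up layout the FID sits at a fixed offset regardless of padding, while the SPI offset is computed from the pad-length byte; whether the real format should order the fields this way is exactly the open question in the mail.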
[IPsec] Fwd: New Version Notification for draft-klassert-ipsecme-wespv2-00.txt
Hi,

we just published a new draft defining Wrapped Encapsulating Security Payload v2 (WESPv2). It is designed to overcome limitations of the ESP protocol to expose flow information to the network in a transparent way. It introduces a flow identifier field that can be used to carry flow information, such as 'anti replay subspaces', 'VPN IDs' etc. To preserve the use case of the original WESP protocol (and to align with Google PSP), it also defines a Crypt Offset to allow intermediate devices to read some header bytes at the beginning of the inner packet. It also defines optional padding to align the ciphertext to the needs of the peers.

Steffen

- Forwarded message from internet-dra...@ietf.org -

Date: Tue, 28 May 2024 01:55:54 -0700
From: internet-dra...@ietf.org
To: Antony Antony, Steffen Klassert
Subject: New Version Notification for draft-klassert-ipsecme-wespv2-00.txt

A new version of Internet-Draft draft-klassert-ipsecme-wespv2-00.txt has been successfully submitted by Steffen Klassert and posted to the IETF repository.

Name: draft-klassert-ipsecme-wespv2
Revision: 00
Title: Wrapped ESP Version 2
Date: 2024-05-28
Group: Individual Submission
Pages: 12
URL: https://www.ietf.org/archive/id/draft-klassert-ipsecme-wespv2-00.txt
Status: https://datatracker.ietf.org/doc/draft-klassert-ipsecme-wespv2/
HTML: https://www.ietf.org/archive/id/draft-klassert-ipsecme-wespv2-00.html
HTMLized: https://datatracker.ietf.org/doc/html/draft-klassert-ipsecme-wespv2

Abstract:

This document describes the Wrapped Encapsulating Security Payload v2 (WESPv2) protocol, which builds on the Encapsulating Security Payload (ESP) [RFC4303]. It is designed to overcome limitations of the ESP protocol to expose flow information to the network in a transparent way and to align the cipher text to the needs of the sender and receiver. To do so, it defines an optional Flow Identifier where flow specific information can be stored. It also defines a Crypt Offset to allow intermediate devices to read some header bytes at the beginning of the inner packet. In particular, this preserves the original use-case of WESP [RFC5840]. Optional padding can be added for cipher text alignment.

The IETF Secretariat

- End forwarded message -
Re: [IPsec] [Last-Call] Tsvart last call review of draft-ietf-ipsecme-multi-sa-performance-06
Hi,

On Wed, Apr 17, 2024 at 12:13:19PM +, Marcus Ihlar wrote:
> I think it would be sufficient to include a paragraph that mentions that this solution can introduce packet reordering and variable delays and that packet scheduling/load-balancing implementations should take this into consideration. Without going into details on how to solve it.

I'm not sure I get your point. Can you elaborate on why this can introduce packet reordering and variable delays?

Thanks,

Steffen
___ IPsec mailing list IPsec@ietf.org https://www.ietf.org/mailman/listinfo/ipsec
Re: [IPsec] Warren Kumari's Discuss on draft-ietf-ipsecme-multi-sa-performance-06: (with DISCUSS and COMMENT)
Hi,

thanks for your review!

On Fri, Apr 26, 2024 at 02:38:27PM -0700, Warren Kumari via Datatracker wrote:
> Warren Kumari has entered the following ballot position for draft-ietf-ipsecme-multi-sa-performance-06: Discuss
>
> When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)
>
> Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ for more information about how to handle DISCUSS and COMMENT positions.
>
> The document, along with other ballot positions, can be found here: https://datatracker.ietf.org/doc/draft-ietf-ipsecme-multi-sa-performance/
>
> --
> DISCUSS:
> --
>
> Be ye not afraid -- see https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ on handling ballots, especially DISCUSS ballots...
>
> A DISCUSS is a request to have a discussion, and this is especially true in this case, because my mental model is somewhat hazy here...
>
> The document talks about negotiating multiple "Child SAs with the same Traffic Selectors". In my mental model, this looks sort of analogous to multiple parallel paths. The document doesn't seem to discuss how implementations should share traffic across these "paths" -- should it do something like ECMP and hash to try and keep packets in a flow tied to the same CPU?

Yes, packets of the same flow are tied to the same CPU. Modern NICs compute an 'RSS hash' (RSS - Receive Side Scaling) over the headers and pin packets to CPUs based on that hash. By default, the NIC usually hashes over the L3/L4 headers. Until now, IPsec implementations had to disable this feature because it causes reordering if different inner flows match the same Child SA. With this draft, each CPU has its own Child SA, so the reordering problem goes away and we can use the RSS hash for IPsec.

The same holds in the other direction. When receiving IPsec traffic, we previously had just one Child SA per Traffic Selector, so all packets had the same SPI and could not be parallelized using an RSS hash. Now each CPU has its own Child SA and with that its own SPI, which makes RSS hashing possible.

We were asked to leave out 'implementation details', so the draft does not say much about that.

> Or is this something that is done automatically by the OS already (e.g because the existing L3 logic would just view these as parallel links) and it already knows what to do? Or is this something that IPSEC implementations handle themselves? Or is my mental model so broken that my question doesn't even make sense?

Your mental model seems to be quite OK.

> The document also says that "if an implementation finds it needs to encrypt a packet but the current CPU does not have the resources to encrypt this packet, it can relay that packet to a specific CPU that does have the capability to encrypt the packet, although this will come with a performance penalty.". Cool... but does this lead to the potential of out of order packets? Is that what the "this will come with a performance penalty" is implying (in which case I'd suggest being a bit more explicit).

The cited sentence has to be reworded; it is misleading. 'Not having the resources' means here 'there is currently no Child SA installed on that particular CPU'. To avoid packet loss, packets can be redirected to a different CPU until the SA comes up. Just moving packets to a different CPU because the current CPU is overloaded would indeed introduce uncontrolled reordering.

Maybe the sentence can be reworded like this:

"if an implementation finds it needs to encrypt a packet but the current CPU does not have a CPU specific Child SA to encrypt this packet, it can relay that packet to another CPU that does have a Child SA in place to encrypt the packet, although this will come with a performance penalty."

Steffen
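The flow pinning described in this mail can be sketched as follows. This is an illustration under stated assumptions, not how any NIC or IPsec stack actually implements it: SHA-256 stands in for the NIC's RSS hash, and the `Flow` type, the function names and the SPI values are hypothetical. The sketch shows that hashing the flow tuple always selects the same CPU, and that each CPU then uses the SPI of its own Child SA.

```python
import hashlib
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical L3/L4 flow tuple."""
    src: str
    dst: str
    sport: int
    dport: int
    proto: int


def cpu_for_flow(flow: Flow, num_cpus: int) -> int:
    """Map a flow to a CPU index. SHA-256 is only a stand-in for an RSS hash."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_cpus


def spi_for_flow(flow: Flow, percpu_spi: list[int]) -> int:
    """Pick the SPI of the Child SA owned by the CPU that handles this flow."""
    return percpu_spi[cpu_for_flow(flow, len(percpu_spi))]


# Made-up SPIs, one Child SA per CPU.
percpu_spi = [0x1000 + cpu for cpu in range(4)]
flow = Flow("192.0.2.1", "198.51.100.7", 40000, 443, 6)
print(hex(spi_for_flow(flow, percpu_spi)))
```

Because all packets of one flow map to one CPU and therefore to one SA and one sequence number space, reordering between flows no longer affects the replay window of any single SA.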
Re: [IPsec] IPR confirmations for draft-ietf-ipsecme-multi-sa-performance
Same here, I am not aware of any IPR and willing to be listed as author. Steffen On Fri, Mar 15, 2024 at 07:35:59AM +1000, Paul Wouters wrote: > I am not aware of any IPR, willing to be listed as author. > > Paul > > Sent using a virtual keyboard on a phone > > > On Mar 15, 2024, at 03:55, Tero Kivinen wrote: > > > > Are any authors of the draft-ietf-ipsecme-multi-sa-performance (or > > anybody else) aware of any IPRs related to this draft? > > > > Are authors willing to be listed as authors? > > > > I will require response from author, and also updated version of the > > draft based on my review comments, before I will hit publication > > requested. > > -- > > kivi...@iki.fi
Re: [IPsec] I-D Action: draft-he-ipsecme-vpn-shared-ipsecsa-00.txt
On Mon, Mar 11, 2024 at 11:36:03AM -0400, Paul Wouters wrote:
> On Mon, 11 Mar 2024, Panwei (William) wrote:
>
> > Indeed, splitting the 32-bit SPI into two sub-fields, the VPN ID sub-field and SPI sub-field, may also be one option. This solution doesn't need to change the ESP packet format, but it also has some disadvantages.
> > The first one is the scalable issue. 256 VPN IDs may be enough for use for current RAN Sharing scenario, but when considering the service slicing feature, thousands and even more VPNs will be needed in the future. So, it's better to assign 16 bits to the VPN ID sub-field. Therefore, the SPI sub-field will be trenched to 16 bits, which means the available SPIs are 64k. This can have a negative impact on the expansion of usable scenarios in the future.
> > The second problem is the high possibility of packet disorder. Although all VPNs share one actual SA and the sender assigns sequence numbers in sequence to all the traffic no matter which VPN they belong to, different VPNs will use different SPIs in the ESP packets. This will interfere with the load balance process of the on-path routers because they usually look at the SPI field when doing the hash. This may lead to packet disorder at the IPsec receiver.
> > Therefore, we currently still prefer a separate field representing the VPN ID. But we are open to more discussions and future changes.
>
> Thanks, those arguments are clear. Perhaps Steffen can take these into consideration as well when thinking about ESPv4 / WESP :)

The root problem this draft tries to solve is that we need a way to expose information about the inner packet flow to the outer packet. Basically, this is the same problem that the sequence number subspaces draft tries to solve.

So instead of solving the same problem for every single use case in a different way, it would be nice to have a generic solution. I'm thinking of some 'Flow Identifier' field that can be used for all these cases.

The biggest problem here is that some use cases need this information to be transparent to the outer network. For instance, the sequence number subspaces information is needed by the receiving NIC to do RSS properly. For that reason, negotiating the new field does not help. The NIC can't know what you have negotiated, so the new field is useless in that case.

Unfortunately, ESP does not have a version number, so we can't change it in any transparent way. The only possibility for ESP is to use the SPI, because this is already used as the ESP 'Flow Identifier'.

At the last IETF meeting I proposed to use an updated WESP for changes that need to be transparent to the network. WESP has a version number and can be adjusted in a transparent way.

I'm currently preparing a WESPv2 draft that I plan to publish after the next IETF meeting. Some people here on the list have seen it already; that's why Paul mentioned me in his mail.

I know that switching to another protocol is always a pain, but I don't see other options to do that right.

Steffen
Re: [IPsec] RFC 4303 ESN and replay protection entanglement
On Wed, Jan 03, 2024 at 03:40:52PM -0500, Paul Wouters wrote:
> In RFC 4303 Section 3.3.3 states:
>
>    Note: If a receiver chooses to not enable anti-replay for an SA, then the receiver SHOULD NOT negotiate ESN in an SA management protocol. Use of ESN creates a need for the receiver to manage the anti-replay window (in order to determine the correct value for the high-order bits of the ESN, which are employed in the ICV computation), which is generally contrary to the notion of disabling anti-replay for an SA.
>
> This might have been good advice at the time (2005) but might not work so well anymore these days.
>
> A 100gbps card produces 100 * 1024 * 1024 *1024 / 8 / 1500 = 8947848 packets per second.
>
> Without ESN, the IPsec SA has to rekey before it hits 2^32 (4294967296) packets.
>
> So that gives the IPsec SA a lifetime of 4294967296 / 8947848 = 480 seconds before it needs to have been rekeyed to prevent running out of sequence numbers.
>
> The Nvidia ConnectX-7 can do 400gbps, so that would be a rekey within every 120 seconds without ESN. I think this shows there is a use case for ESN even if one would want no replay protection. I think it would make sense to NOT disable ESN when replay detection is disabled.

I guess the question is, why would you want to have ESN (or sequence numbers in general) if the receiver does not check it? Of course, we need to make sure that we don't transmit more than 2^64 packets with the same key. But that counter can be local to the sender in this case.

> But the question is how do current IPsec stacks keeps track of the 64bit ESN numbers without the replay-window code. I believe, but I'm not certain, that the Linux stack might not support ESN with replay-window=0.

Linux does not support that, as this is not possible as long as the sender includes the high-order bits implicitly in the ICV computation.

> I believe the nvidia driver also does not support this for their packet offload engine. So current implementations might not be able to support this.
>
> Provided the stacks would do something to support ESN without doing full replay protection, would it make sense to update this advice from RFC 4303?

RFC 4303 Section 2.2.1 states:

   If a combined mode algorithm is employed, the algorithm choice determines whether the high-order ESN bits are transmitted or are included implicitly in the computation.

So it might be possible to transmit the full 64 bit sequence number. But again, why would that make sense? The receiver will just ignore it if replay protection is disabled.

Btw., the fact that ESN is contrary to disabling anti-replay also means that the advice to disable replay protection if you have problems with reordering due to multicore processing, QoS etc. is not such a good one.

Steffen
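The rekey arithmetic quoted in this thread can be reproduced with a few lines. This is only a worked version of the numbers in the mail; the function names are made up, and it deliberately mirrors the mail's assumptions (1 Gbit taken as 1024^3 bits, 1500-byte packets, truncation to whole packets per second).

```python
def packets_per_second(link_gbps: int, packet_bytes: int = 1500) -> int:
    """Whole packets per second at line rate, using the mail's 1024^3 convention."""
    return link_gbps * 1024**3 // 8 // packet_bytes


def seconds_until_exhaustion(link_gbps: int, packet_bytes: int = 1500,
                             seq_bits: int = 32) -> float:
    """Time until a seq_bits-wide sequence number space is used up at line rate."""
    return 2**seq_bits / packets_per_second(link_gbps, packet_bytes)


for gbps in (100, 400):
    print(gbps, packets_per_second(gbps), round(seconds_until_exhaustion(gbps)))
# prints:
# 100 8947848 480
# 400 35791394 120
```

Smaller packets shorten these intervals further, since the packet rate at a given line rate grows as the packet size shrinks.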
Re: [IPsec] Fwd: New Version Notification for draft-colitti-ipsecme-esp-ping-00.txt
On Fri, Jul 28, 2023 at 08:20:03PM +0200, Antony Antony wrote:
> On Tue, Jul 25, 2023 at 07:06:47PM -0700, Lorenzo Colitti wrote:
> > Dear ipsec WG,
> >
> > When working on a VPN implementation we found that it's very difficult to rely on IPv6 ESP packets because many networks drop them, so even if IKE negotiation succeeds, the data plane might be broken. Worse, this can happen on migrate, blackholing an existing session until the problem is detected and fixed with another migration.
> >
> > In many cases, I think a simple "pre-flight check" to see if ESP is supported on a given network path could solve this problem. So after a few conversations with folks here I put together this draft. It provides the equivalent of an ESP ping packet. Comments and feedback appreciated.
>
> Thanks Lorenzo for this ID.
> I believe this is a great idea. Perhaps we could consider allowing a non-zero ESP payload size? This would facilitate correlating responses upon arrival at the sender. Then I would add an ESP message, format similar to ICMP message. For instance, incorporating an identifier, like ICMP ping has, would enable initiating multiple ESP pings from the same client and receiving corresponding responses without mixing them up.
>
> We could utilize common denominator payloads resembling ICMP and ICMPv6 ECHO and ECHO responses, as defined in rfc4443#section-4.1 and rfc792. And may be add a couple ESP specific values, especially for encrypted message use case proposed below.
>
> Additionally, it would be advantageous to support ESP Ping using encrypted ESP messages too. This would be especially useful to send ESP pings from an IPsec gateway that may or may not have an IP address from the tunnel range negotiated by it.

I would really like to see that encrypted ESP Ping too. With that, it would be easy to test liveness of particular tunnels, and it might even be possible to do authenticated PMTU probes on a tunnel.
[IPsec] Fwd: New Version Notification for draft-mrossberg-ipsecme-multiple-sequence-counters-00.txt
Hi,

we just published a new informal problem statement draft (draft-mrossberg-ipsecme-multiple-sequence-counters-00.txt) about ESP sequence number problems when using multiple CPU cores, QoS etc.

At the last working group meeting in London, there was quite some interest in working on a re-design of ESP to make it fit the multi-cpu case, QoS classes, HW offloads, multipath, multicast, etc.

This is a first approach to describe the problems we have with the current ESP protocol.

Comments welcome!

Steffen

- Forwarded message from internet-dra...@ietf.org -

Date: Mon, 27 Feb 2023 23:14:14 -0800
From: internet-dra...@ietf.org
To: Michael Pfeiffer, Michael Rossberg, Steffen Klassert
Subject: New Version Notification for draft-mrossberg-ipsecme-multiple-sequence-counters-00.txt

A new version of I-D, draft-mrossberg-ipsecme-multiple-sequence-counters-00.txt has been successfully submitted by Steffen Klassert and posted to the IETF repository.

Name: draft-mrossberg-ipsecme-multiple-sequence-counters
Revision: 00
Title: Problem statements and uses cases for lightweight Child Security Associations
Document date: 2023-02-27
Group: Individual Submission
Pages: 15
URL: https://www.ietf.org/archive/id/draft-mrossberg-ipsecme-multiple-sequence-counters-00.txt
Status: https://datatracker.ietf.org/doc/draft-mrossberg-ipsecme-multiple-sequence-counters/
Html: https://www.ietf.org/archive/id/draft-mrossberg-ipsecme-multiple-sequence-counters-00.html
Htmlized: https://datatracker.ietf.org/doc/html/draft-mrossberg-ipsecme-multiple-sequence-counters

Abstract:

IKE SAs may have one or more child SAs that are used for traffic protection. This document collects arguments for (and against) having more fine-grained sub-child-SAs. They can be used to separate data streams for various technical reasons but share the same security properties and traffic selectors. This shall allow for a more flexible use of IPsec in multiple scenarios.

The IETF Secretariat

- End forwarded message -
Re: [IPsec] Disabling replay protection
On Tue, Feb 21, 2023 at 12:45:27PM -0500, Benjamin Schwartz wrote:
> On Mon, Feb 20, 2023 at 4:58 PM Michael Richardson wrote:
>
> > Tero Kivinen wrote:
> > > I mean what should other end do if the other end says he will not do anti-replay checks?
> >
> > Not send unique relay values in the ESP.
>
> Yes but mostly for AH. My goal is related to draft-xu-risav, which would benefit from the ability to repeat sequence numbers in AH when replay protection is not required.
>
> Reusing sequence numbers is extremely unsafe in ESP. Most notably, AES-GCM fails entirely and **leaks the shared secret** if a nonce is ever reused [1].

That depends on how you create your Nonce. If you use the sequence numbers as the IV, then yes. But you are free to implement any other method as long as the IV (and with that the Nonce) does not repeat (RFC 4106). So in theory, you can do that with ESP too.

Steffen
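The point about nonce construction can be illustrated with a short sketch. In RFC 4106 the AES-GCM nonce is a 4-octet salt followed by the 8-octet IV carried in the packet. The class below is hypothetical and only demonstrates one way to keep the IV unique independently of the ESP sequence number, namely a separate sender-local counter; it performs no encryption and is not taken from any implementation.

```python
import struct


class NonceGenerator:
    """Builds 12-byte GCM nonces as salt (4 bytes) + IV (8 bytes).

    The IV comes from a private counter, so it stays unique even if the
    sequence numbers placed in the packets were to repeat.
    """

    def __init__(self, salt: bytes):
        if len(salt) != 4:
            raise ValueError("salt must be 4 bytes")
        self._salt = salt
        self._counter = 0

    def next_nonce(self) -> bytes:
        if self._counter >= 2**64:
            raise OverflowError("IV space exhausted, rekey required")
        iv = struct.pack("!Q", self._counter)
        self._counter += 1
        return self._salt + iv


gen = NonceGenerator(b"\x01\x02\x03\x04")
first, second = gen.next_nonce(), gen.next_nonce()
print(first.hex(), second.hex())
```

With such a scheme the 2^64 limit on packets per key still applies, but it is enforced by the sender's own counter rather than by the sequence number field.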
Re: [IPsec] Virtual interim about re-designing ESP?
On Tue, Nov 22, 2022 at 05:16:08PM -0500, Daniel Migault wrote: > I support Bob's suggestion. > I also believe that multicore will be addressed by design. I do want to > have some mechanisms like [1] to be included by design. That said, I would > like [1] to start on ESPv3 and take the output back to ESPv-4 as opposed to > waiting for ESP-v4. I disagree. If we touch ESP, everything should be on the table and we should consider that all together.
Re: [IPsec] Virtual interim about re-designing ESP?
On Tue, Nov 22, 2022 at 04:15:54PM -0500, Paul Wouters wrote:
> speaking with no hats on.
> On Mon, Nov 21, 2022 at 7:47 AM Steffen Klassert <steffen.klass...@secunet.com> wrote:
>
> > Is there interest in doing a virtual interim to discuss an ESP re-design?
>
> I am very interested. It is a problem that we should fix sooner rather than later.
>
> > First things to clarify would be:
> >
> > - Does the working group agree to the need of an ESP re-design?
>
> I would call it an update, not redesign, but yes :)

I don't mind what we call it, as long as we do it :)

> > - Who is interested to work on that?
>
> I am as individual and wearing my $dayjob head and wearing my libreswan hat.
>
> > - What are the problems to solve?
>
> This would be an important list to create, but I think we won't be able to do it at an interim without some prep work. I think that should happen on the list ?

I think we need to agree on which problems need to be addressed. This can happen on the list, but we should create a 'problem statement' document based on this. The document can define the scope of what we plan to work on (and what not).
Re: [IPsec] IPsecME WG Adoption call for draft-pwouters-ipsecme-multi-sa-performance
On Tue, Nov 22, 2022 at 04:58:55PM -0500, Daniel Migault wrote:
> This draft is missing an important part which is the actual negotiation of the multiple SAs. A peer willing to set these multiple SAs will have to negotiate them anyway. Some implementations can handle parallel CREATE_CHILD_SA others cannot and the negotiation of multiple SAs might take a very long time, at least a time that is not acceptable to high performance tunnels. Since these child SAs need to be created, the one willing to the multiple SAs can simply start and stop when the responder says stop. In terms of IKEv2 the gains are minimal. The document may add a mechanism similar to address that: https://datatracker.ietf.org/doc/draft-mglt-ipsecme-multiple-child-sa/

I'm one of the authors of the above mentioned draft and of the draft we are discussing here.

Speaking as an author of the above mentioned draft: That draft was a first attempt to solve the multi-cpu SA case. The mechanism used there, installing all child SAs at once, was seen as too complex, given that the number of cpus is not too high. So it should be possible either to create separate parallel child SAs, or to create them on demand when traffic shows up on a certain cpu. The draft we discuss here takes this into account and reduces the complexity to a minimum.

> However, draft-ponchon-ipsecme-anti-replay-subspaces addresses all of these issues nicely and provides a much more scalable solution. It basically makes -IMO - both -multiple-child-sa and -multi-sa-performance obsolete.

I disagree here. The multi-sa-performance draft just adds two IKE notifications, so there are no architectural changes. This is the 'low hanging fruit'; it can be done independently of any changes to ESP. The anti-replay-subspaces draft makes architectural changes to ESP, which needs more work.

> My suggestion is that -multi-sa-performance is being moved to experimental and almost shipped as it is so the work being achieved is documented. This has been some interesting work, but today, I would like the group to spend more cycles on draft-ponchon-ipsecme-anti-replay-subspaces that I consider more promising.

I already proposed to work on an ESP-v4 version, and this draft should definitely be considered there. But the discussion about ESP-v4 should be open, and not focused on this particular proposal.
[IPsec] Virtual interim about re-designing ESP?
Hi,

at the last working group meeting in London, there was quite some interest in working on a re-design of ESP to make it fit the multi-cpu case, QoS classes, HW offloads etc.

We already have some proposals that try to solve related problems in different ways:

IETF 108: https://datatracker.ietf.org/meeting/108/materials/slides-108-ipsecme-proposed-improvements-to-esp-01

IETF 115: https://www.ietf.org/archive/id/draft-ponchon-ipsecme-anti-replay-subspaces-00.txt

The Google PSP Security Protocol (PSP) is another new 'ESP like' protocol. There is some interest in standardizing PSP, so the issues that are solved there should also be considered when designing a new ESP version. Most concepts that are used in PSP are taken from IPsec ESP, so IMO this should be integrated into the IPsec protocol suite.

Is there interest in doing a virtual interim to discuss an ESP re-design?

First things to clarify would be:

- Does the working group agree to the need of an ESP re-design?
- Who is interested to work on that?
- What are the problems to solve?
- How should the problems be solved?

Please let me know if there is interest,

Steffen
[IPsec] Discussion about solving ESP limitations with parallel processing, handling QoS classes etc.
Hi,

over the last years, quite some work was done by different parties to overcome limitations of ESP in handling parallel datapaths, QoS classes etc.

Chronologically ordered, we have:

November 2019: https://datatracker.ietf.org/doc/html/draft-mglt-ipsecme-multiple-child-sa-00

That was replaced in November 2020 by: https://datatracker.ietf.org/doc/draft-pwouters-multi-sa-performance/

At IETF 108 in July 2020 there was this proposal: https://datatracker.ietf.org/meeting/108/materials/slides-108-ipsecme-proposed-improvements-to-esp-01

October 2022: https://www.ietf.org/archive/id/draft-ponchon-ipsecme-anti-replay-subspaces-00.txt

Additionally, Google published the PSP Security Protocol (PSP) for datacenters in April 2022: https://github.com/google/psp

All these proposals try to solve related problems in different ways. They all have pros and cons, but the number of proposals shows that there is a real need to solve these problems better, and sooner rather than later.

So instead of creating even more proposals, maybe we should take a step back and try to write a clear problem statement. Based on that, we can then rethink possible solutions.

The next possibility to sit together for an 'in person' discussion would be at the IETF Meeting in London. Is there anyone interested in a side meeting about that topic?

Steffen
Re: [IPsec] Discussion of draft-pwouters-ipsecme-multi-sa-performance
Hi Valery,

On Fri, Oct 21, 2022 at 05:06:44PM +0300, Valery Smyslov wrote:
> > The percpu SAs don't need locking as long as you can make sure that it is never ever accessed by a remote cpu. To guarantee this, something that does the 'dirty work' is needed. In our case that would be the fallback SA.
>
> Then how per-SAs are installed? Doesn't it require some locking?

Yes, the percpu SAs can be completely lockless if you have the fallback SA. All other solutions I've seen so far require implementing locking for the percpu SAs too. That's my whole point :-)
Re: [IPsec] Discussion of draft-pwouters-ipsecme-multi-sa-performance
Hi Valery,

On Mon, Oct 17, 2022 at 05:10:32PM +0300, Valery Smyslov wrote:
> > > > I could guess that the fallback SA *does* require locks.
> > >
> > > It also seems to me. So I see no difference if the packet can be re-steered to a different CPU, in any case we'll have performance penalty.
> >
> > The fallback SA needs locking, as it can be used from any cpu. But with the current approach, this is the only one that needs locking.
>
> Then my next question is - how the sending side decides whether to use one of the per-CPU SAs or the fallback SA? My guess is that the packet is handled by some kernel thread (i.e. by some CPU), so once this CPU figures out that it doesn't have an SA - I assume it uses the fallback SA then. Is it right?

Right.

> If so, then why it cannot hand over the packet to the CPU that for sure has the needed SA (this CPU can be indicated in the stub SA entry)?

Moving in-flight packets to a different cpu is always a pain. From an implementation point of view, that should be avoided whenever possible. Aside from the overhead it creates, we also have to care about corner cases. For example, how do you make sure that the SA on this remote cpu is still there when the packet arrives on it?

Also, where would that stub SA entry be located, in a percpu or in a global SAD? If it is in a percpu SAD, then remote cpus must update all stub SA entries when a percpu SA comes up or goes down. So we would need to lock the percpu SADs. If it is in a global SAD, then why not just negotiate keymat and use it (as a fallback)?

Instead of having a stub SA entry, I could think of encoding the information about which cpu has a valid SA into the policy. The policy is global anyway. But then we still need to re-steer the packets, which I really dislike.

Is it just that you don't like that the fallback SA MUST (or maybe SHOULD) always be up, or is it that you don't want to negotiate this additional SA at all?
Another possibility would be to use the same keymat on all percpu SAs, as was proposed in the discussion we had at our last 'IPsec coffee hour'. In that case you have a valid SA on all cpus with a single negotiation. But that would require a change to ESP, which this proposal doesn't need.

> In both cases some locking is required - does the latter case require much more locking?

The percpu SAs don't need locking as long as you can make sure that they are never ever accessed by a remote cpu. To guarantee this, something that does the 'dirty work' is needed. In our case that would be the fallback SA.

Steffen
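To illustrate the locking argument in this message, here is a minimal Python sketch. It is a toy model under stated assumptions, not the Linux implementation: the `Sad` class, its method names, and the use of plain dictionaries are all hypothetical. Each CPU owns one dictionary that only it may touch, so lookups there need no lock; the single fallback SA is shared by all CPUs and is therefore the only object guarded by a lock.

```python
import threading


class Sad:
    """Toy model: one lockless SAD per CPU plus a locked, global fallback SA."""

    def __init__(self, num_cpus, fallback_sa):
        # Hypothetical layout: percpu[i] is only ever read or written by CPU i.
        self.percpu = [dict() for _ in range(num_cpus)]
        self.fallback_sa = fallback_sa
        self.fallback_lock = threading.Lock()
        self.fallback_uses = 0  # shared state, so it is updated under the lock

    def install_percpu(self, cpu, selector, sa):
        # Assumption: this runs on `cpu` itself, so no lock is taken.
        self.percpu[cpu][selector] = sa

    def lookup(self, cpu, selector):
        """Return (sa, used_fallback) for a packet handled on `cpu`."""
        sa = self.percpu[cpu].get(selector)
        if sa is not None:
            return sa, False  # fast path: local SA, no locking
        with self.fallback_lock:  # slow path: the fallback does the 'dirty work'
            self.fallback_uses += 1
            return self.fallback_sa, True
```

In this model a CPU without its own SA never reads another CPU's dictionary and never re-steers the packet; it simply pays for the lock on the fallback SA.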
Re: [IPsec] Discussion of draft-pwouters-ipsecme-multi-sa-performance
Hi Valery,

thanks for your feedback! Some comments inline.

On Tue, Oct 11, 2022 at 05:37:29PM +0300, Valery Smyslov wrote:
> Hi all,
>
> as I promised at the last IETF meeting, this is my review of the draft-pwouters-ipsecme-multi-sa-performance draft. This is not a formal review of the document, but rather some speculations on how the solution may be simplified. Sorry that it took so long and please consider this as an invitation for discussion.
>
> I think that the performance problem is real and the document is a good starting point for solving it. That said, I think that the approach in the draft is a little bit too complicated. Since there are implementations of the draft out there, it is possible that I missed some details concerned with kernel internals limitations, but in my opinion the solution can be simplified.
>
> First, I think that the approach of having multiple IPsec SAs with identical selectors, each associated with its own CPU, is the right one for solving the performance problem.
>
> My main problem with the draft is the concept of "Fallback SA". This SA is treated specially in the draft, which I don't think is necessary. For example, it must always be up so that the outgoing packet can always be sent in case per-CPU SA does not exist. Why other existing per-CPU SAs cannot be used for this purpose?

I argued about that in my other mail.

> Another thing that I think is unnecessary is the CPU_QUEUES notify. Apart from indicating that the SA is a Fallback SA (which I think is not needed), I cannot see any usefulness of this notify. In particular - it contains the minimum number of parallel SAs peer wants to have - so for the receiving side its information makes no difference. For example, I want 3, you want 5, we end up with 5 (as the bigger), but in fact I can always send TS_MAX_QUEUE after creating 3. No difference what values are indicated, they can be arbitrary.

Sure, you can do that.
The intention behind this was to decide whether both peers can benefit from a pcpu SA setup. Consider a setup where one peer has 1000 cpus and the other peer has just 2. If the one with 1000 cpus tries to install an SA for each cpu, the other end's SAD lookup becomes very inefficient. On the other hand, each peer needs to be able to install at least as many SAs as it has cpus. Otherwise some cpus always have to use the fallback SA or need to re-steer flows to other cpus. That is inefficient too, so there should be a way to detect whether the difference is too big to use pcpu SAs efficiently on both sides. So if the other end asks for a CPU_QUEUES number that is too big, you can just say: no, let's just use one SA for all the traffic.

> I'm also not convinced that CPU_QUEUE_INFO is really needed, it mostly exists for debugging purposes (again if we get rid of Fallback SA). And I don't think we need a new error notify TS_MAX_QUEUE, I believe TS_UNACCEPTABLE can be used instead.

Ok.

> So, in my understanding, the architecture for multi-SA protocol could be as follows. An IPsec endpoint supporting this feature has several CPUs (or several cores, but let's call them CPUs). There is some mechanism that dispatches outgoing packets to different CPUs in some fashion (randomly or with round-robin algorithm or using some affinity). There also is some mechanism that deterministically dispatches incoming ESP packets to CPUs based on some information from the packets, most probably from the content of SPI.
>
> With these kernel features in mind the following IPsec architecture could be implied. The SPD is global for all CPUs, while SAD is split into several copies, so that each CPU has its own SAD. We also need to introduce a special entry in the SAD - "stub SA", that only constitutes of a selector and has no associated SA.
>
> When there is an outgoing packet on the initiator, then it is handled by one of CPUs.
> This CPU checks its own SAD and founds no SA that matches packet selector, so > it checks the SPD and finds a rule saying that this packets with this selector > must be protected with ESP. Then this CPU requests IKE for creating the ESP > SA. > IKE performs the needed actions and as a result it creates a pair of ESP SAs. > This is all usual actions with no deviation from any ordinary IPsec > implementation. > > The difference is that then IKE installs this pair of SAs only to the SAD for > that very > CPU that requested its creation. Note, that an SPI for the incoming ESP SA > should be selected in such a way, that the mechanism steering incoming packets > to an appropriate CPU must correctly steer this SPI to the CPU that this SA > is installed for. > All other SADs (for the rest CPUs) are populated with a stub SA entry, having > the > same selector and a pointer to the CPU that have real SA installed. That would require a write to all remote pcpu SADs, so the percpu
Re: [IPsec] Discussion of draft-pwouters-ipsecme-multi-sa-performance
Hi,

On Tue, Oct 11, 2022 at 07:14:32PM +0300, Valery Smyslov wrote:
> Hi Michael,
>
> > Valery Smyslov wrote:
> > > My main problem with the draft is the concept of "Fallback SA". This SA is treated specially in the draft, which I don't think is necessary. For example, it must always be up so that the outgoing packet can always be sent in case per-CPU SA does not exist. Why other existing per-CPU SAs cannot be used for this purpose?
> >
> > Because the point of the per-CPU SAs is that they are local to the CPU and so they do not require locks to access/update.
>
> True.
>
> > I could guess that the fallback SA *does* require locks.
>
> It also seems to me. So I see no difference if the packet can be re-steered to a different CPU, in any case we'll have performance penalty.

The fallback SA needs locking, as it can be used from any cpu. But with the current approach, this is the only one that needs locking. If you try to re-steer packets to a different cpu, you need to do lookups in the SAD from remote cpus to find a cpu that has an SA that can process the packet, which in turn would require locking the percpu databases too.

> > > affinity). There also is some mechanism that deterministically dispatches incoming ESP packets to CPUs based on some information from the packets, most probably from the content of SPI.
> >
> > It needs to be deterministically by SPI, or you get no locality.
>
> Yes, SPI is an obvious choice. Theoretically there may be others (e.g. use UDP encapsulation and send each ESP from a different port).

Well, on the incoming side you can always distinguish the flows as long as the sending side used different SAs. So getting parallelism there is easy by using SPI, UDP encapsulation port, or whatever.

> > > With these kernel features in mind the following IPsec architecture could be implied.
> > > The SPD is global for all CPUs, while SAD is split into several copies, so that each CPU has its own SAD. We also need to introduce a special entry in the SAD - "stub SA", that only constitutes of a selector and has no associated SA.
> >
> > The read-only copy of the SPD can be replicated per-CPU, with the counters being updated by RCU. I don't understand your stub SA use.
>
> Because we need to indicate somewhere that SA with identical traffic selectors exists for another CPU, but not for this one. This is dynamic information and it cannot reside in SPD.

We use the fallback SA as a 'stub one'. The difference is that we look it up in the global SAD and actually use it, because it has key material negotiated.

> > > This way the new SAs are created dynamically and treated equally - they all live their own life - are re-keyed or even deleted if they are idle for a long time.
> >
> > If there are SAs which are being used more than others, then there is something wrong.
>
> My point is that it doesn't matter how they are used - they live their own life. Generally with a good enough randomizing algorithm all of them should be used roughly equally.

We don't require a fallback SA in the draft, we just recommend having one. So in that sense, once created, all SAs live their own life.

Steffen
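The thread assumes that incoming ESP packets are steered to CPUs deterministically by SPI, and that the SPI of an inbound SA is chosen so that it lands on the CPU owning that SA. The Python sketch below shows one way this could work. The steering function `steer` (a plain modulo) and the helper `pick_spi_for_cpu` are hypothetical; real receive-side steering is hardware and driver specific and is not described in the thread.

```python
import secrets


def steer(spi, num_cpus):
    """Hypothetical receive-side steering: map a 32-bit SPI to a CPU index."""
    return spi % num_cpus


def pick_spi_for_cpu(cpu, num_cpus, in_use):
    """Draw random SPIs until one is unused and steers to `cpu`."""
    while True:
        spi = secrets.randbits(32)
        if spi < 256:
            continue  # SPI values 0-255 are reserved (RFC 4303)
        if spi in in_use:
            continue  # already assigned to another inbound SA
        if steer(spi, num_cpus) == cpu:
            in_use.add(spi)
            return spi
```

With a steering rule like this, the peer needs no knowledge of the receiver's CPU layout: it just uses the SPI it was given, and packets for that SA always arrive on the CPU that holds it.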
[IPsec] Announce: IPsec workshop, London, November 3rd - 4th 2022
Hi,

we plan to continue our IPsec workshop series this year, after we had to pause it for two years due to COVID-19. The workshop will take place in London, from November 3rd to 4th 2022.

Some background about the event: The IPsec workshop is organized by the 'IPsec and Network Security Association' and was first held in 2018. It started as the 'Linux IPsec workshop', but it became clear rather soon that there is a need to connect the Linux and IETF IPsec communities. So topics are not limited to the Linux implementation anymore.

Some information about the past workshops can be found here: https://linux-ipsec.org/conferences/

The schedule for the workshop is still in the works, so we don't have a full topic list yet. Some topics are:

- pCPU IPsec support in Linux / strongswan / libreswan (see https://datatracker.ietf.org/doc/html/draft-pwouters-multi-sa-performance)
- Is there a way to handle a parallel datapath inside a single child SA?
- How to do QoS in combination with an anti-replay window right?
- IPTFS IPsec support in Linux / strongswan / libreswan (see https://datatracker.ietf.org/doc/html/draft-ietf-ipsecme-iptfs)
- Stateful IPsec datapath hardware offload
- PQ IKEv2 interop testing
- Linux IPsec forwarding fastpath

In case you are in London early and think you can contribute, just let me know and I will provide more details as soon as they are available.

Steffen
Re: [IPsec] leading versus trailing ICV
On Thu, Jul 30, 2020 at 10:13:57PM -0400, William Allen Simpson wrote:
> The comments thus far seem to be mixed. This is a perennial topic. We spent much time on it in PIPE/SIPP/IPv6.
>
> We agreed on leading for AH and trailing for ESP.
>
> When I wrote the KA9Q NOS code implementing Van Jacobson's packet buffers that eventually was ported to Linux by Alan Cox, the code knew it had an incoming Ethernet or PPP frame, and offset the head on a 16-bit or 32-bit boundary as needed with enough space at the tail for all trailing bytes.

On Linux, it depends a bit on the NIC driver how that is handled. The default headroom is max(32 bytes, L1_CACHE_BYTES); space for a trailer is not reserved.

> The IP header was always on a 64-bit boundary. Hopefully, that code is still present.

The default alignment of the IP header is on a 16 byte boundary. It looks like this:

NET_IP_ALIGN(2) + ethernet_header(14) + IP_header(20/40) + L4_header(8)

Architectures can change the 2 byte NET_IP_ALIGN if they prefer DMA alignment over IP alignment.

> In modern CPUs, there's always an issue with cache lines. But for a parallel implementation, it really isn't going to matter. The CPU that finishes last and needs to check the ICV isn't particularly likely to be the CPU that processed the initial header anyway.

While that would be possible for some algorithms, I've never seen a single cipher request handled by multiple CPUs. I guess that would lead to cacheline bouncing, and for GCM an atomic synchronization of the counter would be needed. Usually parallelization is achieved by using AVX registers/instructions, where multiple cipher blocks can be handled simultaneously with a single instruction.

So it might make sense to have the ICV at the end, because it is likely cache hot when needed.
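The header layout given in this message can be checked with a few lines of arithmetic. The Python sketch below only adds up the sizes quoted above; the function name `header_offsets` is made up for illustration, and the 8 byte L4 header is the value from the message (for example UDP), not a general rule.

```python
NET_IP_ALIGN = 2  # default padding in front of the Ethernet header
ETH_HLEN = 14     # Ethernet header length in bytes


def header_offsets(ip_hlen, l4_hlen=8, net_ip_align=NET_IP_ALIGN):
    """Return the byte offsets of the IP header, L4 header and payload."""
    ip_off = net_ip_align + ETH_HLEN
    l4_off = ip_off + ip_hlen
    payload_off = l4_off + l4_hlen
    return ip_off, l4_off, payload_off


# IPv4 (20 byte header) and IPv6 (40 byte header) with the default padding:
ipv4 = header_offsets(20)
ipv6 = header_offsets(40)
# An architecture that sets NET_IP_ALIGN to 0 to favour DMA alignment:
no_pad = header_offsets(20, net_ip_align=0)
```

With the default 2 byte padding the IP header starts at offset 16 in both cases; with the padding set to 0 it starts at offset 14, which is no longer 4 byte aligned.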
Re: [IPsec] My comments on "Proposed improvements to ESP"
On Wed, Jul 29, 2020 at 03:57:01PM +0300, Tero Kivinen wrote:
> Scott Fluhrer (sfluhrer) writes:
> > As for the idea of moving the integrity check value before the encapsulated packet, well, that idea might help on your platform; however it strikes me that the advantage would likely be fairly platform dependent.
>
> Yes. In several kernel implementations the packets are formed as mbufs which are linked list of pieces of packets, and in that kind of environments there is easy ways to add/remove bytes to/from start or to/from the end without needing to copy anything.

On Linux both can happen: packets can be constructed with a scatter-gather list or as a single linear buffer. In both cases we have headroom to add headers, but tailroom is not guaranteed. So in most cases we either need to allocate a new scatter-gather list entry for the trailer, or we have to allocate a bigger buffer and copy the packet data. But yes, it depends on the platform.

> In case you have single big buffer for the whole packet then removing stuff from the end or adding stuff to the end is possible without any copying provided you have enough space in your buffer. Adding stuff to the beginning of the packet might require you to copy the whole packet...

We need space to add the headers at the beginning anyway. With the trailer, we need space on both ends.

> Also the need to read (while receiving packet) the ICV is only after you have already decrypted the whole packet (if using AEAD), and want to verify that the value matches. When the ICV is at the end of packet that means it might already be in the cpu cache, as cpu read the last bytes of the packet and the cache line might be big enough to include ICV. If we need to go back to the beginning to read the ICV it might not be in the cache anymore.

Right, this is true indeed.
Re: [IPsec] Teaser for pitch talk at IETF 108
On Wed, Jul 29, 2020 at 04:22:15PM +0300, Tero Kivinen wrote:
> Steffen Klassert writes:
> > A secret salt in the nonce would be a new requirement anyway. I've checked RFC 4106 (ESP for GCM) and RFC 7634 (ESP for ChaCha20-Poly1305), both don't require a secret salt.
>
> It is true that they do not need secret salt, but they do have unpredictable salt, which is created by the key derivation step. My understanding was that this proposal did get rid of that salt too:

Yes, this proposal removes the unpredictable salt. I did not say it explicitly, but that was part of my criticism of how they create the IV in my original mail.
Re: [IPsec] Teaser for pitch talk at IETF 108
Hi Valery, a few comments inline. On Tue, Jul 28, 2020 at 11:13:33AM +0300, Valery Smyslov wrote: > Hi, > > a few thoughts about this proposal. > > > * 64 bit sequence counters in each header to ease protocol handling and > > allow for > > replay protection in multicast groups > > This would simplify replay protection logic on receiver, but will waste > 4 bytes on the wire for unicast SAs. We have already the option to send the high sequence number bits when a combined mode algorithm is used. RFC 4303, Section 2.2.1. says: If a combined mode algorithm is employed, the algorithm choice determines whether the high-order ESN bits are transmitted or are included implicitly in the computation. We could just give multicast the same option if it wants to use replay protection. > > * Removing the trailer to ease segment & fragment handling and alignment > > I was told by Linux kernel people that having trailer in ESP is a headache > for them. > However, this simplification has its cost: > > 1. Ciphers that require padding cannot be used. I admit that CBC and the like > are out of fashion today, but I don't know which cipher modes will be in > fashion tomorrow > and what requirement for padding they will have. > 2. No Next Header field eliminates transport mode (BTW, widely used for > multicast!) > and makes it difficult to implement TFC (you can add TFC padding, but > you can't send > dummy packets and you can't use IPTFS) Instead of a trailer, an (optional) encrypted header could be used for transport mode, IPTFS etc. That could be | Pad Length | Next Header | Pad to align payload | > > > * Implicit IVs in spirit of RFC 8750 removing the need for AAD > > I don't consider removing AAD is a benefit, since all the AEAD schemes I'm > aware of > allow to have AAD. On the other hand, implicit IV is only applicable > to some transforms. 
> I'm not only talking about non counter-based AEAD ciphers (like CBC), but even for counter-based AEAD a situation is possible when there is a need for IV to be somehow structured and not be a plain counter (e.g. if you implement key trees).

Right, we should always have the option to include an explicit IV, as the IV construction depends on the algorithm used.

> > Further details and benchmark results may be found in a paper preprint [1] and a presentation [2] we held with at the Linux IPsec Workshop.
>
> A few more considerations.
>
> It seems that performance of this proposal depends on ICV size for the plaintext to be properly aligned. If ICV is 16 bytes, then plaintext is ideally aligned on 32 bytes, but if one use 12 bytes ICV (e.g. ENCR_AES_GCM_12) then the plaintext is aligned on 4 bytes, that is even worse than ESP, where it is for most AEAD transforms aligned on 16 bytes.

We could pad a 12 byte ICV up to 16 bytes, but I have to admit that this might not be the best option.

> Since this proposal allows only tunnel mode, it will have larger overhead for small packets. This is partially compensated by having IV to be implicit...
>
> And about security. In order to have IV combined with ESN and sub-windows identifiers this proposal removes secret salt from the nonce. This may have impact on security. I'm not a cryptographer, but I believe the impact is not negligible. On the last CFRG session a draft draft-wood-cfrg-aead-limits was discussed that calculates limits of data to be safely encrypted by various AEAD ciphers. The authors claimed that having secret salt in the nonce increases this limits in case of multi-user attacks and that the results in the draft are calculated for this case. If plain AEAD ciphers (with no secret salt) are used the limits are lower.

A secret salt in the nonce would be a new requirement anyway.
I've checked RFC 4106 (ESP for GCM) and RFC 7634 (ESP for ChaCha20-Poly1305); neither requires a secret salt. But I'm not sure whether the IV construction in this proposal would always be a good choice. As far as I understand, the Sender ID is only used with multicast, so it will most likely be zero on unicast. The replay window ID will also have a lot of zero bits on unicast (given that most nodes have far fewer than 2^16 cpus these days).

Steffen
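To make the nonce discussion concrete, the Python sketch below contrasts two constructions. The first follows RFC 4106, where the 12 byte GCM nonce is a 4 byte salt (taken from the keying material) followed by the 8 byte per-packet IV. The second, `structured_nonce`, is purely hypothetical: the field widths (16 bit Sender ID, 16 bit replay window ID, 64 bit sequence number) are assumptions for illustration and are not taken from the proposal under discussion.

```python
import struct


def rfc4106_nonce(salt, iv):
    """RFC 4106 style nonce: 4 byte salt followed by the 8 byte IV."""
    if len(salt) != 4 or len(iv) != 8:
        raise ValueError("expected a 4 byte salt and an 8 byte IV")
    return salt + iv


def structured_nonce(sender_id, window_id, seq):
    """Hypothetical salt-free nonce: Sender ID | window ID | sequence number."""
    # Assumed layout for illustration only: 16 + 16 + 64 bits, big endian.
    return struct.pack("!HHQ", sender_id, window_id, seq)


def zero_bytes(nonce):
    """Count the all-zero bytes in a nonce."""
    return sum(1 for b in nonce if b == 0)


salted = rfc4106_nonce(b"\x9c\x41\xe7\x2b", (7).to_bytes(8, "big"))
unicast = structured_nonce(0, 3, 7)
```

In the second construction a unicast SA on a small machine yields a nonce that is almost entirely zero bytes, which is the concern raised above: without the unpredictable salt, little of the nonce differs between SAs.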
Re: [IPsec] Early Allocation Request for IPTFS_PROTOCOL IP protocol number.
On Sun, Jun 07, 2020 at 09:43:41PM -0400, Michael Richardson wrote:
> Steffen Klassert wrote:
> > This alternative usecase tries to solve the 'small packet' tunneling problem. Sending small packets over a tunnel usually creates quite a lot of overhead, as each packet needs to get its own tunnel header etc. For IPsec, the situation is even worse as a cpu intensive crypto operation has to be applied for each of these small packets. With the IPTFS_PROTOCOL payload type, we could group small packets and send them as one big packet over the tunnel. This can avoid tunneling overhead because we need only one tunnel header for multiple packets. Also this method would be very data and instruction cache efficient because multiple packets are processed together. The good thing is that the Linux forwarding path can already provide packet chains (GRO), so we would just need to take these packet chains and put them into big tunnel packets with IPTFS_PROTOCOL payload type. As a side effect, having IPTFS_PROTOCOL as a general purpose tunnel payload, it might be easier to argue for a new IP protocol number allocation.
>
> Does your use case include situations where this is not an IPsec tunnel?

My main usecase would be an IPsec tunnel, but the same can work for other tunnel types too. If we have an IP protocol number, it is easy to use it outside of the IPsec world, but I don't have a strong opinion on that.

Steffen
Re: [IPsec] Early Allocation Request for IPTFS_PROTOCOL IP protocol number.
On Tue, Jun 02, 2020 at 11:56:48AM -0400, Christian Hopps wrote:
> > On Jun 2, 2020, at 10:21 AM, Tero Kivinen wrote:
> > Christian Hopps writes:
> >
> > I would assume those questions are going to be asked from chairs or area directors during the process anyways, so we need to have good answers to them ready (and for me it would be quite hard to explain why we cannot use wrapped ESP, or dummy packet trick as I think we can do those and we do not need IP protocol number).
> >
> > Note, that if the answer is going to be that we want to use this also when we are not using IPsec, then this is even bigger can of worms, as that would most likely mean that this work does not belong to the IPsecME working group, but should be part of completely different area...
>
> As I mentioned above, people have already expressed interest in possibly using the IPTFS framing outside of IPsec for some of its positive non-IPsec properties. This doesn't mean we have to boil the ocean and standardize the framing outside of IPsec, it just means we should be considerate about the possible re-use while we do our work.

I'm one of those who were interested in a different usecase for the IPTFS payload. My plan was to present it at the Linux IPsec workshop this year. Unfortunately, this workshop was canceled because of the COVID-19 outbreak, so I never got feedback on the idea. Below is the abstract of this presentation, which sketches the idea (just in case somebody finds it useful).

- An alternative usecase for IPTFS_PROTOCOL payload type tunnels.

This alternative usecase tries to solve the 'small packet' tunneling problem. Sending small packets over a tunnel usually creates quite a lot of overhead, as each packet needs to get its own tunnel header etc. For IPsec, the situation is even worse as a cpu intensive crypto operation has to be applied for each of these small packets.
With the IPTFS_PROTOCOL payload type, we could group small packets and send them as one big packet over the tunnel. This can avoid tunneling overhead because we need only one tunnel header for multiple packets. Also this method would be very data and instruction cache efficient because multiple packets are processed together. The good thing is that the Linux forwarding path can already provide packet chains (GRO), so we would just need to take these packet chains and put them into big tunnel packets with IPTFS_PROTOCOL payload type. As a side effect, having IPTFS_PROTOCOL as a general purpose tunnel payload, it might be easier to argue for a new IP protocol number allocation.

Steffen
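The saving described in the abstract can be estimated with a small Python sketch. It is a simplification under stated assumptions: `aggregate` and `wire_bytes` are made-up helper names, the 60 byte per-tunnel-packet overhead is an illustrative figure rather than a measured one, and the per-packet framing that a real IPTFS payload carries is ignored, as is fragmentation of packets across tunnel payloads.

```python
def aggregate(packet_sizes, max_payload):
    """Greedily group inner packets, in order, into tunnel payloads."""
    batches, current, used = [], [], 0
    for size in packet_sizes:
        if size > max_payload:
            raise ValueError("fragmentation is out of scope for this sketch")
        if used + size > max_payload:
            batches.append(current)  # current payload is full, start a new one
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        batches.append(current)
    return batches


def wire_bytes(batches, per_tunnel_overhead):
    """Total bytes on the wire: inner data plus one overhead per tunnel packet."""
    return sum(sum(batch) + per_tunnel_overhead for batch in batches)


small = [100] * 28                       # 28 inner packets of 100 bytes each
one_by_one = [[size] for size in small]  # classic tunneling: one header each
grouped = aggregate(small, 1400)         # aggregated into 1400 byte payloads
```

With these numbers, 28 tunnel packets (and 28 crypto operations) shrink to 2, and the bytes on the wire drop from 4480 to 2920 under the assumed 60 byte overhead.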
Re: [IPsec] Some thoughts regarding draft-hopps-ipsecme-iptfs-01
On Mon, Dec 02, 2019 at 10:57:59AM -0500, Christian Hopps wrote:
> > On Dec 2, 2019, at 9:11 AM, Steffen Klassert wrote:
> > On Mon, Dec 02, 2019 at 06:22:26AM -0500, Christian Hopps wrote:
> >>> On Dec 2, 2019, at 3:01 AM, Steffen Klassert wrote:
> >>> On Thu, Nov 28, 2019 at 04:49:36PM +0300, Valery Smyslov wrote:
> >>>
> >>>> 4. I'd like to see more text in the draft regarding reassembling of incoming packets.
> >>>
> >>> Yes, I think some words on how to reassemble the fragments are really needed.
> >>>
> >>>> It seems to me that it can be done pretty easy by linking the reassembly logic with replay protection window.
> >>>
> >>> While it looks like doing the reassembling based on ESP sequence numbers might be an easy approach, it could be also dangerous.
> >>>
> >>> Consider a system that encapsulates two flows on different cpus with the same SA. This system can TX packets in the following order:
> >>>
> >>> TX cpu0 inner flow0 SA0:
> >>>
> >>> Offset: 0                           Offset: 100
> >>> [ ESP1 (1500) ]                     [ ESP3 (1500) ]
> >>> [--800--][--800-                    -][-1400---]
> >>>
> >>> --
> >>> TX cpu1 inner flow1 SA0:
> >>>
> >>>                   Offset: 0                           Offset: 100
> >>>                   [ ESP2 (1500) ]                     [ ESP4 (1500) ]
> >>>                   [--800--][--800-                    -][1400]
> >>>
> >>> On the receive side, it is not that clear how to reassemble the fragments from ESP3 and ESP4 into the fragments from ESP1 and ESP2. Maybe some packet ID in the IP-TFS header could help to identify related fragments.
> >>
> >> Indeed the code mustn't fragment this way. :)
> >>
> >> We could add a bit of text that one should avoid this mistake.
> >
> > I'm not so sure whether the receiver should rely on that the sender did the fragmentation right. I think a packet ID, like IP fragments have, would just solve the problem. The implementation would not even need to care about this multicore race then.
> > The receiver can do any number of wrong things with what it sends, but I'd > normally call those bugs. :) Yes, that's true. But if the protocol allows to do things wrong, it is a bug in the protocol :) Maybe you can just make it clear at the sender side by saying something like 'Fragments must be sent ordered and ESP encapsulated with consecutive sequence numbers.' > > Technically though, attaching a packet ID to the fragments to allowing > sending them in any order saves only a little on code complexity (i.e., not > using an ordering queue) on the sender side; I hoped to avoid such an ordering/serialization queue as I fear this will become a bottleneck. With the current design, I don't see how to do this without a queue. I know, it is an implementation detail, but implementation matters too :) > however, it seems to add a disproportionate amount of complexity to the > receiver/reassembly (which could e.g., be aggregating VPN server). Yes, if you know that the next fragment comes with the next ESP packet, things are much easier. So we have the complexity either at the sender or the receiver side, not so sure what performs better. > Adding a packet ID also means that you can't just chain the inner traffic > buffers together to form the IP-TFS payload as you must now insert an extra > header between each of the inner packets, this is going to affect performance > and memory use on whitebox/software based deployments as well as reduce > available bandwidth on the tunnel. Good point. You can always chain extra headers and inner packets with a scatter-gather list, but it will have some performance impact. > I think we should try and keep fragmentation and reassembly as simple as > possible so that it is easy to implement and get right. I absolutely agree here. 
> Having coded this using just the ESP sequence numbers to correct-order the > received packets, I can say it's an easy-to-moderate complex function that > performs well, it took a few iterations on the code to get it right and well
Re: [IPsec] Some thoughts regarding draft-hopps-ipsecme-iptfs-01
On Mon, Dec 02, 2019 at 06:22:26AM -0500, Christian Hopps wrote:
> > On Dec 2, 2019, at 3:01 AM, Steffen Klassert wrote:
> > On Thu, Nov 28, 2019 at 04:49:36PM +0300, Valery Smyslov wrote:
> >
> >> 4. I'd like to see more text in the draft regarding reassembling of incoming packets.
> >
> > Yes, I think some words on how to reassemble the fragments are really needed.
> >
> >> It seems to me that it can be done pretty easy by linking the reassembly logic with replay protection window.
> >
> > While it looks like doing the reassembling based on ESP sequence numbers might be an easy approach, it could be also dangerous.
> >
> > Consider a system that encapsulates two flows on different cpus with the same SA. This system can TX packets in the following order:
> >
> > TX cpu0 inner flow0 SA0:
> >
> > Offset: 0                           Offset: 100
> > [ ESP1 (1500) ]                     [ ESP3 (1500) ]
> > [--800--][--800-                    -][-1400---]
> >
> > --
> > TX cpu1 inner flow1 SA0:
> >
> >                   Offset: 0                           Offset: 100
> >                   [ ESP2 (1500) ]                     [ ESP4 (1500) ]
> >                   [--800--][--800-                    -][1400]
> >
> > On the receive side, it is not that clear how to reassemble the fragments from ESP3 and ESP4 into the fragments from ESP1 and ESP2. Maybe some packet ID in the IP-TFS header could help to identify related fragments.
>
> Indeed the code mustn't fragment this way. :)
>
> We could add a bit of text that one should avoid this mistake.

I'm not so sure whether the receiver should rely on the sender having done the fragmentation right. I think a packet ID, like IP fragments have, would just solve the problem. The implementation would not even need to care about this multicore race then.

Steffen
Re: [IPsec] Some thoughts regarging draft-hopps-ipsecme-iptfs-01
Hi Valery,

On Mon, Dec 02, 2019 at 11:28:16AM +0300, Valery Smyslov wrote:
> Hi Steffen,
>
> > > It seems to me that it can be done pretty easily by linking the
> > > reassembly logic with the replay protection window.
> >
> > While it looks like doing the reassembling based on ESP sequence numbers
> > might be an easy approach, it could also be dangerous.
> >
> > Consider a system that encapsulates two flows on different cpus
> > with the same SA. This system can TX packets in the following
> > order:
> >
> > TX cpu0 inner flow0 SA0:
> >
> > Offset: 0        Offset: 100
> > [  ESP1 (1500)  ][  ESP3 (1500)  ]
> > [--800--][--800--][----1400------]
> >
> > --
> > TX cpu1 inner flow1 SA0:
> >
> > Offset: 0        Offset: 100
> > [  ESP2 (1500)  ][  ESP4 (1500)  ]
> > [--800--][--800--][----1400------]
> >
> > On the receive side, it is not that clear how to reassemble the fragments
> > from ESP3 and ESP4 into the fragments from ESP1 and ESP2. Maybe some
> > packet ID in the IP-TFS header could help to identify related fragments.
>
> I'm probably missing something here, but I think that the sending side
> assigns every outgoing IP packet to some SA. Then the packet is added to
> the ESP message (that may already contain previous packets). If the packet
> cannot fit into the remaining space, it is split and the rest of the packet
> is sent in the next ESP message of the same SA.

All packets are sent over the same SA, but on different cpus. This means
that the 'rest' might not be in the next ESP message. The other cpu could
have TXed some ESP packets in between; it is a race.

In this example, flow0 is encapsulated on cpu0 and flow1 is encapsulated on
cpu1, both on the same SA. ESP1 contains flow0, but ESP2 contains flow1.
The 'rest' from flow0 is encapsulated in ESP3, the 'rest' from flow1 is
encapsulated in ESP4. So I think it is not clear how to reassemble
correctly here.

Steffen
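The race can be replayed in a few lines of Python. This is a toy model, not the draft's wire format: `Esp`, `head`, and `tail` are made-up names, where `head` is the piece at the start of a payload that finishes an earlier inner packet (its length is the offset in the diagram) and `tail` is the unfinished piece at the end of the payload. The receiver joins each `head` to the `tail` of the previous sequence number, which is exactly the "rest is in the next ESP message" assumption.

```python
from dataclasses import dataclass


@dataclass
class Esp:
    seq: int     # ESP sequence number
    head: bytes  # start of payload: remainder of an earlier inner packet
    tail: bytes  # end of payload: beginning of an inner packet that did not fit


def join_by_sequence(esps):
    """Rebuild split inner packets, assuming the rest is in the next sequence number."""
    rebuilt = []
    pending = b""
    for esp in sorted(esps, key=lambda e: e.seq):
        if esp.head:
            rebuilt.append(pending + esp.head)
        pending = esp.tail
    return rebuilt


# flow0 bytes are "A", flow1 bytes are "B"; the sizes follow the diagram.
one_cpu = [
    Esp(1, b"", b"A" * 700),
    Esp(2, b"A" * 100, b""),
]
two_cpus = [
    Esp(1, b"", b"A" * 700),  # cpu0, flow0
    Esp(2, b"", b"B" * 700),  # cpu1, flow1
    Esp(3, b"A" * 100, b""),  # cpu0, rest of flow0
    Esp(4, b"B" * 100, b""),  # cpu1, rest of flow1
]

good = join_by_sequence(one_cpu)
bad = join_by_sequence(two_cpus)
```

With a single sender, `good` is the one 800-byte flow0 packet. With two CPUs, `bad` holds an 800-byte packet that mixes 700 bytes of flow1 with 100 bytes of flow0, plus a stray 100-byte piece of flow1; neither original packet is recovered.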
Re: [IPsec] Some thoughts regarging draft-hopps-ipsecme-iptfs-01
On Thu, Nov 28, 2019 at 04:49:36PM +0300, Valery Smyslov wrote:
> Hi,
>
> after reading through draft-hopps-ipsecme-iptfs-01 I have some thoughts.
>
> 1. I think it's a wrong decision to support tunnel mode ESP only. IP-TFS
>    for transport mode ESP is equally important because one of the widely
>    used scenarios is to combine general purpose tunneling (like GRE) with
>    transport mode ESP. In this case traffic flowing over such an SA will in
>    fact be tunnel traffic from several hosts, but the SA is created in
>    transport mode. For this reason I think that IP-TFS must support
>    transport mode SAs as well.

I'd like to agree here. It does not add much more complexity and there
are valid use cases for transport mode (and even for BEET mode).

> 4. I'd like to see more text in the draft regarding reassembling of
>    incoming packets.

Yes, I think some words on how to reassemble the fragments are really
needed.

>    It seems to me that it can be done pretty easily by linking the
>    reassembly logic with the replay protection window.

While it looks like doing the reassembling based on ESP sequence numbers
might be an easy approach, it could also be dangerous.

Consider a system that encapsulates two flows on different cpus
with the same SA. This system can TX packets in the following
order:

TX cpu0 inner flow0 SA0:

Offset: 0        Offset: 100
[  ESP1 (1500)  ][  ESP3 (1500)  ]
[--800--][--800--][----1400------]

--
TX cpu1 inner flow1 SA0:

Offset: 0        Offset: 100
[  ESP2 (1500)  ][  ESP4 (1500)  ]
[--800--][--800--][----1400------]

On the receive side, it is not that clear how to reassemble the fragments
from ESP3 and ESP4 into the fragments from ESP1 and ESP2. Maybe some
packet ID in the IP-TFS header could help to identify related fragments.

Steffen