William Allen Simpson writes:
> > So it might make sense to have the ICV at the end because it is
> > likely cache hot when needed.
>
> But after removing padding for these stream algorithms, then the ICV is
> very likely not aligned. For zero-copy RDMA, it is rather inconvenient.
> And the IP header cache lines are likely still hot.
>
> Anyway, that's why I'd like to consider at least a negotiated option --
> as long as it's possible to implement efficiently in Linux and others.
> We need to hear from more implementers.

I think it would be a bad idea to have such an option. Yes, it might
offer small gains in environments where the option is picked exactly
when both ends like it, but when one end picks it in a way the other
end does not like, we usually end up with big performance penalties.

It also multiplies the testing effort again, as you now need to add
this option to your test suites in combination with all possible
ciphers etc. This was one of the main problems with IKEv1: there were
so many combinations of different options that testing all of them
required thousands or tens of thousands of test cases.

An example of a similar optimization causing problems was found during
interop events, when some implementation (I think it was Linux) was
sending fragmented packets in reverse order, so the first network
packet sent/received was the last fragment. The idea was that when the
receiver saw the last fragment, it immediately knew how big the final
packet would be and could allocate a big enough buffer for it.

When you combined that with IPsec with per-flow policy, meaning each
TCP/UDP flow might be using a different SA, the SGW had to store all
fragments in memory until it got the last packet from the network,
which was the first fragment. Only after that could it check whether
this packet was allowed to pass and which SA it needed to use. Then it
sent the fragments out, but this added a lot of latency. Actually, I
think there were also implementations which did not even store the
later fragments: they simply checked them, found that they had not yet
seen the first fragment that would allow them to pass, and dropped
them.
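
The gateway behaviour described above can be sketched roughly as
follows. This is only an illustration, not any real implementation:
the names Fragment, Gateway, make_fragments and run are hypothetical,
the policy is reduced to a set of allowed destination ports, and the
fragment offset is simplified to an index where 0 is the first
fragment (the only one carrying the TCP/UDP header).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Fragment:
    frag_id: int             # IP identification shared by all fragments
    offset: int              # simplified: 0 is the first fragment
    dst_port: Optional[int]  # only the first fragment carries the port


class Gateway:
    """Hypothetical SGW applying per-flow policy to fragments."""

    def __init__(self, allowed_ports: Iterable[int], buffer: bool = True):
        self.allowed_ports = set(allowed_ports)
        self.buffer = buffer                      # False: drop unknown fragments
        self.verdict: Dict[int, bool] = {}        # frag_id -> allowed?
        self.pending: Dict[int, List[Fragment]] = {}
        self.max_buffered = 0

    def receive(self, frag: Fragment) -> List[Fragment]:
        """Return the fragments that can be forwarded right now."""
        if frag.offset == 0:
            # The first fragment decides the policy for the whole packet.
            ok = frag.dst_port in self.allowed_ports
            self.verdict[frag.frag_id] = ok
            held = self.pending.pop(frag.frag_id, [])
            if not ok:
                return []
            return [frag] + sorted(held, key=lambda f: f.offset)
        if frag.frag_id in self.verdict:
            return [frag] if self.verdict[frag.frag_id] else []
        if not self.buffer:
            return []  # first fragment not seen yet: drop
        held = self.pending.setdefault(frag.frag_id, [])
        held.append(frag)
        self.max_buffered = max(self.max_buffered, len(held))
        return []


def make_fragments(frag_id: int, count: int, dst_port: int) -> List[Fragment]:
    return [Fragment(frag_id, i, dst_port if i == 0 else None)
            for i in range(count)]


def run(order: Iterable[Fragment], buffer: bool = True) -> Tuple[int, List[int]]:
    """Feed fragments to a gateway; return (max buffered, forwarded offsets)."""
    gw = Gateway(allowed_ports={443}, buffer=buffer)
    sent: List[Fragment] = []
    for frag in order:
        sent.extend(gw.receive(frag))
    return gw.max_buffered, [f.offset for f in sent]
```

With four fragments sent in order, run() never buffers anything. With
the same fragments reversed, the buffering gateway holds three of them
until the first fragment arrives, and the non-buffering variant
forwards only the first fragment.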

The SGW of course then sent the fragments out in order, so that the
receiving SGW could efficiently do exit tunnel checks (i.e., check
from the first fragment that this packet should be allowed, match
later fragments with the same fragment ID to it, and allow them to
pass), and so it did not need to do the same buffering. Of course the
final destination host now did not benefit at all from this
optimization, as it was negated by the SGWs in the middle.

So as soon as the options to use start to depend on the platform,
implementation, etc., it gets harder and harder to find out which
combination is optimal between two devices, and they might end up
using a suboptimal feature set. And all of those add more complexity
and more combinations that need to be tested.

Of course if this option is added, it should be something like IPCOMP
or transport mode, meaning it is off by default, and everybody MUST
implement the case where it is not enabled. Then you would only
implement (and propose) it in cases where it actually benefits you,
and everybody else would never ever implement it.
--
[email protected]

_______________________________________________
IPsec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/ipsec
