Eric,

To be honest, coming from an embedded network systems background, it’s always kinda bugged me that SSL/TLS uses a 5-byte header.
I’ve had dealings with cryptographic hardware (SafeNet at Altiga, and Cavium Nitrox at Cisco), and they have to jump through hoops to deal with unaligned access. The Cavium Nitrox chip is able to do TLS record processing on its own (it’s not just simple crypto), and network processors like aligned data.

Embedded systems don’t have the luxury of an mbuf-type buffer scheme (as you describe for NSS). Many have Ethernet-frame-sized buffers in locked/pinned memory that read in a whole Ethernet frame, and then strip off headers by advancing pointers into the frame. This minimizes copies, and the goal is to have a zero-copy network stack.

Once the 5-byte TLS header is reached, the data to be decrypted is no longer aligned. That requires either handling unaligned access or copying the memory to an aligned buffer, both of which hurt performance. If the cryptography is to be offloaded to a co-processor, then depending on the chip, the encrypted portion must be aligned, and thus must be copied.

This is a common architecture for Cisco IOS, Catalyst and other high-performance routers and VPN servers (e.g. Cisco ASA).

Other references regarding aligned vs. unaligned:

While this reference is old (~6 years), this guy tested aligned vs. unaligned access on the Intel x86 architecture, and found that there was a difference in performance:
http://www.alexonlinux.com/aligned-vs-unaligned-memory-access

Intel recommends data alignment in structures (which is basically what a record_header is) for performance/cache reasons:
https://software.intel.com/en-us/articles/coding-for-performance-data-alignment-and-structures

Old reference from IBM (2005) regarding aligned vs. unaligned access:
http://www.ibm.com/developerworks/library/pa-dalign/

So while RISC processors suffer the most from unaligned data, x86 processors can suffer from performance issues as well.

--
-Todd Short
// [email protected]
// "One if by land, two if by sea, three if by the Internet."
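To make the zero-copy argument above concrete, here is a minimal C sketch of stripping the TLS record header by advancing a pointer and then falling back to a copy when the payload is not word-aligned. It assumes the record itself starts on a 4-byte boundary, which a real frame layout may not guarantee. The names `is_aligned`, `tls_strip_header` and `payload_for_engine` are hypothetical helpers invented for illustration; they are not taken from any Cisco, Cavium or NSS code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* TLS record header: type (1) + version (2) + length (2). */
#define TLS_HDR_LEN 5u

/* Nonzero when p sits on an `align`-byte boundary (align must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Zero-copy header strip: step past the record header, moving no data. */
static const uint8_t *tls_strip_header(const uint8_t *record)
{
    return record + TLS_HDR_LEN;
}

/*
 * Hypothetical helper: return a pointer that a crypto engine requiring
 * 4-byte alignment could consume. An already aligned payload is returned
 * as-is; otherwise it is copied into the caller's aligned bounce buffer,
 * which is exactly the extra copy a zero-copy stack is trying to avoid.
 * Returns NULL if the bounce buffer is too small or itself misaligned.
 */
static const uint8_t *payload_for_engine(const uint8_t *payload, size_t len,
                                         uint8_t *bounce, size_t bounce_len,
                                         int *copied)
{
    *copied = 0;
    if (is_aligned(payload, 4))
        return payload;
    if (len > bounce_len || !is_aligned(bounce, 4))
        return NULL;
    memcpy(bounce, payload, len);
    *copied = 1;
    return bounce;
}
```

With a record that starts on a 4-byte boundary, `tls_strip_header` yields a pointer at offset 5, so `payload_for_engine` always takes the copy path in this sketch.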
On Nov 17, 2015, at 4:09 PM, Eric Rescorla <[email protected]> wrote:

Can you expand on the alignment issues some more? Thinking through how typical software stacks work, you first read the record header into some buffer to figure out how long the record is and then you read the record payload into a buffer, which may not even be the same buffer. For instance, in NSS, there are two buffers, one for the header and one for the body:

https://dxr.mozilla.org/mozilla-central/source/security/nss/lib/ssl/ssl3gthr.c#53
https://dxr.mozilla.org/mozilla-central/source/security/nss/lib/ssl/sslimpl.h#377

In a system like this, there's nothing stopping the body from being 4-byte aligned, whatever the alignment of the header is. I'd be interested in hearing about the design you have in mind.

-Ekr

On Tue, Nov 17, 2015 at 12:59 PM, Short, Todd <[email protected]> wrote:

I would say that 32 bits would be optimal, since that is the typical word size of processors that need alignment. 2 bytes isn’t much better than 5 bytes in this regard.

--
-Todd Short
// [email protected]
// "One if by land, two if by sea, three if by the Internet."
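The comparison of 5-byte, 2-byte and 32-bit headers above comes down to simple modular arithmetic. The C sketch below computes how far the payload lands from the next word boundary for a given header length, assuming the header itself starts on a word boundary. The function name `pad_to_word` is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes of padding that would be needed after a header of hdr_len bytes
 * so that the data following it starts on a `word`-byte boundary,
 * assuming the header itself begins on such a boundary.
 * A result of 0 means the payload is already aligned.
 */
static size_t pad_to_word(size_t hdr_len, size_t word)
{
    assert(word != 0);
    return (word - hdr_len % word) % word;
}
```

For a 4-byte word, `pad_to_word(5, 4)` is 3 and `pad_to_word(2, 4)` is 2, while `pad_to_word(4, 4)` is 0: only the 4-byte header leaves the payload aligned.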
On Nov 17, 2015, at 3:45 PM, Daniel Kahn Gillmor <[email protected]> wrote:

On Tue 2015-11-17 12:09:30 -0500, Eric Rescorla wrote:

The concern here is backward compatibility with inspection middleboxes which expect the length field to be in a particular place. We agreed in Seattle to wait for early deployment experience before modifying the header to move the length.

In particular, if we're going to make a change to the TLS record header, the change would be to remove the version and the type entirely, leaving only two octets of length on each record. Is a two-octet offset going to be problematic?

--dkg
_______________________________________________
TLS mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/tls
