Hi Boris,

> (1)

In my experience, DTLS packets mostly contain a single application data
record, not more. So that limitation should not cause too much pain.

> The purpose of this format is to be used in closed networks, such as
> high performance computing clusters

> (2)

If your network is closed, I guess you don't need to consider NATs and
so can just use the old RFC 6347 record without CID.

If you want to use CID, then the situation splits into receiving and
sending. For received records, your own peer defines the CID length.
If your peer uses a fixed CID length, then the hw doesn't need to deal
with a variable length when receiving.
For sending, the CID length may be an additional parameter for the
encryption function. That may save the additional lookup, at least at
the hw-encryption level.

> (3)

Even if padding is possible, it's rarely used in practice. In my
experience, it's more for very small messages, which would otherwise be
too easy to detect, e.g. a CoAP ACK with 4 bytes. So my forecast is that
the performance will not be affected too much by padding.

best regards
Achim



On 28.08.23 at 12:49, Boris Pismenny wrote:
Hello,

I work for NVIDIA on accelerating DTLS (and QUIC) encryption in
hardware. We find that the following DTLS record aspects make
acceleration less efficient:

(1) multiple encrypted records per-packet;

(2) variable length headers; and

(3) variable length padding.

To make the protocol more hardware friendly, I would like to propose a
negotiable record format that will improve acceleration efficiency. The
purpose of this format is to be used in closed networks, such as high
performance computing clusters, and not necessarily for the Internet, so
it should be disabled by default.

Next, I explain in more detail the problem with each protocol aspect:

(1) When there are multiple encrypted records in a single packet,
hardware must perform multiple encryption or decryption operations to
process the packet. This is particularly challenging for match-action
pipeline designs (such as P4) that are otherwise very suitable for
packet-based encryption. Multi-record packets are rare in the data
exchange phase, which is when hardware is involved, and it would greatly
simplify hardware to avoid checking for the multi-record case.
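
To illustrate the multi-record case, here is a minimal Python sketch of
the loop a receiver has to run over one datagram. It assumes the 13-byte
DTLS 1.2 record header from RFC 6347 without CID; the names
split_records and make_record are hypothetical helpers made up for this
example, not part of any DTLS library.

```python
import struct

# DTLS 1.2 record header (RFC 6347), assumed here without CID:
# type(1) + version(2) + epoch(2) + sequence_number(6) + length(2)
DTLS12_HEADER_LEN = 13
LENGTH_OFFSET = 11


def split_records(datagram: bytes) -> list:
    """Split one datagram into its records (header plus fragment each)."""
    records = []
    offset = 0
    while offset < len(datagram):
        if len(datagram) - offset < DTLS12_HEADER_LEN:
            raise ValueError("truncated record header")
        # The 16-bit length field tells where the next record starts.
        (length,) = struct.unpack_from("!H", datagram, offset + LENGTH_OFFSET)
        end = offset + DTLS12_HEADER_LEN + length
        if end > len(datagram):
            raise ValueError("record length exceeds datagram")
        records.append(datagram[offset:end])
        offset = end
    return records


def make_record(content_type: int, epoch: int, seq: int, fragment: bytes) -> bytes:
    """Build a record for illustration only (0xFEFD is the DTLS 1.2 version)."""
    return (
        struct.pack("!BHH", content_type, 0xFEFD, epoch)
        + seq.to_bytes(6, "big")
        + struct.pack("!H", len(fragment))
        + fragment
    )
```

With one record per packet, the loop body runs exactly once and the
record boundary coincides with the datagram boundary, which is the case
the proposal wants hardware to be able to rely on.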

(2) Variability in the packet length and connection ID fields of DTLS 1.3
requires hardware to support a number of possible formats for each
protected connection. The additional match operations needed to identify
the length of variable fields are unnecessarily costly.
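
As a rough illustration of the variability, the following Python sketch
computes the length of the DTLS 1.3 unified header from the flag bits in
its first byte, following the layout in RFC 9147 (0 0 1 C S L E E). The
function name unified_header_len is hypothetical, and the CID length is
passed in because it is not carried on the wire and has to come from
per-connection state.

```python
def unified_header_len(first_byte: int, cid_len: int) -> int:
    """Return the length in bytes of a DTLS 1.3 unified header.

    cid_len is the connection ID length negotiated for this connection;
    it is only counted when the C bit is set.
    """
    if first_byte & 0xE0 != 0x20:
        raise ValueError("not a DTLS 1.3 unified header")
    length = 1                      # the flags byte itself
    if first_byte & 0x10:           # C bit: connection ID present
        length += cid_len
    if first_byte & 0x08:           # S bit: 16-bit sequence number
        length += 2
    else:                           # otherwise 8-bit sequence number
        length += 1
    if first_byte & 0x04:           # L bit: 16-bit length field present
        length += 2
    return length
```

For example, unified_header_len(0x2C, 0) gives 5 (no CID, 16-bit
sequence number, length present), while unified_header_len(0x20, 0)
gives 2. A fixed format would reduce this to a per-connection constant.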

(3) Variable padding makes it hard to efficiently identify the real
content type in the trailer, and the trailer's length in general. Since
there is no explicit padding length field, hardware needs to
sequentially go through the padding bytes at the trailer, which
increases latency.
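
The sequential scan can be sketched as follows in Python. The sketch
assumes the decrypted inner plaintext layout of DTLS 1.3 (content, then
a one-byte non-zero content type, then zero or more zero bytes of
padding); strip_padding is a hypothetical name chosen for this example.

```python
def strip_padding(inner_plaintext: bytes):
    """Return (content, content_type, padding_len) of a decrypted record.

    The padding length is not signalled anywhere, so the only way to
    find the content type is to walk backwards over the zero bytes.
    """
    i = len(inner_plaintext) - 1
    while i >= 0 and inner_plaintext[i] == 0:
        i -= 1                      # one step per padding byte
    if i < 0:
        raise ValueError("no content type found")
    content = inner_plaintext[:i]
    content_type = inner_plaintext[i]
    padding_len = len(inner_plaintext) - 1 - i
    return content, content_type, padding_len
```

The number of loop iterations grows with the amount of padding; without
padding, the content type is always the last byte and no scan is needed.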

All of the above are desirable features in many cases, but in
high-performance computing environments they add unnecessary flexibility
at the cost of performance. Hence, I'd like to gather feedback on a
proposal for a simplified negotiable packet format for such
environments. The proposed negotiated format will:

(1) limit the protocol to one record per-packet after the initial
handshake (epoch>=3);

(2) fix header field lengths; and

(3) eliminate packet padding.

Best,

Boris


_______________________________________________
TLS mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/tls

