It's great to see the good discussion happening here.  I'll note a couple
of things inline, and also that, given the discussion of the tradeoffs that
go into making these decisions, Tom is probably right that shoving this
into an I-D would be helpful.

On Wed, Aug 15, 2018 at 05:35:28PM +0000, Kent Watsen wrote:
> 
> Below is an updated version of some text that we might roll into
> a statement or an I-D of some sort.  Kindly review and provide 
> suggestions for improvement, or support for the text as is, if
> that is the case.  ;)
> 
> This update accommodates comments from:
>   - Wesley Eddy & David Black
>      - removed "layers of functionality" verbiage
>      - moved footnote into the body of the document (this had
>        a cascading effect, and why it looks so different now)
>   - Joe Touch
>      - keepalives should occur at *all* layers that benefit
>      - keepalives at a layer should be suppressed in the 
>        presence of sufficient traffic from higher layers
>      - keepalives at a layer should not be interpreted as
>        implying state at any other layer
> 
> This update does not accommodate comments from:
>   - Michael Abrahamsson & Tom Herbert
>      - no statement added to promote TCP keepalives
>         * note: I believe this to be unnecessary because 
>           the current text doesn't ever say to not use TCP.
>      - no statement added for tuning params (e.g., timeouts).
>         * note: we could add this, but it will increase the
>           scope of the document - do we want to do this?
> 
> Cheers!
> Kent
> 
> 
> ===== START =====
> 
> # Connection Strategies for Long-lived Connections
> 
> A networked device may have an ongoing need to interact with a remote
> device. Sometimes the need arises from wanting to push data to the
> remote device, and sometimes the need arises from wanting to check if
> there is any data the remote device may have pending to deliver to
> it.
> 
> There are two fundamental network connection strategies that can be
> used to accomplish this goal: 1) a single long-lived connection and
> 2) a sequence of short-lived connections.
> 
> A single long-lived connection is most common, as it is
> straightforward to implement and directly answers the question of 
> if the "connection" is established. However, long-lived connections
> require more system resources, which may affect scalability, and
> require the initiator of the connection to periodically test the
> aliveness of the remote device, discussed further in the next 
> section.
> 
> A sequence of short-lived connections is less common, as there is an
> additional implementation effort, as well as concerns such as: 1) the
> delay of the remote device needing to wait until the connection is
> reestablished in order to deliver pending data, and 2) the additional
> latency incurred from starting new connections, especially when
> cryptology is involved. However, short-lived connections do not

(nit: "cryptography" is probably better than "cryptology")

> require keepalives and are arguably more secure, as each device is
> forced to re-authenticate the other and reload all related
> access-control policies on each connection.
> 
> For networking sessions that are primarily quiet, and the use case
> can cope with the additional latency of waiting for and starting new
> connections, it is RECOMMENDED to use a sequence of short-lived
> connections, instead of maintaining a single long-lived connection
> using aliveness checks.
> 
> 
> # Keepalives for Persistent Connections
> 
> When the initiator of a networking session needs to maintain a
> long-lived connection, it is necessary for it to periodically test
> the aliveness of the remote device. In such cases, it is RECOMMENDED
> that the aliveness check happens at the highest protocol layer
> possible that is meaningful to the application, in order to maximize
> the depth of the aliveness check.
> 
> For example, for an HTTPS connection to a simple webserver,
> HTTP-level keepalives would test more layers of functionality than
> TLS-level keepalives. However, for a webserver that is accessed via a
> load-balancer that terminates TLS connections, TLS-level aliveness
> checks may be the most meaningful check that can be performed.
> 
> More generally, it is RECOMMENDED that applications be able to
> perform the aliveness checks at all protocol levels that benefit, but
> suppress the aliveness checks at lower protocol layers from occurring
> when there is sufficient activity at higher protocol layers.
> Keepalives at a layer SHOULD NOT be interpreted as implying state at
> any other layer.

What's going on in the last sentence here is probably a bit subtle -- a
keepalive does not indicate "real" protocol activity, but it does exercise
the lower protocol layers (and, per the previous sentence, even suppresses
their keepalives).  That said, I'm not sure it's correct to change this to
just "at higher layers", since (as was pointed out downthread) different
layers may have different requirements on what to track, and do not
necessarily have good interfaces to communicate with each other.
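
To make the asymmetry concrete, here is a rough Python sketch of the
suppression rule as I read it.  The class names, the idle-interval
parameter, and the idea of a single object that can see every layer are
all assumptions made for illustration; as noted above, real stacks often
have no such cross-layer interface.  Traffic sent at a layer resets the
idle timer of that layer and of every layer beneath it, but never of a
layer above it.

```python
import time


class LayerKeepalive:
    """Idle timer for one protocol layer (hypothetical helper)."""

    def __init__(self, name, idle_interval, clock=time.monotonic):
        self.name = name
        self.idle_interval = idle_interval  # seconds of silence tolerated
        self._clock = clock
        self._last_activity = clock()

    def note_activity(self):
        # Record that traffic was just seen at this layer.
        self._last_activity = self._clock()

    def keepalive_due(self):
        # True once the layer has been idle for at least idle_interval.
        return self._clock() - self._last_activity >= self.idle_interval


class Stack:
    """Layers ordered lowest first, e.g. [tcp, tls, app] (assumed layout)."""

    def __init__(self, layers):
        self.layers = layers

    def sent_at(self, index):
        # Anything sent at layers[index] also traverses every layer below
        # it, so those timers are reset too.  Layers above are untouched:
        # a keepalive says nothing about state at a higher layer.
        for layer in self.layers[: index + 1]:
            layer.note_activity()

    def due(self):
        # Names of the layers whose keepalive should fire now.
        return [layer.name for layer in self.layers if layer.keepalive_due()]
```

With this shape, an application-level keepalive quiets the TLS and TCP
timers, while a TCP keepalive leaves the TLS and application timers
running, which is the one-way relationship the draft text is trying to
capture.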

-Benjamin

> In order to ensure aliveness checks can occur at any given protocol
> layer, it is RECOMMENDED that protocol designers always include an
> aliveness check mechanism in the protocol and, for client/server
> protocols, that the aliveness check can be initiated from either
> device, as sometimes the "server" is the initiator of the underlying
> networking connection (e.g., RFC 8071).
> 
> Some protocol stacks have a secure transport protocol layer (e.g.,
> TLS, SSH, DTLS) that sits on top of a cleartext protocol layer (e.g.,
> TCP, UDP). In such cases, it is RECOMMENDED that the aliveness check
> occurs within the protection envelope afforded by the secure transport
> protocol layer; the aliveness checks SHOULD NOT occur via the
> underlying cleartext protocol layer, as an adversary can block
> aliveness check messages in either direction and send fake aliveness
> check messages in either direction.
> 
> 
