Folks- I am appending what I sent Martin & Gorry this morning. I looked quite quickly as Martin was looking for quick input. I am happy to iterate if things aren't all that understandable.
allman
[Quick hit.]
I agree with Martin's DISCUSS and Gorry's notes. A couple more
things here ...
> This spec looks a mess!
A generous reading is that this is most of the problem. I think
maybe there is some intent in here that isn't stated very well. It
needs to be explicit and not as sloppy.
A few specific things (in addition to what Gorry said, which I
absolutely agree with):
- "Though timer values are the choice of the implementation,
mishandling of the timer can lead to serious congestion
problems"
+ Gorry flagged this and I am flagging it again. If this is
something that can lead to serious problems, let's not just
leave it to "choice of the implementation". Especially if we
have some idea how to make it less problematic.
- "Implementations SHOULD use an initial timer value of 100 msec
(the minimum defined in RFC 6298 [RFC6298])"
+ I wrote RFC 6298 and I have no idea where this is coming from!
+ Even if this value of 100msec is OK for DTLS it shouldn't lean
on RFC 6298 because RFC 6298 doesn't say that is OK. I.e.,
the parenthetical is objectively wrong.
+ RFC 6298 says the INITIAL RTO should be 1sec (point (2.1) in
section 2). RFC 8961 affirms this and also says the INITIAL
RTO should be 1sec (requirement (1) in section 4).
- "Note that a 100 msec timer is recommended rather than the
3-second RFC 6298 default in order to improve latency for
time-sensitive applications."
+ Again, this mis-states RFC 6298, which says the initial RTO is
1sec (not 3sec). (Previous to RFC 6298 the initial RTO was
3sec, which is probably where the notion comes from. Most of
the purpose of RFC 6298 was to drop the initial RTO to 1sec.)
+ This is a statement of desire, not any sort of principled
justification for using 100msec. At the least this should be
much better argued.
+ To me 100msec feels much too close to the RTT of some network
paths to be appropriate here. To be clear, deviations from
RFC 8961 that gather consensus are fine, but you should say
why that deviation is OK. And I'd think the further you
deviate, the more you need to say. I.e., dropping
from 1sec to 900msec may not be that big of an issue. But
dropping to 1/10th of the guideline, and to something pretty
close to RTTs that are not at all rare, should require some
care and some discussion, IMO.
+ And, I am not trying to be a picky protocol lawyer and say
this document "didn't check the RFC 8085 / RFC 8961 box".
Rather, RFC 8085 & 8961 say things for a reason and I don't
think we should implicitly ignore them because they come from
experience on how to do these sorts of things.
- "The retransmit timer expires: the implementation transitions to
the SENDING state, where it retransmits the flight, resets the
retransmit timer, and returns to the WAITING state."
+ Maybe this is spec sloppiness, but boy does it sound like the
recipe TCP used before VJCC to collapse the network. I.e.,
expire and retransmit the window. Rinse and repeat. It may
be that the intention is for backoff to be involved. But that
isn't what it says.
- “When they have received part of a flight and do not immediately
receive the rest of the flight (which may be in the same UDP
datagram). A reasonable approach here is to set a timer for 1/4 the
current retransmit timer value when the first record in the flight
is received and then send an ACK when that timer expires.”
+ Where does 1/4 come from? Why is it "reasonable"? This just
feels like a complete WAG that was pulled out of the air.
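For reference, the RFC 6298 timer rules discussed above can be sketched roughly as follows. This is only an illustration, not code from the draft or from any DTLS stack. The class name `RtoEstimator`, its method names, and the default clock granularity are all made up for the example. The constants (1sec initial RTO, 1sec minimum, alpha = 1/8, beta = 1/4, K = 4, doubling on timeout, and a cap of at least 60sec) are the ones given in RFC 6298.

```python
class RtoEstimator:
    """Hypothetical sketch of the RFC 6298 retransmission timer rules."""

    ALPHA = 1 / 8  # SRTT gain (RFC 6298, rule 2.3)
    BETA = 1 / 4   # RTTVAR gain (RFC 6298, rule 2.3)
    K = 4          # variance multiplier

    def __init__(self, granularity=0.001, min_rto=1.0, max_rto=60.0):
        # granularity is the assumed clock granularity G, in seconds.
        self.granularity = granularity
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.srtt = None
        self.rttvar = None
        # Rule 2.1: before any RTT sample exists, the RTO is 1 second.
        self.rto = 1.0

    def on_rtt_sample(self, r):
        """Fold in one RTT measurement r (seconds) and return the new RTO."""
        if self.srtt is None:
            # Rule 2.2: first measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # Rule 2.3: RTTVAR is updated before SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        # Rule 2.4 (round up to the minimum) and rule 2.5 (optional cap).
        self.rto = min(max(rto, self.min_rto), self.max_rto)
        return self.rto

    def on_timeout(self):
        """Rule 5.5: back off the timer by doubling the RTO, up to the cap."""
        self.rto = min(self.rto * 2, self.max_rto)
        return self.rto
```

The point of the sketch is the contrast with the quoted state machine text: here every expiry doubles the timer before the next retransmission, and the timer starts at 1sec rather than 100msec. Whether DTLS should use exactly these constants is a separate question, but any deviation would then be visible as a changed parameter that needs its own justification.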
And, +1 on all the flight size stuff Martin mentioned.
allman
_______________________________________________ TLS mailing list [email protected] https://www.ietf.org/mailman/listinfo/tls
