> On 9 Nov 2023, at 19:14, Vladimir Homutov <v...@inspert.ru> wrote: > >> On Thu, Oct 26, 2023 at 03:08:55AM +0400, Sergey Kandaurov wrote: >>> # HG changeset patch >>> # User Vladimir Khomutov <v...@wbsrv.ru> >>> # Date 1697031803 -10800 >>> # Wed Oct 11 16:43:23 2023 +0300 >>> # Node ID 9ba2840e88f62343b3bd794e43900781dab43686 >>> # Parent 1f188102fbd944df797e8710f70cccee76164add >>> QUIC: fixed handling of PTO counter. >>> >>> The RFC 9002 clearly says in "6.2. Probe Timeout": >>> ... >>> As with loss detection, the PTO is per packet number space. >>> That is, a PTO value is computed per packet number space. >>> >>> Despite that, current code is using per-connection PTO counter. >>> For example, this may lead to situation when packet loss at handshake >>> level will affect PTO calculation for initial packets, preventing >>> send of new probes. >> >> Although PTO value is per packet number space, PTO backoff is not, >> see "6.2.1 Computing PTO": >> >> : When ack-eliciting packets in multiple packet number spaces are in flight, >> the >> : timer MUST be set to the earlier value of the Initial and Handshake packet >> : number spaces. > > And I read this fragment as: > - there are multiple timer values (i.e. per packet number space) > (and value is pto * backoff) > - we have to choose the earliest value > > The ngx_quic_pto() has nothing that depends on packet number space > (with the minor exception that we add max_ack_delay at application level > after the handshake) > So pto_count is the only thing that can make timer values to be > different in different packet number spaces.
While ngx_quic_pto() doesn't depend on space (except max_ack_delay), it is currently implemented such way that it returns just duration, and it is the caller's obligation to apply it to the time of the last ack-eliciting packet (earliest among spaces) to get pto timeout. This is done in ngx_quic_set_lost_timer() and ngx_quic_pto_handler(), and seemingly missed for path validation in ngx_quic_set_path_timer() and for 3*PTO connection tear down in ngx_quic_close_connection(). Probably ngx_quic_pto() should be refactored to include this "duration => pto_timeout" calculation. This won't even break its signature since it's already passed ctx, which seems to be enough to get the earliest ack-eliciting packet among spaces. > >> >> But: >> >> : When a PTO timer expires, the PTO backoff MUST be increased <..> >> >> : This exponential reduction in the sender's rate is important because >> consecutive >> : PTOs might be caused by loss of packets or acknowledgments due to severe >> : congestion. Even when there are ack-eliciting packets in flight in >> multiple >> : packet number spaces, the exponential increase in PTO occurs across all >> spaces >> : to prevent excess load on the network. For example, a timeout in the >> Initial >> : packet number space doubles the length of the timeout in the Handshake >> packet >> : number space. > > yes, this really looks like contradiction. > At least I don't understand how it is possible to have PTO value > different by packet number space given the way we calculate it. > >> Even if that would be proven otherwise, I don't think the description >> provides detailed explanation. It describes a pretty specific use case, >> when both Initial and Handshake packet number spaces have in-flight packets >> with different PTO timeout (i.e. different f->last). Typically they are >> sent coalesced (e.g. CRYPTO frames for ServerHello and (at least) >> EncryptedExtensions TLS messages). >> In interop tests, though, it might be different: such packets may be >> sent separately, with Handshake packet thus having a later PTO timeout. >> If such, PTO timer will first fire for the Initial packet, then for >> Handshake, >> which will result in PTO backoff accumulated for each packet: >> >> t1: <- Initial (lost) >> t2: <- Handshake (lost) >> t1': pto(t1) timeout >> <- Initial (pto_count=1) >> t2': pto(t2) timeout >> <- Handshake (pto_count=2) >> t1'': pto(t1') timeout >> <- Initial (pto_count=3) >> >> So, I would supplement the description with the phrase that that's >> fair typically with uncoalesced packets seen in interop tests, and >> that the same is true vice verse with packet loss at initial packet >> number space affecting PTO backoff in handshake packet number space. >> >> But see above about PTO backoff increase across all spaces. > > I tend to think that it is better to leave things as is. > maybe RFC needs some better wording in this case. > > I've checked ngtcp2 and msquic and it it looks like both > handle pto counter per-connection too; > (see pto_count in ngtcp2 and QUIC_LOSS_DETECTION.ProbeCount in msquic) > Ok, thanks for deep looking into this. >>> >> >> This part of the change to reset pto_count >> for duplicate ACK or ACK for non-ack-eliciting frame >> contradicts the OnAckReceived example in RFC 9002, >> although I didn't found a format text in the RFC itself: >> >> OnAckReceived(ack, pn_space): >> ... >> // DetectAndRemoveAckedPackets finds packets that are newly >> // acknowledged and removes them from sent_packets. >> newly_acked_packets = >> DetectAndRemoveAckedPackets(ack, pn_space) >> // Nothing to do if there are no newly acked packets. >> if (newly_acked_packets.empty()): >> return >> >> // Update the RTT if the largest acknowledged is newly acked >> // and at least one ack-eliciting was newly acked. >> ... >> >> // Reset pto_count ... >> >> From which it follows that pto_count is reset >> (and RTT updated) for newly ack'ed packets only. > > yes, agreed. > the main issue is tracking of in-flight PING frames. >> >> I think the better fix would be to properly track in-flight PING frames. >> Moreover, the current behaviour of not tracking PING frames in ctx->sent >> prevents from a properly calculated PTO timeout: each time it is calculated >> against the original packet (with increasingly receding time to the past) >> that triggered the first PTO timeout, which doesn't result in exponentially >> increased PTO period as expected, but rather some bogus value. > > Agree. I've created a patch that fixes this. > (Will reply separately.) -- Sergey Kandaurov _______________________________________________ nginx-devel mailing list nginx-devel@nginx.org https://mailman.nginx.org/mailman/listinfo/nginx-devel