Comments on QLOG drafts

Hugo Landau Thu, 18 Jan 2024 10:11:46 -0800

Hi,

I've completed a review of the QLOG drafts and have some comments. Hope these
are helpful.


Comments on draft-ietf-quic-qlog-main-schema.md
===============================================

    ## Custom fields\
 
Trailing backslash in this section title

Comments on draft-ietf-quic-qlog-quic-events.md
===============================================

connectivity:connection_closed:

  This is missing the frame type field in the QUIC CONNECTION_CLOSE
  frame which is rather useful information.

  connection_code can be a string or a uint32, but QUIC uses a variable
  length integer here and thus this can be up to 62 bits. Same for the
  application_code.

  It is up to the decoder to infer by presence of connection_code or
  application_code to infer whether this was an application or transport
  CONNECTION_CLOSE. But this requires it to log such a value. It would
  be nice if this can be communicated without necessarily knowing the
  actual value. Consider either adding a field to distinguish between
  transport and app errors or replacing the "error" field of "trigger"
  with "app_error" and "transport_error".

connectivity:connection_id_updated:
 
  It would be nice to be able to optionally log CID-associated sequence
  numbers and the type of CID source (ODCID, Initial CID,
  preferred_address transport parameter, NCID frame, etc.)

connectivity:spin_bit_updated:
 
  This is very QUIC-specific, so I question the value of it being defined
  as a generic connectivity event.

connectivity:mtu_updated:
 
  I suppose it is a bit of an asinine remark, but IPv6 jumbograms mean
  theoretically you could do things with datagrams greater than 64k.
  uint32?

quic:version_information:
   
  Maybe clarify that if the client receives a version negotiation packet
  with no viable overlap, it logs a version_information event with no
  chosen_version?

quic:alpn_information:
 
  RFC 7301 states that ALPN protocol strings are byte strings, not text.
  How should ALPN protocol strings which are not valid UTF-8 or not
  valid text be logged?

quic:parameters_set:
 
  As things stand, there is no facility for logging the reception of unknown
  transport parameters by an endpoint. Since transport parameters are used to
  negotiate extensions to QUIC, having logged data about the presence of
  transport parameters sent by a peer but not understood, and the consequential
  ability to get information about the prevalence of any given unsupported
  transport parameter in incoming traffic, seems like it would be quite useful.

  I think this event needs more clarify in terms of the distinction between
  attempting to negotiate a feature and it being enabled. It should be possible
  to log the feature set a local or remote endpoint is requesting and it
  should be possible to log the actually negotiated feature set. Possibly
  this can be resolved via simple clarification. If a client connects to
  a server, does it start by emitting parameters_set (owner=local) with (e.g.)
  max_datagram_frame_size set, and then when the server responds without
  it, emit parameters_set (owner=remote) without it? I guess this allows
  the necessary information to be inferred.

  If the absence of a field indicates it was not negotiated after receiving
  transport parameters from a server, this creates a problem if this event has
  to be used to log other parameters determined at different times (which seems
  plausible for `tls_cipher`, `early_data_enabled`, etc.), as then logging
  those parameters could be misinterpreted as a sign that the transport
  parameters have been received and a feature has not been negotiated.
  Either the cause of a parameters_set event should be clarified (maybe a 
trigger
  field) or possibly this should be split into different event types.

  I think it would be valuable to have the ability to log information
  such as spin bit policy (e.g. enabled, disabled due to random chance,
  disabled administratively), independent of the actual spin_bit_updated
  event).

quic:packet_sent:

  The meaning of `is_coalesced` should be clarified. I assume this is
  set to true unless a packet is both the first and last packet in a
  datagram. However that is not necessarily obvious when a packet is
  being generated; another option is to define this as being true for
  every packet but the first packet in a datagram.

quic:packet_received:
 
  See comment on is_coalesced above

quic:packet_dropped:
 
  What is the intended trigger category here when an EL is discarded and
  buffered packets are discarded as a consequence? `decryption_failure`
  could match but feels subtly different, as I guess that covers the
  case where a packet is received after an EL is discarded. IMO it would
  be better to have a different trigger category for this, and add
  language clarifying this case.

quic:packet_acked:

  `PacketNumberSpace.application_data`, as this is by far the most prevalent 
packet
  number space a typical QUIC connection will use.
 
  Editorial note: The usage of the . notation
  `PacketNumberSpace.application_data` seems slightly novel here. Just
  write "application_data"?

quic:datagrams_sent:

  It is a bit minor, but the optional logging of the non-ECN bits in the ToS 
byte
  (as a separate field) might be useful. Implementations which don't do anything
  special here would omit it, and specifying a field would indicate that the
  implementation has requested a specific ToS from the OS.

  I also think it would be potentially valuable if an implementation could
  note if it has requested 'Don't Fragment' treatment from an OS, but since
  an implementation can be expected to be consistent about this, this doesn't
  seem like something which should be logged on a per-datagram basis, perhaps
  elsewhere.

quic:datagrams_received:

  Comments on ToS above also apply here

quic:stream_state_updated:

  I think this needs to be clarified a little that an event logged here with 
both
  a `stream_id` field and a `stream_side` field is expressing a state for that
  component of the stream and from the perspective of the specified vantage 
point
  (also, what about the network vantage point?); whereas an event logged here
  with a `stream_id` field but no `stream_side` field is expressing a state for
  both stream components as a whole.

  If my interpretation here is correct, I think it would also help to note that
  the valid values of StreamState depend on the value *and* presence of
  stream_side.

quic:frames_processed:

    The `frame_processed` event can be used to signal internal state change not
    resulting directly from the actual `parsing` of a frame (e.g., the frame
    could have been parsed, data put into a buffer, then later processed, then
    logged with this event).
 
  Editorial note: use of backticks around parsing seems off here.

     The `packet_received` event can convey all constituent frames. It is not
     expected that the `frames_processed` event will also be used for a
     redundant purpose. Rather, implementations can use this event to avoid
     having to log full packets or to convey extra information about when frames
     are processed (for example, frame processing is deferred for any reason).
 
  Editorial note: add "if" before "frame processing is deferred for any reason"

quic:stream_data_moved:
 
  A brief explanation of each from/to value here and the intended meaning would
  be a big improvement. Also clarify that raw here refers to the referenced
  stream application data (and does not include frame headers etc.).

  Consider having a way to report when the end-of-stream (FIN) condition is
  reported to an application. One option here is to have an optional "fin" field
  here.

  There does not appear to be any way to communicate when a stream reset event 
is
  delivered to an application in an equivalent way, which would be useful.

quic:datagram_data_moved:
 
  As with stream_data_moved, clarify meaning of the from/to fields here.

security:key_updated:
 
  Clarify use of the generation field here, which is introducing a new concept
  not directly touched on by the QUIC RFCs, which just deal in the Key Phase 
bit.
  Provide examples for the first few generation numbers (for example, presumably
  the first 1-RTT EL key has a generation of 0). Also clarify how this interacts
  with the temporal behaviour of the key update process (since old keys are
  retained to handle reordered packets). Presumably once a new key exists, it is
  logged with a new generation number and the old key is eventually given a
  key_discarded event with the old generation number. But this should be
  clarified.

security:key_discarded:
 
  As above.

recovery:parameters_set:

    do, for some reason, change these parameters during execution, MAY emit the
    `parameters_set` event twice.
 
  Twice? Seems like this should say "more than once".
  Maximum datagram size could change repeatedly.

  A minor point, but I suppose it may be useful to have a field for a name of a
  congestion control algorithm here, defined as "newreno" / text (with "newreno"
  being the algorithm specified in RFC 9002).

recovery:congestion_state_updated:
 
  It is a little bit inconsistent here for ECN to be uppercase when the
  "ecn" field in other events is lowercase.

recovery:loss_timer_updated:
 
  Probably have a little bit of text clarifying the meaning of
  `timer_type` here.

AckFrame:
 
  It is worth noting that this doesn't allow a QLOG implementation to
  log whether a peer included ECN data in an ACK frame without actually
  including that data. i.e., a QLOG decoder has to infer presence of ECN
  from the presence of the ect1/ect0/ce fields. Perhaps this is too
  minor to address, but I thought I'd point it out.

ResetStreamFrame:

  error_code should be uint64 here.

StopSendingFrame:

  error_code should be uint64 here.

CryptoFrame:
 
  StreamFrame has an optional raw field here, so CryptoFrame should too.

StreamFrame:
 
  This format has ambiguity as to whether explicit length and/or offset
  fields were used. It is potentially useful to know this, or the
  numeric value of the stream type field.

  Being able to know the header length for overhead calculation would
  also be valuable IMO.

  It may be worth explicitly clarifying that it is not possible to use
  this object to log information about a stream frame without logging
  the value of the FIN bit, as the presence or absence of this bit is
  inferred from presence or absence of the FIN field.

ConnectionCloseFrame:

  Is there a reason error_code is limited to uint32 here when
  error_code_value is correctly specified as uint64? Understanding some
  of the rationale behind the encoding of error codes in this
  specification would be helpful.

Yours,
Hugo Landau

Comments on QLOG drafts

Reply via email to