Hi,
I've completed a review of the QLOG drafts and have some comments. Hope these
are helpful.
Comments on draft-ietf-quic-qlog-main-schema.md
===============================================
## Custom fields\
Trailing backslash in this section title
Comments on draft-ietf-quic-qlog-quic-events.md
===============================================
connectivity:connection_closed:
This is missing the frame type field in the QUIC CONNECTION_CLOSE
frame which is rather useful information.
connection_code can be a string or a uint32, but QUIC uses a variable
length integer here and thus this can be up to 62 bits. Same for the
application_code.
It is up to the decoder to infer by presence of connection_code or
application_code to infer whether this was an application or transport
CONNECTION_CLOSE. But this requires it to log such a value. It would
be nice if this can be communicated without necessarily knowing the
actual value. Consider either adding a field to distinguish between
transport and app errors or replacing the "error" field of "trigger"
with "app_error" and "transport_error".
connectivity:connection_id_updated:
It would be nice to be able to optionally log CID-associated sequence
numbers and the type of CID source (ODCID, Initial CID,
preferred_address transport parameter, NCID frame, etc.)
connectivity:spin_bit_updated:
This is very QUIC-specific, so I question the value of it being defined
as a generic connectivity event.
connectivity:mtu_updated:
I suppose it is a bit of an asinine remark, but IPv6 jumbograms mean
theoretically you could do things with datagrams greater than 64k.
uint32?
quic:version_information:
Maybe clarify that if the client receives a version negotiation packet
with no viable overlap, it logs a version_information event with no
chosen_version?
quic:alpn_information:
RFC 7301 states that ALPN protocol strings are byte strings, not text.
How should ALPN protocol strings which are not valid UTF-8 or not
valid text be logged?
quic:parameters_set:
As things stand, there is no facility for logging the reception of unknown
transport parameters by an endpoint. Since transport parameters are used to
negotiate extensions to QUIC, having logged data about the presence of
transport parameters sent by a peer but not understood, and the consequential
ability to get information about the prevalence of any given unsupported
transport parameter in incoming traffic, seems like it would be quite useful.
I think this event needs more clarify in terms of the distinction between
attempting to negotiate a feature and it being enabled. It should be possible
to log the feature set a local or remote endpoint is requesting and it
should be possible to log the actually negotiated feature set. Possibly
this can be resolved via simple clarification. If a client connects to
a server, does it start by emitting parameters_set (owner=local) with (e.g.)
max_datagram_frame_size set, and then when the server responds without
it, emit parameters_set (owner=remote) without it? I guess this allows
the necessary information to be inferred.
If the absence of a field indicates it was not negotiated after receiving
transport parameters from a server, this creates a problem if this event has
to be used to log other parameters determined at different times (which seems
plausible for `tls_cipher`, `early_data_enabled`, etc.), as then logging
those parameters could be misinterpreted as a sign that the transport
parameters have been received and a feature has not been negotiated.
Either the cause of a parameters_set event should be clarified (maybe a
trigger
field) or possibly this should be split into different event types.
I think it would be valuable to have the ability to log information
such as spin bit policy (e.g. enabled, disabled due to random chance,
disabled administratively), independent of the actual spin_bit_updated
event).
quic:packet_sent:
The meaning of `is_coalesced` should be clarified. I assume this is
set to true unless a packet is both the first and last packet in a
datagram. However that is not necessarily obvious when a packet is
being generated; another option is to define this as being true for
every packet but the first packet in a datagram.
quic:packet_received:
See comment on is_coalesced above
quic:packet_dropped:
What is the intended trigger category here when an EL is discarded and
buffered packets are discarded as a consequence? `decryption_failure`
could match but feels subtly different, as I guess that covers the
case where a packet is received after an EL is discarded. IMO it would
be better to have a different trigger category for this, and add
language clarifying this case.
quic:packet_acked:
`PacketNumberSpace.application_data`, as this is by far the most prevalent
packet
number space a typical QUIC connection will use.
Editorial note: The usage of the . notation
`PacketNumberSpace.application_data` seems slightly novel here. Just
write "application_data"?
quic:datagrams_sent:
It is a bit minor, but the optional logging of the non-ECN bits in the ToS
byte
(as a separate field) might be useful. Implementations which don't do anything
special here would omit it, and specifying a field would indicate that the
implementation has requested a specific ToS from the OS.
I also think it would be potentially valuable if an implementation could
note if it has requested 'Don't Fragment' treatment from an OS, but since
an implementation can be expected to be consistent about this, this doesn't
seem like something which should be logged on a per-datagram basis, perhaps
elsewhere.
quic:datagrams_received:
Comments on ToS above also apply here
quic:stream_state_updated:
I think this needs to be clarified a little that an event logged here with
both
a `stream_id` field and a `stream_side` field is expressing a state for that
component of the stream and from the perspective of the specified vantage
point
(also, what about the network vantage point?); whereas an event logged here
with a `stream_id` field but no `stream_side` field is expressing a state for
both stream components as a whole.
If my interpretation here is correct, I think it would also help to note that
the valid values of StreamState depend on the value *and* presence of
stream_side.
quic:frames_processed:
The `frame_processed` event can be used to signal internal state change not
resulting directly from the actual `parsing` of a frame (e.g., the frame
could have been parsed, data put into a buffer, then later processed, then
logged with this event).
Editorial note: use of backticks around parsing seems off here.
The `packet_received` event can convey all constituent frames. It is not
expected that the `frames_processed` event will also be used for a
redundant purpose. Rather, implementations can use this event to avoid
having to log full packets or to convey extra information about when frames
are processed (for example, frame processing is deferred for any reason).
Editorial note: add "if" before "frame processing is deferred for any reason"
quic:stream_data_moved:
A brief explanation of each from/to value here and the intended meaning would
be a big improvement. Also clarify that raw here refers to the referenced
stream application data (and does not include frame headers etc.).
Consider having a way to report when the end-of-stream (FIN) condition is
reported to an application. One option here is to have an optional "fin" field
here.
There does not appear to be any way to communicate when a stream reset event
is
delivered to an application in an equivalent way, which would be useful.
quic:datagram_data_moved:
As with stream_data_moved, clarify meaning of the from/to fields here.
security:key_updated:
Clarify use of the generation field here, which is introducing a new concept
not directly touched on by the QUIC RFCs, which just deal in the Key Phase
bit.
Provide examples for the first few generation numbers (for example, presumably
the first 1-RTT EL key has a generation of 0). Also clarify how this interacts
with the temporal behaviour of the key update process (since old keys are
retained to handle reordered packets). Presumably once a new key exists, it is
logged with a new generation number and the old key is eventually given a
key_discarded event with the old generation number. But this should be
clarified.
security:key_discarded:
As above.
recovery:parameters_set:
do, for some reason, change these parameters during execution, MAY emit the
`parameters_set` event twice.
Twice? Seems like this should say "more than once".
Maximum datagram size could change repeatedly.
A minor point, but I suppose it may be useful to have a field for a name of a
congestion control algorithm here, defined as "newreno" / text (with "newreno"
being the algorithm specified in RFC 9002).
recovery:congestion_state_updated:
It is a little bit inconsistent here for ECN to be uppercase when the
"ecn" field in other events is lowercase.
recovery:loss_timer_updated:
Probably have a little bit of text clarifying the meaning of
`timer_type` here.
AckFrame:
It is worth noting that this doesn't allow a QLOG implementation to
log whether a peer included ECN data in an ACK frame without actually
including that data. i.e., a QLOG decoder has to infer presence of ECN
from the presence of the ect1/ect0/ce fields. Perhaps this is too
minor to address, but I thought I'd point it out.
ResetStreamFrame:
error_code should be uint64 here.
StopSendingFrame:
error_code should be uint64 here.
CryptoFrame:
StreamFrame has an optional raw field here, so CryptoFrame should too.
StreamFrame:
This format has ambiguity as to whether explicit length and/or offset
fields were used. It is potentially useful to know this, or the
numeric value of the stream type field.
Being able to know the header length for overhead calculation would
also be valuable IMO.
It may be worth explicitly clarifying that it is not possible to use
this object to log information about a stream frame without logging
the value of the FIN bit, as the presence or absence of this bit is
inferred from presence or absence of the FIN field.
ConnectionCloseFrame:
Is there a reason error_code is limited to uint32 here when
error_code_value is correctly specified as uint64? Understanding some
of the rationale behind the encoding of error codes in this
specification would be helpful.
Yours,
Hugo Landau