[Gen-art] Re: draft-ietf-cellar-tags-19 ietf last call Genart review

Spencer Dawkins at IETF Fri, 17 Oct 2025 13:54:38 -0700

Ines,

Thank you for the helpful review! Cellar is meeting tomorrow, and we'll go
over your feedback and Do The Right Thing.


Best,

Spencer

On Mon, Oct 13, 2025 at 10:07 AM Ines Robles via Datatracker <
[email protected]> wrote:

> Document: draft-ietf-cellar-tags
> Title: Matroska Media Container Tag Specifications
> Reviewer: Ines Robles
> Review result: Almost Ready
>
> I am the assigned Gen-ART reviewer for this draft. The General Area
> Review Team (Gen-ART) reviews all IETF documents being processed
> by the IESG for the IETF Chair.  Please treat these comments just
> like any other last call comments.
>
> For more information, please see the FAQ at
>
> <https://wiki.ietf.org/en/group/gen/GenArtFAQ>.
>
> Document: draft-ietf-cellar-tags-19
> Reviewer: Ines Robles
> Review Date: 2025-10-13
> IETF LC End Date: 2025-10-13
> IESG Telechat date: Not scheduled for a telechat
>
> Summary:
>
> This document defines the Matroska multimedia container tags, namely the
> tag names and their respective semantic meaning.
>
> I have a few comments and questions below that I would appreciate being
> addressed before publication.
>
> Comments:
>
> 1- Section 3.2.2 states "Multiple items MUST NOT be stored as a list in a
> single TagString. If there is more than one tag value with the same name to
> be stored, then more than one SimpleTag MUST be used."
>
> However, several tag definitions (for example, INSTRUMENTS in Section 4.4
> and KEYWORDS in Section 4.6) explicitly describe values as being “separated
> by a comma.” This wording suggests that multiple items may appear within a
> single TagString, which seems to contradict the rule in Section 3.2.2.
>
> Could you please clarify whether these tags are intended to be exceptions
> to that rule, or if the text should instead indicate that each value must
> be stored in a separate SimpleTag?
>
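
To make the second reading in comment 1 concrete, here is a minimal sketch,
assuming each value goes in its own SimpleTag. The SimpleTag dataclass and
the split_into_simple_tags helper are hypothetical names for illustration,
not part of the draft or of any Matroska library, and the naive comma split
assumes no item contains a comma itself.

```python
from dataclasses import dataclass


@dataclass
class SimpleTag:
    """Hypothetical stand-in for one SimpleTag (TagName plus TagString)."""
    name: str
    value: str


def split_into_simple_tags(name: str, raw: str) -> list[SimpleTag]:
    """Turn a comma-separated string into one SimpleTag per item.

    Assumption: items never contain commas, so a plain split is enough.
    Empty items (e.g. from a trailing comma) are dropped.
    """
    items = (item.strip() for item in raw.split(","))
    return [SimpleTag(name, item) for item in items if item]


tags = split_into_simple_tags("KEYWORDS", "live, acoustic, 2019")
```

Under this reading a writer never emits a list in a single TagString; whether
INSTRUMENTS and KEYWORDS are meant as exceptions is still the open question.
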
> 2- Section 3.3: In Table 2 (“TargetTypeValue for Video”), the draft lists
> MOVIE / EPISODE / CONCERT and describes them as “the most common grouping
> level of video (e.g., an episode for a TV series).” This correctly indicates
> that movie is intended as a representative example.
>
> However, in the document, several tag descriptions (e.g., DIRECTOR, ACTOR,
> LAW_RATING, etc.) refer specifically to “a movie.”
>
> For precision and inclusivity, these occurrences should be generalized,
> since the tagging system applies to any audiovisual work, including films,
> television episodes, animated content, image-based sequences, podcasts,
> concerts, or other recorded video content.
>
> It is therefore suggested to replace movie with a broader term such as
> video work, video content, or audiovisual work, as appropriate to the
> context.
>
> What do you think?
>
> 3- Section 3.3 states: “Tags from a TargetTypeValue apply to the all lower
> TargetTypeValues.”
>
> It is not always clear whether “lower” refers to numerically smaller values
> or to semantically subordinate entities. It is implicit that smaller numbers
> indicate lower levels in the hierarchy; however, the current wording could
> confuse newcomers.
>
> What about adding a clarification such as:
>
> “A tag defined for a given TargetTypeValue applies to all Targets with
> numerically smaller TargetTypeValues in the same hierarchy, that is, from
> higher-level groups to lower-level entities.”
>
> What do you think?
>
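
The proposed wording in comment 3 can be sketched as a lookup: a minimal
illustration, assuming "lower" means numerically smaller TargetTypeValues and
assuming (this is not stated in the review) that a more specific level
overrides an inherited tag of the same name. The effective_tags helper and
the sample tag values are hypothetical.

```python
def effective_tags(tags_by_level: dict[int, dict[str, str]],
                   target_level: int) -> dict[str, str]:
    """Collect the tags visible at target_level.

    Tags set at a TargetTypeValue apply to every numerically smaller level.
    Assumption: when the same name appears at several levels, the lowest
    (most specific) level wins.
    """
    merged: dict[str, str] = {}
    # Visit the highest level first so lower levels overwrite inherited values.
    for level in sorted(tags_by_level, reverse=True):
        if level >= target_level:
            merged.update(tags_by_level[level])
    return merged


tags_by_level = {
    50: {"ARTIST": "Example Band", "TITLE": "Example Album"},  # album level
    30: {"TITLE": "Example Track"},                            # track level
}
track_view = effective_tags(tags_by_level, 30)
album_view = effective_tags(tags_by_level, 50)
```

Here the track inherits ARTIST from level 50 and keeps its own TITLE, while
the album view is unaffected by anything set at level 30.
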
> 4- Section 3.3 defines TargetTypeValue and provides two tables: Table 1 for
> audio and Table 2 for video. Both tables list the same numeric values (e.g.,
> 50, 40, 30, etc.) but associate them with different semantic examples. For
> instance, Table 1 maps 50 to Album, while Table 2 maps 50 to Movie / Episode
> / Concert.
>
> It would be helpful to clarify whether these tables represent one shared
> TargetTypeValue numbering system that applies to all media types (where the
> numbers define structural hierarchy levels, and the examples simply
> illustrate common use cases for each media type), or two independent
> numbering systems (one for audio and one for video) that happen to reuse the
> same numeric values for different purposes.
>
> For example, how should this be interpreted in a Matroska file that
> contains both audio and video streams, such as a concert film?
>
> 5- Section 3.3.1: The current description of PART_OFFSET (“... which is the
> number of tracks on the first CD”) correctly implies that it represents a
> cumulative or absolute offset, i.e., the number of lower-level items that
> precede the current group in the overall collection. To avoid potential
> misinterpretation as a relative (per-disc) offset, it might be clearer to
> rephrase to something like:
>
> “PART_OFFSET, at TargetTypeValue 30 (TRACK), represents the number of
> lower-level items that precede the current group in the overall collection.
> For example, if CD 1 contains 5 tracks, then the first track of CD 2 has
> PART_OFFSET = 5.”
>
> What do you think?
>
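
The cumulative reading of PART_OFFSET in comment 5 amounts to a running sum.
A minimal sketch, assuming the offset of each disc is the total number of
tracks on all earlier discs; part_offsets is a hypothetical helper and the
track counts beyond the first CD are made up for illustration.

```python
from itertools import accumulate


def part_offsets(tracks_per_disc: list[int]) -> list[int]:
    """Return the cumulative offset for each disc.

    The first disc has offset 0; every later disc gets the total number of
    tracks that precede it in the overall collection.
    """
    running_totals = [0, *accumulate(tracks_per_disc)]
    return running_totals[:-1]  # drop the grand total, which is past the last disc


offsets = part_offsets([5, 7, 4])  # CD 1 has 5 tracks, as in the example above
```

A relative (per-disc) reading would instead always restart at 0, which is the
misinterpretation the suggested rewording tries to rule out.
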
> 6- Section 4.10: There appears to be inconsistent treatment of numeric tags
> with respect to their encoding type.
>
> For example: The EBU_R128_* tags (e.g., EBU_R128_LOUDNESS) are defined as
> binary and store floating-point values in <TagBinary>. The REPLAYGAIN_* tags
> (e.g., REPLAYGAIN_GAIN, REPLAYGAIN_PEAK) represent similar floating-point
> values but are defined as UTF-8 strings in <TagString>. This means that two
> groups of tags describing essentially the same kind of data (gain/loudness
> values in dB or LUFS) are stored using different data types.
>
> 6.1- Could you please clarify whether this distinction is intentional (for
> example, due to backward compatibility) or whether a consistent approach is
> intended?
>
> 6.2- It might be helpful to include a short explanatory note in Section 4.10
> such as "...ReplayGain tags retain textual representation for compatibility
> with legacy implementations, whereas EBU R128 tags use binary floats for
> higher precision..."?
>
> 6.3- Additionally, it may be useful to provide brief guidance for future tag
> definitions on when to prefer binary versus textual representation for
> numeric values. For example, recommending binary floats for
> precision-critical engineering data, and UTF-8 strings for human-readable or
> legacy-compatible values. This would help ensure consistent design choices
> in future extensions.
>
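
The difference raised in comment 6 can be seen by encoding one value both
ways. This is only a sketch: the big-endian 64-bit float layout and the
two-decimal text format are assumptions chosen for illustration, not the
encodings the draft actually specifies for EBU_R128_* or REPLAYGAIN_* tags.

```python
import struct

value = -23.0  # an illustrative loudness-style value

# Binary form, as a TagBinary payload might carry it.
# Assumption: IEEE 754 float64, big-endian.
as_binary = struct.pack(">d", value)

# Textual form, as a TagString payload might carry it.
# Assumption: fixed two decimals, no unit suffix.
as_text = f"{value:.2f}".encode("utf-8")

decoded_binary = struct.unpack(">d", as_binary)[0]
decoded_text = float(as_text.decode("utf-8"))
```

Both round-trip for this value, but a reader must know in advance which form
a tag uses, which is why a short note on the intended distinction would help.
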
> 7- Section 5 states: "Most of the time strings are kept as-is and don't pose
> a security issue, apart from invalid UTF-8 values."
>
> While the mention of “invalid UTF-8 values” is helpful, this phrasing might
> still understate the potential risk. Implementations that handle TagStrings
> without proper UTF-8 validation or size checks could encounter parsing
> errors, crashes, or buffer overruns if presented with malformed or
> excessively large input data. It may be useful to add a clarifying sentence
> such as:
>
> "Implementations MUST validate TagString inputs for UTF-8 correctness and
> reasonable length before use, in accordance with the security considerations
> in [RFC 3629]"
>
> What do you think?
>
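
The check suggested in comment 7 could look roughly like the sketch below. It
assumes a reader that receives the raw TagString bytes; decode_tag_string and
the 64 KiB cap are hypothetical, since neither the review nor the quoted text
names a concrete limit.

```python
MAX_TAG_STRING_BYTES = 64 * 1024  # hypothetical cap, not taken from the draft


def decode_tag_string(raw: bytes, max_bytes: int = MAX_TAG_STRING_BYTES) -> str:
    """Validate size and UTF-8 correctness before handing the string on.

    Raises ValueError for oversized input. Strict decoding raises
    UnicodeDecodeError (a ValueError subclass) for malformed UTF-8.
    """
    if len(raw) > max_bytes:
        raise ValueError("TagString exceeds the configured size limit")
    return raw.decode("utf-8", errors="strict")


title = decode_tag_string("Café".encode("utf-8"))
```

Rejecting the input outright is one policy; an implementation could equally
skip the offending tag and continue, as long as it never uses unvalidated
bytes.
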
> 8- The draft describes how multiple SimpleTag elements may appear under the
> same Tag element, allowing multiple values for the same tag name.
>
> However, how should applications interpret or prioritize these values if
> conflicting tags occur, for example, two TITLE tags with different TagString
> values within the same Targets element?
>
> Nits:
>
> 9- choregrapher → choreographer
>
> 10- the values is stored → the value is stored
>
> 11- parts that are inside or outside a given file → ambiguous. Consider
> clarifying to something like: “parts located either within or externally
> referenced by a given file” ?
>
> 12- Due to the various nature of tag sources → Due to the varied nature of
> tag sources
>
> 13- each demand needs to balance if it makes sense… → each request needs to
> be evaluated to determine if it makes sense…
>
> 14- an host app → a host app
>
> 15- A Tag element has a single Targets element with a single TargetTypeValue
> element. But the Targets element… → replace “But..” with “However,...”
>
> 16- It is RECOMMENDED to start a tag name… → It is RECOMMENDED that tag
> names start…
>
> 17- for non official tags than are not meant to make it to the list… → for
> non-official tags that are not meant to be added to the list of official
> tags...
>
> 18- apply to the all lower TargetTypeValues → “…apply to all lower
> TargetTypeValues..”
>
> Thanks for this document,
>
> Ines.
>
>
>
>

_______________________________________________
Gen-art mailing list -- [email protected]
To unsubscribe send an email to [email protected]
