Fri 2025-06-06 at 22:22 +0200, Marcos Del Sol Vives wrote:
> 
> 
> On 6 June 2025 21:43:58 CEST, "Tomas Härdin" <g...@haerdin.se>
> wrote:
> > 
> > Sounds like the demuxer correctly rejected some broken files
> > 
> 
> The WebVTT standard does not call for a fatal error unless the magic
> header does not match. The current implementation is not only non-
> compliant with the standard, but will also break on future changes.

The linked file does not follow the syntax specified in section 4 of
the WebVTT standard. Therefore it is not WebVTT.

The probe function should ensure that the file starts with the
necessary bytes. I see it can let some non-compliant files slip by,
since it does not check that [BOM]WEBVTT is followed either directly by
a newline, or by a space or tab and then any number of non-newline
characters before the newline. We could fix that in the main parsing
loop. We also shouldn't expect any "WEBVTT" chunks after the first one.
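
To make the check above concrete, here is a minimal sketch of what a
stricter signature test could look like. webvtt_signature_ok() is a
hypothetical helper, not a function in webvttdec.c, and it assumes the
input is UTF-8. It only validates the signature line itself, not the
blank line that should follow it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of webvttdec.c: returns 1 if buf begins
 * with a well-formed WebVTT signature line, 0 otherwise. */
static int webvtt_signature_ok(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[3] = { 0xEF, 0xBB, 0xBF };
    size_t pos = 0;

    /* Optional UTF-8 byte order mark. */
    if (len >= 3 && !memcmp(buf, bom, 3))
        pos = 3;

    /* The literal string "WEBVTT". */
    if (len - pos < 6 || memcmp(buf + pos, "WEBVTT", 6))
        return 0;
    pos += 6;

    /* Optionally a space or tab, then any characters other than CR/LF. */
    if (pos < len && (buf[pos] == ' ' || buf[pos] == '\t')) {
        while (pos < len && buf[pos] != '\r' && buf[pos] != '\n')
            pos++;
    }

    /* The signature line must end with a line terminator. Treating a
     * buffer that ends right here as invalid is an assumption. */
    return pos < len && (buf[pos] == '\r' || buf[pos] == '\n');
}
```

With this, "WEBVTT\n" and "WEBVTT - title\r\n" pass, while "WEBVTTX\n"
is rejected because the byte after the magic is neither a space, a tab
nor a line terminator.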

webvttdec.c also allows REGION etc. chunks outside of where section 4
says they are allowed. In my opinion this is bad, since it means the
demuxer allows more than just WebVTT.
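
As a rough illustration of the ordering rule, assuming my reading of
section 4 is right (REGION and STYLE blocks only before the first cue,
NOTE blocks anywhere after the header), the check could be sketched as
below. The enum and vtt_first_misplaced() are hypothetical names, and
the sketch assumes the blocks have already been classified.

```c
/* Hypothetical block kinds; classification is assumed done elsewhere. */
enum vtt_block { VTT_REGION, VTT_STYLE, VTT_NOTE, VTT_CUE };

/* Returns the index of the first block that appears where the syntax
 * does not allow it, or -1 if the order is acceptable. */
static int vtt_first_misplaced(const enum vtt_block *blocks, int n)
{
    int seen_cue = 0;

    for (int i = 0; i < n; i++) {
        switch (blocks[i]) {
        case VTT_CUE:
            seen_cue = 1;
            break;
        case VTT_NOTE:
            /* Comment blocks are fine before and between cues. */
            break;
        case VTT_REGION:
        case VTT_STYLE:
            /* Only allowed before the first cue. */
            if (seen_cue)
                return i;
            break;
        }
    }
    return -1;
}
```

A demuxer using something like this would reject a file at the returned
index rather than silently accepting a STYLE or REGION block that
follows a cue.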

I've been harping on the permissive attitude towards parsing on this
list for a while. The reason I do this is that every time we're lax
with parsing, some user will come to rely on said laxness rather than
fixing their workflow. Therefore we're perpetually unable to fix
our demuxers. My opinion is that it is best to nip this permissiveness
in the bud. The fact that the demuxer does the wrong thing right now is
no excuse to make it behave even more incorrectly.

What I'd like to see is the project moving towards either parser
combinators or a domain specific language for grammars like a PEG
variant extended with length fields.

We can't do much about the W3C making breaking changes to their
standards in the future, other than updating our code when that
happens. We're lucky that the spec is quite narrow. The space for
making non-breaking changes to it is quite small. They could for
example reuse NOTE chunks for future functionality. The current syntax
does not allow STYLE chunks in the middle of the file, so if W3C wanted
to allow them, they could do so compatibly by using "NOTE STYLE" for
stylesheets between cues.

All of this is just my view, of course. Other devs might feel very
differently. I'd point out it's no longer the wild west of the early
2000s.

/Tomas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".