On Fri, 2025-06-06 at 22:22 +0200, Marcos Del Sol Vives wrote:
> 
> 
> On 6 June 2025 21:43:58 CEST, "Tomas Härdin" <g...@haerdin.se>
> wrote:
> > 
> > Sounds like the demuxer correctly rejected some broken files
> > 
> 
> The WebVTT standard does not call for a fatal error unless the magic
> header does not match. The current implementation is not only
> non-compliant with the standard, but will also break on future changes.

The linked file does not follow the syntax specified in section 4 of the WebVTT standard. Therefore it is not WebVTT.

The probe function should ensure that the file starts with the necessary bytes. I see it can let some non-compliant files slip by, since it does not check whether [BOM]WEBVTT is followed by a newline, optionally preceded by a space or tab and any non-newline characters. We could fix that in the main parsing loop. We also shouldn't expect any "WEBVTT" chunks after the first one.

webvttdec.c also allows REGION etc. chunks outside of where section 4 says they are allowed. In my opinion this is bad, since it means the demuxer allows more than just WebVTT.

I've been harping on the permissive attitude towards parsing on this list for a while. I do this because every time we're lax with parsing, some user will come to rely on said laxness rather than fixing their workflow. Therefore we're perpetually unable to fix our demuxers. My opinion is that it is best to nip this permissiveness in the bud. The fact that the demuxer does the wrong thing right now is no excuse to make it behave even more incorrectly.

What I'd like to see is the project moving towards either parser combinators or a domain-specific language for grammars, like a PEG variant extended with length fields.

We can't do much about the W3C making breaking changes to their standards in the future, other than updating our code when that happens. We're lucky that the spec is quite narrow. The space for making non-breaking changes to it is quite small. They could for example reuse NOTE chunks for future functionality. For example, if W3C wants to allow STYLE chunks in the middle of the file, the current syntax does not allow that. But they could amend it by using "NOTE STYLE" for stylesheets between cues.

All of this is just my view of course. Other devs might feel very differently. I'd point out it's no longer the wild west of the early 2000s.

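To make the header check above concrete, here is a rough sketch in C of what I have in mind. webvtt_signature_ok() is a made-up name, not something that exists in webvttdec.c, and rejecting a bare "WEBVTT" with no line terminator after it is my strict reading of section 4, not current behaviour:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of webvttdec.c: returns 1 if buf starts
 * with a well-formed WebVTT signature line, 0 otherwise. */
static int webvtt_signature_ok(const unsigned char *buf, size_t len)
{
    static const unsigned char bom[] = { 0xEF, 0xBB, 0xBF };
    size_t pos = 0;

    /* optional UTF-8 byte order mark */
    if (len >= sizeof(bom) && !memcmp(buf, bom, sizeof(bom)))
        pos = sizeof(bom);

    /* the magic string itself */
    if (len - pos < 6 || memcmp(buf + pos, "WEBVTT", 6))
        return 0;
    pos += 6;

    /* assumption: a bare "WEBVTT" with nothing after it is rejected */
    if (pos == len)
        return 0;

    /* optional space or tab, followed by any non-newline characters */
    if (buf[pos] == ' ' || buf[pos] == '\t') {
        while (pos < len && buf[pos] != '\n' && buf[pos] != '\r')
            pos++;
        if (pos == len)
            return 0;
    }

    /* the signature line must end with a line terminator */
    return buf[pos] == '\n' || buf[pos] == '\r';
}
```

With this, "WEBVTT\n" and "WEBVTT some title\n" pass, while "WEBVTTX\n" is rejected before any cue parsing starts.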
/Tomas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel