Bug#1010863: streamlink: Broken audio timestamps on arte.tv

2022-05-20 Thread Markus Demleitner
On Wed, May 18, 2022 at 02:09:47AM +0200, Alexis Murzeau wrote:
> On 18/05/2022 at 01:42, Alexis Murzeau wrote:
> > On 17/05/2022 at 20:06, Alexis Murzeau wrote:
> >> [...]
> > 
> > => So really the only option is to use ffmpeg v5.x
> > 
> > 
> In Debian, ffmpeg v5.0 is only in experimental, but you can try the upstream
> Linux binaries.

Getting the static binaries was what I did to solve my problem. In
case someone drops in here who uses the Python API: you need to go
through a session, I guess; my code looks like this:

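import os

import streamlink

options = {}
# Use a locally installed static ffmpeg build if there is one.
fallback_path = "/usr/local/ffmpeg/ffmpeg"
if os.path.exists(fallback_path):
    options["ffmpeg-ffmpeg"] = fallback_path

session = streamlink.Streamlink(options or None)
# url is the address of the page to pull the streams from
streams = session.streams(url)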

Thanks a lot for all that research -- I'm impressed how much I don't
understand about today's video containers...

I'm not sure whether to close this bug ("not streamlink's bug, and
it's well understood") or whether to keep it open because people
might miss it when it's closed, and I guess the problem won't go away
in bullseye.

I'm fine either way.  Thanks again!



Bug#1010863: streamlink: Broken audio timestamps on arte.tv

2022-05-17 Thread Alexis Murzeau
On 18/05/2022 at 01:42, Alexis Murzeau wrote:
> On 17/05/2022 at 20:06, Alexis Murzeau wrote:
>> [...]
> 
> => So really the only option is to use ffmpeg v5.x
> 
> 
In Debian, ffmpeg v5.0 is only in experimental, but you can try the upstream
Linux binaries.

If you want to use it only for streamlink, without replacing the system one,
you can use this streamlink option:
https://streamlink.github.io/cli.html#cmdoption-ffmpeg-ffmpeg

-- 
Alexis Murzeau
PGP: B7E6 0EBB 9293 7B06 BDBC  2787 E7BD 1904 F480 937F




Bug#1010863: streamlink: Broken audio timestamps on arte.tv

2022-05-17 Thread Alexis Murzeau
On 17/05/2022 at 20:06, Alexis Murzeau wrote:
> [...]


I've investigated a bit more how the arte.tv stream is made.
I've found that ffmpeg v4 uses wrong timestamp information from the
input stream and that only ffmpeg v5 reads the correct timestamps.
So I reach the same conclusion as upstream.


Details of why (brain dump :-) ):

In fact, arte.tv is not using MPEGTS at all but fragmented MP4.
I've checked the original files 108210-039-B_v216.mp4 and
108210-039-B_aud_VA.mp4 that are on the arte.tv servers, and they use the
fragmented MP4 format correctly, with this structure:

 - A "header" with metadata: ftyp + moov + sidx
 - Several fragments: moof + mdat
 - A closing header: mfra
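
The box layout listed above can be checked with a few lines of Python. This is only a minimal sketch: it relies on the standard MP4 box framing (a 4-byte big-endian size followed by a 4-byte type), and it runs on made-up, empty boxes rather than on the real arte.tv files. The names list_boxes and make_box are hypothetical helpers, not part of any library.

```python
import struct


def list_boxes(data, offset=0, end=None):
    """Yield (type, offset, size) for each box found in data[offset:end]."""
    end = len(data) if end is None else end
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if offset + 16 > end:
                break
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = end - offset
        if size < header:
            break  # malformed box: stop instead of looping forever
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size


def make_box(box_type, payload=b""):
    """Build a synthetic box, for illustration only."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Made-up data that mimics the structure described above.
order = (b"ftyp", b"moov", b"sidx", b"moof", b"mdat", b"moof", b"mdat", b"mfra")
sample = b"".join(make_box(t, b"\0" * 4) for t in order)
box_types = [box_type for box_type, _, _ in list_boxes(sample)]
print(" + ".join(box_types))
# prints: ftyp + moov + sidx + moof + mdat + moof + mdat + mfra
```

To look at a real file, one would pass the bytes of the downloaded .mp4 instead of the synthetic sample.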

In the aud_VA.m3u8 file:
 - The part referenced by #EXT-X-MAP matches exactly the first 1680 bytes
   of the .mp4 file, which is the "header" (ftyp + moov + sidx)
 - One fragment in the m3u8 (#EXT-X-BYTERANGE) references 3 MP4 fragments
   (so 3x moof + mdat)
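
To see which byte ranges a playlist refers to, the two tags can be parsed by hand. The sketch below assumes the tag syntax from the HLS RFC (BYTERANGE is "<length>[@<offset>]", and a missing offset means "continue right after the previous range"). The sample playlist is made up; only the 1680-byte header size comes from the real aud_VA.m3u8. parse_byterange and parse_playlist are hypothetical helper names.

```python
import re


def parse_byterange(value, previous_end=0):
    """Turn "<length>[@<offset>]" into a (start, end_exclusive) pair."""
    length, _, offset = value.partition("@")
    start = int(offset) if offset else previous_end
    return start, start + int(length)


def parse_playlist(text):
    """Return the EXT-X-MAP range and the list of EXT-X-BYTERANGE ranges."""
    init_range = None
    segments = []
    previous_end = 0
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-MAP:"):
            match = re.search(r'BYTERANGE="([^"]+)"', line)
            if match:
                init_range = parse_byterange(match.group(1))
        elif line.startswith("#EXT-X-BYTERANGE:"):
            byte_range = parse_byterange(line.split(":", 1)[1], previous_end)
            segments.append(byte_range)
            previous_end = byte_range[1]
    return init_range, segments


# Made-up playlist; the segment lengths and the file name are invented.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:7
#EXT-X-MAP:URI="audio.mp4",BYTERANGE="1680@0"
#EXTINF:6.0,
#EXT-X-BYTERANGE:50000@1680
audio.mp4
#EXTINF:6.0,
#EXT-X-BYTERANGE:48000
audio.mp4
#EXT-X-ENDLIST
"""

init_range, segments = parse_playlist(SAMPLE)
print(init_range)  # (0, 1680)
print(segments)    # [(1680, 51680), (51680, 99680)]
```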

So when streamlink processes this file, it adds the "header" every 3
MP4 fragments, like this:
 - One "header" (EXT-X-MAP)
   - ftyp + moov + sidx
 - One HLS fragment (EXT-X-BYTERANGE) == 3 MP4 fragments
   - moof + mdat
   - moof + mdat
   - moof + mdat

 - One "header" (EXT-X-MAP)
   - ftyp + moov + sidx
 - One HLS fragment (EXT-X-BYTERANGE) == 3 MP4 fragments
   - moof + mdat
   - moof + mdat
   - moof + mdat
etc... until the last fragment (the "closing header" doesn't appear
anymore).

When checking with the HLS RFC, everything seems fine here.


The MP4 stream contains timestamps in both sidx and tfdt (which is inside
the moof blocks).

The issue is that ffmpeg v4 sees the sidx block and decides to reset the sample
timestamps (PTS and DTS, presentation and decoding timestamps) according
to the data in sidx.

The correct timestamps are in moof/tfdt, which is not part
of the header, so its values are different for each fragment (and
correct). The timestamps in sidx are the same for all fragments, as sidx is part
of the "header" which is copied before each HLS fragment.
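
The difference between the two timestamp sources can be reproduced on synthetic data. This is a rough sketch, not what ffmpeg does internally: it builds a fake stream with the "header" repeated before every 3 MP4 fragments, then reads earliest_presentation_time from each sidx and baseMediaDecodeTime from each tfdt. The field offsets follow the standard box layouts as I understand them; the helper names, the timescale, and the 1024-sample fragment duration are made up.

```python
import struct

CONTAINERS = {b"moof", b"traf"}  # boxes whose payload is a list of child boxes


def make_box(box_type, payload=b""):
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def make_sidx(earliest_time, timescale=48000):
    # Version 0 layout: version/flags, reference_ID, timescale,
    # earliest_presentation_time, first_offset, reserved, reference_count.
    fields = struct.pack(">B3xIIIIHH", 0, 1, timescale, earliest_time, 0, 0, 0)
    return make_box(b"sidx", fields)


def make_fragment(decode_time):
    # Version 1 tfdt: version/flags, then a 64-bit baseMediaDecodeTime.
    tfdt = make_box(b"tfdt", struct.pack(">B3xQ", 1, decode_time))
    return make_box(b"moof", make_box(b"traf", tfdt)) + make_box(b"mdat")


def timestamps(data, offset=0, end=None):
    """Return ("sidx" or "tfdt", time) pairs in stream order."""
    end = len(data) if end is None else end
    found = []
    while offset + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8 or offset + size > end:
            break
        body = offset + 8
        if box_type in CONTAINERS:
            found += timestamps(data, body, offset + size)
        elif box_type in (b"tfdt", b"sidx"):
            # Version 1 boxes store the time as 64 bits, version 0 as 32 bits.
            fmt = ">Q" if data[body] == 1 else ">I"
            skip = 4 if box_type == b"tfdt" else 12
            time = struct.unpack_from(fmt, data, body + skip)[0]
            found.append((box_type.decode("ascii"), time))
        offset += size
    return found


header = make_box(b"ftyp") + make_box(b"moov") + make_sidx(0)
stream = b""
for hls_fragment in range(2):
    stream += header  # the header is copied before each HLS fragment
    for i in range(3):
        stream += make_fragment((hls_fragment * 3 + i) * 1024)

result = timestamps(stream)
sidx_times = [time for kind, time in result if kind == "sidx"]
tfdt_times = [time for kind, time in result if kind == "tfdt"]
print(sidx_times)  # [0, 0]
print(tfdt_times)  # [0, 1024, 2048, 3072, 4096, 5120]
```

The sidx value is identical in every copy of the header, while the tfdt values keep increasing, so a reader that trusts sidx jumps back to the same time at every HLS fragment.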

As sidx appears before each HLS fragment, ffmpeg v4 ignores tfdt and
resets the timestamps instead (which is wrong here).
Then ffmpeg tries to recover from that when writing the mpegts output file.

Unfortunately, this means the resulting output file has bad frame
timestamps and players try to follow them, which causes the glitches.


ffmpeg v5.x uses the tfdt (a block inside the moof block) instead of sidx,
and that tfdt contains correct data; that's why everything works fine with
that version.
There is a new option in ffmpeg called "use_tfdt" in that version, which
is enabled by default now.
This was implemented in parts, where the first is here:
https://github.com/FFmpeg/FFmpeg/commit/071930de724166bfb90fc6d368c748771188fd94
According to github, this commit is only in version v5.0 (tag n5.0) but
not earlier.

=> So really the only option is to use ffmpeg v5.x
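
To check which side of that boundary a given ffmpeg binary is on, one can look at the first line of its "ffmpeg -version" output. The sketch below is only a heuristic, assuming release builds print "ffmpeg version <major>.<minor>..."; the version strings in the example are illustrative, and git snapshot builds (which use a different numbering) are reported as unknown. Both function names are hypothetical.

```python
import re


def ffmpeg_major_version(version_output):
    """Return the major version from `ffmpeg -version` output, or None."""
    match = re.search(r"^ffmpeg version n?(\d+)\.", version_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def has_tfdt_fix(version_output):
    """True if this looks like ffmpeg 5.0 or later, per the analysis above."""
    major = ffmpeg_major_version(version_output)
    return major is not None and major >= 5


# Illustrative version lines, not captured from real systems.
print(has_tfdt_fix("ffmpeg version 4.3.4-0+deb11u1 Copyright (c) the FFmpeg developers"))
# prints: False
print(has_tfdt_fix("ffmpeg version 5.0.1-static Copyright (c) the FFmpeg developers"))
# prints: True
```

The real output can be obtained with subprocess.run([path, "-version"], capture_output=True, text=True).stdout, where path is the ffmpeg binary to test.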



Relevant files:
https://arte-cmafhls.akamaized.net/am/cmaf/108000/108200/108210-039-B/220509185705/medias/108210-039-B_v216.m3u8
https://arte-cmafhls.akamaized.net/am/cmaf/108000/108200/108210-039-B/220509185705/medias/108210-039-B_v216.mp4
https://arte-cmafhls.akamaized.net/am/cmaf/108000/108200/108210-039-B/220509185705/medias/108210-039-B_aud_VA.m3u8
https://arte-cmafhls.akamaized.net/am/cmaf/108000/108200/108210-039-B/220509185705/medias/108210-039-B_aud_VA.mp4
https://arte-cmafhls.akamaized.net/am/cmaf/108000/108200/108210-039-B/220509185705/108210-039-B_VA_XQ.m3u8

Additional sources that helped me:
For MP4: https://bitmovin.com/fun-with-container-formats-2/
For MPEGTS: https://bitmovin.com/fun-with-container-formats-3/
HLS RFC: https://datatracker.ietf.org/doc/html/rfc8216#section-3.3
ffmpeg MP4 options: https://ffmpeg.org/ffmpeg-formats.html#Options-2
Raw MP4 file browser: https://github.com/sannies/isoviewer

-- 
Alexis Murzeau
PGP: B7E6 0EBB 9293 7B06 BDBC  2787 E7BD 1904 F480 937F




Bug#1010863: streamlink: Broken audio timestamps on arte.tv

2022-05-17 Thread Alexis Murzeau
Hi,

On May 11, 2022 9:11:20 PM GMT+02:00, Markus Demleitner wrote:
>Package: streamlink
>Version: 3.2.0-1~bpo11+1
>Severity: normal
>
>Dear Maintainer,
>
>When pulling video from arte, e.g.,
>
>streamlink --output o.mp4 \
>  https://www.arte.tv/de/videos/108210-039-A/mit-offenen-karten-im-fokus/ \
>  worst
>
>the audio timestamps for the resulting video file are broken for me (since
>fairly recently), which leads to various failures in different clients (mpv has
>seconds of hanging videos, vlc has stutters, webkit goes all haywire).
>
>Upstream says it's not a streamlink problem,
>https://github.com/streamlink/streamlink/issues/4520;
>I've suspected ffmpeg and so backported ffmpeg 4.4 -- to no avail.
>I've also tried an updated streamlink 4 -- same result, broken
>timestamps.
>
>So... while I don't doubt upstream's analysis that it's not a streamlink
>bug per se that's causing the bad timestamps, I don't know what else is.
>I'm grateful for hints on what that might be -- and reports on whether
>other people see the broken timestamps, too.

I've checked this and found that arte.tv:
 - uses a single mp4 file for the whole video
 - uses an HLS playlist in which each video fragment references a part of
   that mp4 file
 - defines a special header at the beginning of the mp4 file. That header is
   repeated before each fragment, according to the HLS specification. It is a
   PAT/PMT header, explained here:
   https://medium.com/@stackhousejs/reducing-mpeg-ts-metadata-for-hls-6cec37484d38

So streamlink does that header repetition as it should, but this seems to
confuse ffmpeg v4.

As of now, I have no solution other than installing ffmpeg v5 manually (in
Debian, it is currently still in experimental). Maybe a workaround could be
found in streamlink, but I don't know enough about mpegts.


-- 
Alexis Murzeau



Bug#1010863: streamlink: Broken audio timestamps on arte.tv

2022-05-11 Thread Markus Demleitner
Package: streamlink
Version: 3.2.0-1~bpo11+1
Severity: normal

Dear Maintainer,

When pulling video from arte, e.g.,

streamlink --output o.mp4 \
  https://www.arte.tv/de/videos/108210-039-A/mit-offenen-karten-im-fokus/ \
  worst

the audio timestamps for the resulting video file are broken for me (since
fairly recently), which leads to various failures in different clients (mpv has
seconds of hanging videos, vlc has stutters, webkit goes all haywire).

Upstream says it's not a streamlink problem, 
https://github.com/streamlink/streamlink/issues/4520;
I've suspected ffmpeg and so backported ffmpeg 4.4 -- to no avail.
I've also tried an updated streamlink 4 -- same result, broken
timestamps.

So... while I don't doubt upstream's analysis that it's not a streamlink
bug per se that's causing the bad timestamps, I don't know what else is.
I'm grateful for hints on what that might be -- and reports on whether
other people see the broken timestamps, too.

-- System Information:
Debian Release: 11.3
  APT prefers stable-security
  APT policy: (500, 'stable-security'), (500, 'stable')
Architecture: i386 (x86_64)

Kernel: Linux 5.16.19 (SMP w/4 CPU threads)
Kernel taint flags: TAINT_USER, TAINT_OOT_MODULE
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/bash
Init: sysvinit (via /sbin/init)

Versions of packages streamlink depends on:
ii  python3 3.9.2-3
ii  python3-streamlink  3.2.0-1~bpo11+1

Versions of packages streamlink recommends:
ii  mpv  0.32.0-3
ii  vlc  3.0.16-1

streamlink suggests no packages.

-- no debconf information