Re: [FFmpeg-devel] [PATCH 1/2] decode: add ff_decode_skip_samples function

Martin Storsjö Sat, 04 Nov 2023 16:05:08 -0700

Hi,

Just following up on this - I'm sorry I haven't been able to look at the proposed patchset myself in much detail yet.

My prime concern is about the requests to have this merged into the upcoming 6.1 release; that's way too soon IMO.

These patches change aspects of behaviour that have been working the same way for a very long time, so there is all sorts of potential for subtle breakage, or for assumptions (correct or not) being broken, across libavcodec and its users.


On Sat, 4 Nov 2023, Derek Buitenhuis wrote:

Next, a quick breakdown of the AAC situation, in terms of how this is stored,
what we support, and the state of the ecosystem and types of files that exist:
   * 'Raw' ADTS streams have no way to store any of this. The best we can do is
     guess the pre-roll. We should not guess priming or end padding, as no matter
     what we do, it'll be wrong, and any value will be a cargo culted hack value.

I share this concern; all the various encoders I've seen have used a different amount of priming samples, so guessing it is bound to be wrong in a lot of the cases.

   * MP4 - there are two places to store this metadata - one standard, and one
     proprietary Apple way. There are, separately, two ways to signal priming
     length when SBR is present.
      * MP4s may contain a user data box with 'iTunSMPB' which contains priming,
        pre-roll, and end padding data. We support reading only priming data from
        this at the moment, and we set skip samples based on this. This is
        'iTunes style' metadata.
      * The standards compliant (read: non-iTunes) way is to use an edit list to
        trim the priming samples and, optionally, end padding. End padding may
        also be trimmed by reducing the sample duration of the last packet in the
        stts box. Pre-roll is stored in the sgpd box with the 'roll' type, which
        signals the roll distance as a number of packets; for example, -1
        indicates you should decode and discard the samples of 1 packet before
        beginning playback. Notably, this allows the sgpd box to also be used for
        video, like periodic intra refresh H.264. libavformat does not currently
        parse or export this info, but even if we did, converting number of
        packets to audio samples can get hairy.
          * Notably, since in MP4 the edit list represents the exact
            presentation-level info, when no edit list, or an edit list starting
            at 0, is present, no samples, not even pre-roll, should be trimmed -
            all players in the wild handle this properly, and it has been
            standard practice among streaming services for >10 years to not
            output the AAC frames representing priming samples at all (even if
            there is a small hit to quality). This is what the patch at [0] is
            addressing.
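
As an illustration of the 'iTunes style' case above, here is a minimal Python sketch of pulling the priming and end padding values out of an iTunSMPB string. The field layout assumed here (a zero field, then priming, end padding and original sample count, all hexadecimal and space separated) is the commonly described one and is not spelled out in this thread; the function name and the sample string are hypothetical.

```python
def parse_itunsmpb(value: str):
    """Return (priming, end_padding, original_samples) from an iTunSMPB string.

    Assumed layout (not taken from this thread): whitespace-separated hex
    fields, where field 0 is zero, field 1 is the priming sample count,
    field 2 is the end padding and field 3 is the original sample count.
    """
    fields = value.split()
    if len(fields) < 4:
        raise ValueError("expected at least four hex fields")
    priming = int(fields[1], 16)
    end_padding = int(fields[2], 16)
    original_samples = int(fields[3], 16)
    return priming, end_padding, original_samples


# Hypothetical example value: 0x840 = 2112 priming samples,
# 0x1C0 = 448 end padding samples.
example = " 00000000 00000840 000001C0 0000000000046E00"
priming, end_padding, original_samples = parse_itunsmpb(example)

# Sanity check a player could do: priming + payload + padding should add up
# to a whole number of 1024-sample AAC frames.
total = priming + original_samples + end_padding
whole_frames = total % 1024 == 0
```

Only the priming value would feed into skip samples today, per the description above; the other two fields are parsed here just to show where they sit.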

FWIW, MP4 isn't the only container where this might be relevant; AAC is frequently used in muxes together with video in FLV and MKV and others as well.

In the case of FLV, I'm not aware of any metadata that signals how much to trim off, so essentially we can't do it other than by guessing. On the producing side, this is handled by shifting the timestamps so the audio track, which would be starting at -<delay>, ends up starting at 0, and the video track ends up starting at +<audiodelay> instead.

In that case, if we trim off the priming samples (based on a guess as that's all we have?), I guess that'd lead us to both tracks starting at +<delay> (i.e. not affecting sync). As long as it doesn't change sync, I guess it can be tolerable.
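
The timestamp arithmetic in the FLV case can be sketched as follows. This is only an illustration in Python with hypothetical function names and a made-up delay of 21 ms; it models the muxer-side shift and a decoder-side trim, not any actual libavformat or libavcodec code.

```python
def mux_shift(audio_start, video_start):
    """Shift both tracks so that the earliest one starts at 0 (muxer side)."""
    offset = -min(audio_start, video_start)
    return audio_start + offset, video_start + offset


def trim_priming(audio_start, delay):
    """Dropping `delay` worth of priming moves the first audio timestamp later."""
    return audio_start + delay


delay_ms = 21  # hypothetical encoder delay, in FLV-style millisecond units

# Before muxing: audio starts at -delay because of the priming samples.
audio, video = mux_shift(-delay_ms, 0)    # audio at 0, video at +delay

# If a decoder then trims the (guessed) priming, audio also lands at +delay,
# so the relative offset between the tracks, i.e. sync, is unchanged.
audio_trimmed = trim_priming(audio, delay_ms)
in_sync = audio_trimmed == video
```

The sketch only holds if the guessed delay matches what the producer actually shifted by; a wrong guess moves the audio start without moving the video.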

To avoid all these effects, producers of muxed files can work around this in many ways. For many years, I've been doing the trick of skipping the first <delay> samples of input to the audio encoder, so that after accounting for that, I have both audio and video tracks starting at 0.0, without the decoder needing to do anything - working the same across all players, good and bad.
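
A toy model of that producer-side trick, in Python. The `simulate_codec` function is a hypothetical stand-in that only models the one property that matters here (an encode/decode round trip prepends `delay` priming samples); it is not a real encoder, and the numbers are made up.

```python
def simulate_codec(pcm, delay):
    """Toy encode+decode round trip: output is `delay` priming samples + input."""
    return [0] * delay + list(pcm)


def encode_with_skip(pcm, delay):
    """Drop the first `delay` input samples before handing them to the codec."""
    return simulate_codec(pcm[delay:], delay)


delay = 3                      # hypothetical encoder delay, in samples
pcm = list(range(100, 110))    # ten input samples

decoded = encode_with_skip(pcm, delay)

# After the first `delay` samples, decoded[k] lines up with pcm[k] without any
# trimming in the decoder, so the audio timeline starts at 0 like the video.
aligned = all(decoded[k] == pcm[k] for k in range(delay, len(pcm)))
same_length = len(decoded) == len(pcm)
```

The cost in this model is that the first `delay` samples of real input are replaced by priming output rather than preserved.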

If we suddenly start decoding such files with the audio track starting at +<delay>, I guess it'll be ok for sync; it's a mildly surprising change, though hopefully any reasonable player based on libavcodec would still not freak out over it.

   * As noted above, I don't think we should apply any guessed priming to initial
     samples (pre-roll, or 'algorithmic delay', included). No other decoders or
     players do this, in the world, to my knowledge, and violating the principle
     of least surprise because we think we're slightly more correct isn't great.
     I also think trying to 'fix' raw ADTS is destined to always be a hack, and
     we shouldn't. YMMV. I'd like to hear views from others here. This would make
     the patch in [0] redundant.

Yes, with raw ADTS there's really no good way of getting this right, other than plain guessing, and there's no single universally correct guess AFAIK.

(And even if we have a qualified guess for the amount of encoder priming, we have even less knowledge about how much to trim off at the end, if we're aiming at proper gapless playback.)


For MP4 there's at least a couple ways of signalling it.

But also, given all of this, I think we need to deeply consider how we approach this, so we don't end up with something that only covers certain cases (and I am sure I forgot more cases). To that end, I do not think rushing to get a patchset that can change sync on all AAC files in existence into 6.1 is wise. Even when this does go in, it should be able to sit in master for a good long time before being in a release.

+1. This has the potential to be surprising in many different cases, and may need a bunch of follow up patches to sort out cases found later. It definitely should sit in git master for some time before ending up in a release - not be slipped into 6.1 the week before the release.


// Martin

