2017-02-25 19:09 GMT+09:00 Martin Storsjö <[email protected]>: > On Fri, 24 Feb 2017, John Stebbins wrote: > > On 02/24/2017 03:13 PM, Martin Storsjö wrote: >> >>> On Fri, 24 Feb 2017, John Stebbins wrote: >>> >>> On 02/24/2017 02:21 PM, Martin Storsjö wrote: >>>> >>>>> On Fri, 24 Feb 2017, John Stebbins wrote: >>>>> >>>>> This recordes the number of AAC frames that must be decoded before >>>>>> valid >>>>>> audio is decoded. >>>>>> >>>>>> It also signals to Apple tools that the encoder delay is explicitly >>>>>> set >>>>>> in the edit list. If the roll sample group is omitted, Apply tools >>>>>> will >>>>>> apply an implicit rule to remove encoder delay which would result in >>>>>> removing the delay twice. >>>>>> --- >>>>>> libavformat/movenc.c | 63 ++++++++++++++++++++++++++++++ >>>>>> +++++++++++++++++++++ >>>>>> tests/ref/fate/movenc | 32 +++++++++++++------------- >>>>>> 2 files changed, 79 insertions(+), 16 deletions(-) >>>>>> >>>>>> diff --git a/libavformat/movenc.c b/libavformat/movenc.c >>>>>> index 4d351b7..a26dcf7 100644 >>>>>> --- a/libavformat/movenc.c >>>>>> +++ b/libavformat/movenc.c >>>>>> @@ -136,6 +136,54 @@ static int mov_write_stco_tag(AVIOContext *pb, >>>>>> MOVTrack *track) >>>>>> return update_size(pb, pos); >>>>>> } >>>>>> >>>>>> +static int mov_write_sgpd_tag(AVIOContext *pb, int16_t roll) >>>>>> +{ >>>>>> + int64_t pos = avio_tell(pb); >>>>>> + avio_wb32(pb, 0); /* size */ >>>>>> + >>>>>> + ffio_wfourcc(pb, "sgpd"); >>>>>> + avio_w8(pb, 1); /* version */ >>>>>> + avio_w8(pb, 0); /* flags (1) */ >>>>>> + avio_wb16(pb, 0); /* flags (2) */ >>>>>> + ffio_wfourcc(pb, "roll"); /* grouping type */ >>>>>> + avio_wb32(pb, 2); /* table entry length */ >>>>>> + avio_wb32(pb, 1); /* table entry count */ >>>>>> + >>>>>> + /* table data, roll distance >>>>>> + * i.e. number of audio frames to pre-roll after a seek */ >>>>>> + avio_wb16(pb, roll); >>>>>> + >>>>>> + return update_size(pb, pos); >>>>>> +} >>>>>> + >>>>>> +static int mov_write_sbgp_tag(AVIOContext *pb, MOVTrack *track) >>>>>> +{ >>>>>> + int count = 0; >>>>>> + int i; >>>>>> + int64_t pos; >>>>>> + >>>>>> + for (i = 0; i < track->entry; i++) >>>>>> + { >>>>>> + count += track->cluster[i].entries; >>>>>> + } >>>>>> >>>>> The starting brace should be on the same line as for, or omitted >>>>> >>>> Oops, I know this, but habits are hard to break. >>>> >>>> + >>>>>> + pos = avio_tell(pb); >>>>>> + avio_wb32(pb, 0); /* size */ >>>>>> + >>>>>> + ffio_wfourcc(pb, "sbgp"); /* atom name */ >>>>>> + avio_wb32(pb, 0); /* version & flags */ >>>>>> + ffio_wfourcc(pb, "roll"); /* grouping type */ >>>>>> + avio_wb32(pb, 1); /* table entry count */ >>>>>> + >>>>>> + /* table data */ >>>>>> + avio_wb32(pb, count); >>>>>> + /* sgpd table index, index values are 1 based >>>>>> + * we write 'roll' sample group at index 1 */ >>>>>> + avio_wb32(pb, 1); >>>>>> + >>>>>> + return update_size(pb, pos); >>>>>> +} >>>>>> + >>>>>> /* Sample size atom */ >>>>>> static int mov_write_stsz_tag(AVIOContext *pb, MOVTrack *track) >>>>>> { >>>>>> @@ -1260,6 +1308,7 @@ static int mov_write_dref_tag(AVIOContext *pb) >>>>>> >>>>>> static int mov_write_stbl_tag(AVFormatContext *s, AVIOContext *pb, >>>>>> MOVTrack *track) >>>>>> { >>>>>> + MOVMuxContext *mov = s->priv_data; >>>>>> int64_t pos = avio_tell(pb); >>>>>> avio_wb32(pb, 0); /* size */ >>>>>> ffio_wfourcc(pb, "stbl"); >>>>>> @@ -1277,6 +1326,20 @@ static int mov_write_stbl_tag(AVFormatContext >>>>>> *s, AVIOContext *pb, MOVTrack *tra >>>>>> mov_write_stsc_tag(pb, track); >>>>>> mov_write_stsz_tag(pb, track); >>>>>> mov_write_stco_tag(pb, track); >>>>>> + >>>>>> + if (track->par->codec_id == AV_CODEC_ID_AAC && mov->use_editlist >>>>>> && >>>>>> + track->start_dts != AV_NOPTS_VALUE && >>>>>> + (track->start_dts < 0 || track->par->initial_padding > 0)) { >>>>>> + /* Add sgpd and sbgp tags for AAC tracks. >>>>>> + * Apple documentation says they use this as a flag to >>>>>> indicate >>>>>> + * that AAC encoder delay is explicitely set in the edit >>>>>> list. >>>>>> + * If this is omitted, Apple tools will apply an implicit >>>>>> + * rule to remove encoder delay samples, which results in >>>>>> + * removal of 2 * delay and A/V desync. */ >>>>>> + mov_write_sgpd_tag(pb, -1); >>>>>> >>>>> The magic value -1 could probably be described here. From the code in >>>>> mov_write_sgpd_tag, it seems like it would be a normal actual value, >>>>> but >>>>> this seems like a special case that isn't explained elsewhere. >>>>> >>>>> Otherwise this seems ok. >>>>> >>>>> >>>>> The parameter "roll" is documented where it is used in >>>> mov_write_sgpd_tag. But I can explain it here as well. >>>> >>> Yes, there I see that it says "number of audio frames to pre-roll after a >>> seek". But what does -1 frames mean? >>> >>> >>> >> Ah, ok. Would it be clearer like this? "roll is an offset in frames that >> should be applied when seeking in order to >> pre-roll data to the decoder" >> > > I see - then it makes more sense. I guess this means in practice it won't > ever be positive?
I've never seen positive values of roll_distances other than video tracks (which indicates gradual decoder refresh at recovery point SEI), but I found the following descriptions in the latest mpeg file format meeting report. 1.4.10m39988 Proposal for Consistent Definitions of Audio Sync Samples and Audio Roll Groups *Disposition: accepted in principle* We decide to correct the PreRollEntry and the roll_distance. We keep it a signed value, but restrict it to positive in the PreRollEntry. Into the part 12 corrigendum. Editors to phrase with the authors. > > > // Martin > _______________________________________________ > libav-devel mailing list > [email protected] > https://lists.libav.org/mailman/listinfo/libav-devel > _______________________________________________ libav-devel mailing list [email protected] https://lists.libav.org/mailman/listinfo/libav-devel
