On Sat, 25 Feb 2017, John Stebbins wrote:

On 02/25/2017 03:09 AM, Martin Storsjö wrote:
On Fri, 24 Feb 2017, John Stebbins wrote:

On 02/24/2017 03:13 PM, Martin Storsjö wrote:
On Fri, 24 Feb 2017, John Stebbins wrote:

On 02/24/2017 02:21 PM, Martin Storsjö wrote:
On Fri, 24 Feb 2017, John Stebbins wrote:

This recordes the number of AAC frames that must be decoded before valid
audio is decoded.

It also signals to Apple tools that the encoder delay is explicitly set
in the edit list.  If the roll sample group is omitted, Apply tools will
apply an implicit rule to remove encoder delay which would result in
removing the delay twice.
---
libavformat/movenc.c  | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++
tests/ref/fate/movenc | 32 +++++++++++++-------------
2 files changed, 79 insertions(+), 16 deletions(-)

diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index 4d351b7..a26dcf7 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -136,6 +136,54 @@ static int mov_write_stco_tag(AVIOContext *pb, MOVTrack 
*track)
    return update_size(pb, pos);
}

+static int mov_write_sgpd_tag(AVIOContext *pb, int16_t roll)
+{
+    int64_t pos = avio_tell(pb);
+    avio_wb32(pb, 0); /* size */
+
+    ffio_wfourcc(pb, "sgpd");
+    avio_w8(pb, 1); /* version */
+    avio_w8(pb, 0); /* flags (1) */
+    avio_wb16(pb, 0); /* flags (2) */
+    ffio_wfourcc(pb, "roll"); /* grouping type */
+    avio_wb32(pb, 2); /* table entry length */
+    avio_wb32(pb, 1); /* table entry count */
+
+    /* table data, roll distance
+     * i.e. number of audio frames to pre-roll after a seek */
+    avio_wb16(pb, roll);
+
+    return update_size(pb, pos);
+}
+
+static int mov_write_sbgp_tag(AVIOContext *pb, MOVTrack *track)
+{
+    int count = 0;
+    int i;
+    int64_t pos;
+
+    for (i = 0; i < track->entry; i++)
+    {
+        count += track->cluster[i].entries;
+    }
The starting brace should be on the same line as for, or omitted
Oops, I know this, but habits are hard to break.

+
+    pos = avio_tell(pb);
+    avio_wb32(pb, 0); /* size */
+
+    ffio_wfourcc(pb, "sbgp"); /* atom name */
+    avio_wb32(pb, 0); /* version & flags */
+    ffio_wfourcc(pb, "roll"); /* grouping type */
+    avio_wb32(pb, 1); /* table entry count */
+
+    /* table data */
+    avio_wb32(pb, count);
+    /* sgpd table index, index values are 1 based
+     * we write 'roll' sample group at index 1 */
+    avio_wb32(pb, 1);
+
+    return update_size(pb, pos);
+}
+
/* Sample size atom */
static int mov_write_stsz_tag(AVIOContext *pb, MOVTrack *track)
{
@@ -1260,6 +1308,7 @@ static int mov_write_dref_tag(AVIOContext *pb)

static int mov_write_stbl_tag(AVFormatContext *s, AVIOContext *pb, MOVTrack 
*track)
{
+    MOVMuxContext *mov = s->priv_data;
    int64_t pos = avio_tell(pb);
    avio_wb32(pb, 0); /* size */
    ffio_wfourcc(pb, "stbl");
@@ -1277,6 +1326,20 @@ static int mov_write_stbl_tag(AVFormatContext *s, 
AVIOContext *pb, MOVTrack *tra
    mov_write_stsc_tag(pb, track);
    mov_write_stsz_tag(pb, track);
    mov_write_stco_tag(pb, track);
+
+    if (track->par->codec_id == AV_CODEC_ID_AAC && mov->use_editlist &&
+        track->start_dts != AV_NOPTS_VALUE &&
+        (track->start_dts < 0 || track->par->initial_padding > 0)) {
+        /* Add sgpd and sbgp tags for AAC tracks.
+         * Apple documentation says they use this as a flag to indicate
+         * that AAC encoder delay is explicitely set in the edit list.
+         * If this is omitted, Apple tools will apply an implicit
+         * rule to remove encoder delay samples, which results in
+         * removal of 2 * delay and A/V desync. */
+        mov_write_sgpd_tag(pb, -1);
The magic value -1 could probably be described here. From the code in
mov_write_sgpd_tag, it seems like it would be a normal actual value, but
this seems like a special case that isn't explained elsewhere.

Otherwise this seems ok.


The parameter "roll" is documented where it is used in
mov_write_sgpd_tag.  But I can explain it here as well.
Yes, there I see that it says "number of audio frames to pre-roll after a
seek". But what does -1 frames mean?


Ah, ok.  Would it be clearer like this? "roll is an offset in frames that 
should be applied when seeking in order to
pre-roll data to the decoder"
I see - then it makes more sense. I guess this means in practice it won't
ever be positive?



ISO 14496-12 says, "A positive value indicates the number of samples after the
sample that is a group member that must be decoded such that at the last of 
these recovery is
complete, i.e. the last sample is correct. A negative value indicates the 
number of samples before the
sample that is a group member that must be decoded in order for recovery to be 
complete at the
marked sample."

I should probably just put the above description in the comment since it's 
quite clear.

Apple documentation only talks about the negative variation. And since the 
objective of this patch is to fix a sync
problem with Apple tools, I stuck to what the apple documentation says. I could 
test it with a positive value to see
what the apple tools do in this case if you have a preference for writing this 
with a positive value.

No, I don't have any preference for that - I just found it confusing. The description above makes it clear though.

// Martin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to