On 03/24/2011 05:41 PM, John Stebbins wrote:
On 03/23/2011 11:56 PM, John Stebbins wrote:
On 03/23/2011 11:32 PM, John Stebbins wrote:


On 03/23/2011 10:15 PM, Alexander Strange wrote:

On Mar 23, 2011, at 10:10 PM, John Stebbins wrote:

On 03/23/2011 06:25 PM, Alexander Strange wrote:

On Mar 23, 2011, at 9:09 PM, John Stebbins wrote:



On 03/23/2011 05:42 PM, Alexander Strange wrote:
On Mar 23, 2011, at 6:00 AM, Luca Barbato wrote:

On 03/23/2011 08:20 AM, Alexander Strange wrote:
On Tue, Mar 22, 2011 at 10:40 PM, Luca Barbato <[email protected]> wrote:
On 03/23/2011 03:31 AM, John Stebbins wrote:
fwiw, I've verified that both patches solve the original
problem I had.
I'd pick yours since I expect to have 0-duration frames sooner
or later.
Expect to have them where?
The current specification states that Duration should be > 0 (as
already discussed in the thread), so both patches are safe and
correct now.
Setting the default in a more explicit way feels more futureproof.

Adding a comment on the Aurel patch would make me happy as well.

lu
I forgot something I should've said first, which is that the Duration
field is useless and can be ignored. Except on the last frame it's
the same as pts (timecodes), which is more reliable due to fewer
muxer bugs.
Perian's demuxer, the only one I'm familiar with, ignores the
field. Duplicate timecodes in video are handled the way Aurel's
patch does, which fixes several files I have:

http://trac.perian.org/changeset/1364/trunk/MatroskaImportPrivate.cpp


So that patch should be fine.

Maybe I'm misunderstanding your point. But the duration field is
definitely meant to be used when a SimpleBlock is laced. In this
case the block contains multiple frames and only one timestamp. The
timestamp of each subsequent laced frame after the first is
calculated using the duration.
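For reference, a minimal sketch (not the actual libavformat code) of how per-frame timestamps for a laced SimpleBlock can be derived from the block's timestamp and duration; the even split of the duration across the laced frames is an assumption based on the description above:

```c
#include <stdint.h>

/* Hypothetical helper: derive the timestamp of the index-th frame in a
 * laced block from the block timestamp and its total duration, assuming
 * the duration is divided evenly among the laced frames. */
int64_t laced_frame_ts(int64_t block_ts, int64_t block_duration,
                       int frame_count, int index)
{
    /* The first frame (index 0) carries the block timestamp itself. */
    return block_ts + (block_duration * index) / frame_count;
}
```

So for a block at timestamp 1000 with duration 80 containing 8 laces, frame 3 would land at 1030.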

Yes, Perian ignores the Duration for that and figures it out from
the next largest present timecode. Unfortunately I don't remember
what files caused this to happen, so it's something to look at
another time.


Are you saying that duration is too unreliable to be useful and
libav should ignore it altogether? If so, a completely different
patch would be necessary, and some kind of fix for
libavformat/utils.c:compute_frame_duration is needed, because this
function is coming up with wildly inaccurate values in some cases.
The generated timestamps for the DTS-HD MA stream I was looking at
were so bad that they would drift far enough during a laced
sequence that the timestamp for the next block would appear to go
backward.

Sorry, I don't have much experience with audio here, but it's
actually a different problem than video tracks. Pretty much all the
durations in DTS-HD are expected to be 0, because the size of one
audio frame is less than the difference between two MKV timecodes. I
would suggest ignoring container times entirely, setting the
timebase to the audio sample rate, and incrementing by the frame
size of the audio.
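That suggestion can be sketched roughly as follows, on the application side (the struct and function names here are hypothetical, not libav API): keep a running sample counter in a 1/sample_rate timebase and ignore the container timestamps for audio entirely.

```c
#include <stdint.h>

/* Hypothetical per-stream state for sample-accurate audio timestamps,
 * with the timebase set to 1/sample_rate. */
typedef struct AudioClock {
    int64_t samples; /* samples consumed so far == pts of the next packet */
} AudioClock;

/* Return the pts of the next audio packet and advance the clock by the
 * packet's decoded frame size (in samples), ignoring whatever the
 * container claims. */
int64_t audio_clock_next_pts(AudioClock *c, int frame_size)
{
    int64_t pts = c->samples;
    c->samples += frame_size;
    return pts;
}
```

With 480-sample frames at 48 kHz this yields exact 10 ms steps (0, 480, 960, ...) regardless of how the muxer rounded its timecodes.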

If you increment even by 1 in the rounded-off MKV timebase you might
actually end up with packet durations much longer than the actual
number of samples in the packet. Could you check that's not
happening with ffprobe -show_packets or so? Any patch that looks OK
there seems fine.

Actually, the durations of the audio frames in my sample are around
10ms which is well above the round-off threshold.

Who knew there were so many problems… anyway, here's a sample with
that DTS-HD thing:

http://astrange.ithinksw.net/ffmpeg/TrueHD_Sample_2.mkv

You mean TrueHD, not DTS-HD. But thanks for the sample. It doesn't cause
any issues for my app. We keep track of the next expected timestamp, and
as long as the difference between it and the timestamp that is delivered
to us doesn't exceed some threshold, we use our computed timestamp. When
the threshold is exceeded, we assume a discontinuity occurred and make
the adjustments necessary to maintain A/V sync.
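In outline, that scheme looks something like the sketch below (not our actual code; values are in milliseconds, and the 70 ms threshold is the one mentioned later in this thread):

```c
#include <stdint.h>
#include <stdlib.h>

/* Threshold-based discontinuity detection: prefer our own computed
 * timestamp unless the delivered one drifts too far away from it. */
#define SYNC_THRESHOLD_MS 70

int64_t choose_pts(int64_t expected_pts, int64_t delivered_pts,
                   int *discontinuity)
{
    if (llabs(delivered_pts - expected_pts) <= SYNC_THRESHOLD_MS) {
        *discontinuity = 0;
        return expected_pts;  /* small slop: trust our computed timestamp */
    }
    *discontinuity = 1;       /* big jump: resync to the delivered one */
    return delivered_pts;
}
```

The failure mode described below is exactly this logic firing on drifted guesses: the delivered timestamps creep past the threshold, we insert silence, and then the next reliable timestamp forces frames to be dropped again.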


But that's not really important here. Our code allows for quite a lot
of slop in audio timestamps. And we deal fine with timestamps that
are AV_NOPTS_VALUE. The issue is that during this laced sequence of
audio, libav is delivering unreliable timestamps that are bad guesses
and the difference between where we think the next timestamp should
be and what libav delivers gets larger than the threshold we allow
(70ms). At this point, we decide that there must have been a
significant discontinuity in the audio, so we insert frames of
silence in order to keep A/V in sync. Then when the next block
begins, we get a reliable timestamp that jumps backward in time by
the same 70+ms that the unreliable timestamps drifted and we have to
drop several frames to put A/V back in sync again.

If we are going to assume that the duration supplied in the mkv file
can't be used because too many muxers set it incorrectly (and I do
believe you when you say this is the case), then libav needs a way to
signal to the app when a timestamp is reliable and when it is just a
wild guess. Then the app can loosen its constraints for keeping A/V
sync when it sees that the timestamp is just a guess. I'm tempted to
say that libav shouldn't be setting pts or dts at all in such cases
and should just pass AV_NOPTS_VALUE. If you want to provide a convenience
to the app, provide a public API for the app to obtain libav's best
guess (or add a pts_guess member to AVFrame to store the guess in).

Luckily I think exactness is achievable here.

In the meantime timestamps that go up are better than ones that go
down, I guess.

What do you mean by "exactness is achievable"? Do you have some idea how
to correct the duration that libavformat is calculating and using to guess
the next pts? I haven't looked closely enough yet at
compute_frame_duration to see if there is some error there or if it is
just making necessary guesses because there isn't enough information
available to it.


If it's any help, here's the sample that triggers the problem in our app
http://www.stebbins.biz/images/dts-hd_sync_problem.mkv

Another of our users has come up with a variation of this
problem. In this case, DefaultDuration is not set in the Track
header of the mkv. The audio is dts-hd, and laced SimpleBlocks are
being used. Since DefaultDuration is not set, libavformat ends up
falling back on compute_frame_duration to guess timestamps, and
the guesses are very bad, as before. Here's a sample of the timestamps we
get from av_read_frame with this stream. As you can see, they jump by
~35 at each step (should be ~10) until the laces are complete and
the next block begins. Then the timestamp jumps backward by more than 100.
Sample: http://www.stebbins.biz/images/garbled_audio_dts-hd.mkv
dts 1 pts 1
dts 1 pts 1
dts 37 pts 37
dts 73 pts 73
dts 108 pts 108
dts 143 pts 143
dts 178 pts 178
dts 213 pts 213
dts 87 pts 87
dts 121 pts 121
dts 155 pts 155
dts 189 pts 189
dts 224 pts 224
dts 259 pts 259
dts 294 pts 294
dts 330 pts 330
dts 172 pts 172
dts 208 pts 208
dts 244 pts 244
dts 279 pts 279
dts 314 pts 314
dts 349 pts 349
dts 383 pts 383
dts 417 pts 417
dts 204 pts 204


This may be obvious to some following this thread, but I think I finally noodled out why dts-hd causes such a problem. When libavformat has to fall back on using compute_frame_duration, it ends up using the following to calculate the frame size (in samples):
  frame_size = ((int64_t)size * 8 * enc->sample_rate) / enc->bit_rate;

Normally this would be a fair approximation, because enc->bit_rate (which is the transmission rate) is usually going to be pretty close to the rate at which the stream is intended to be received. And if there isn't a lot of overhead in "size" (which is the size of the pkt in bytes), then the calculation works reasonably well. But for dts-hd, bit_rate is for the dts core only, while size is for the whole packet including the hd parts. So the calculation overestimates frame_size by a large margin (a factor of about 3).
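To make the overestimate concrete, here is that fallback calculation with illustrative numbers (the packet size and core bit rate below are hypothetical, chosen to mimic a DTS-HD MA stream with a 1536 kb/s core):

```c
#include <stdint.h>

/* The fallback estimate from compute_frame_duration: frame size in
 * samples, derived from the packet size and the declared bit rate. */
int64_t estimate_frame_size(int64_t size, int sample_rate, int64_t bit_rate)
{
    return (size * 8 * sample_rate) / bit_rate;
}
```

With a hypothetical 5760-byte DTS-HD packet at 48 kHz and a declared (core-only) bit rate of 1536000 b/s, the estimate is 5760 * 8 * 48000 / 1536000 = 1440 samples (30 ms), while the core portion alone (~1920 bytes) would give the correct 480 samples (10 ms). That factor-of-3 overestimate matches the ~35 ms steps in the log above where ~10 ms was expected.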
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
