On Mar 28, 2013, at 11:53 PM, Kalileo <[email protected]> wrote:
> Hi Brad,
> 
> when you start writing the packets (muxing them), you give each audio and 
> video packet a DTS (and PTS) value. You can start at zero. 
> 
> At the start you give the first audio and the first video packet the same 
> value.  For every new packet you have to increase the DTS value accordingly, 
> depending on the length of the  audio or video packet before. Audio and video 
> packets have different lengths, so you increase them using different step 
> values.  
> 
> For example, you can increase the DTS value for every video packet by 4000, 
> and for every audio packet by 2000 (you must correct these values depending 
> on your codecs).
> 
> If you use the correct step values, then at the end of your video, both audio 
> and video DTS values should be roughly the same again. If they are not, your 
> step value is wrong.
> 
> That's all there is to it. Works perfectly for me.
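
As a rough sketch of the stepping described above: the step for each stream
is just the packet duration expressed in time-base ticks. The numbers here
are assumptions for illustration only (a 90 kHz time base, 25 fps video, and
1024-sample audio frames at 48 kHz), and packet_step / dts_of_packet are
hypothetical helper names, not FFmpeg functions.

```c
#include <stdint.h>

/* Hypothetical helper: duration of one packet in time-base ticks.
 * units_per_packet / units_per_second is the packet length in seconds
 * (frames for video, samples for audio). */
static int64_t packet_step(int64_t ticks_per_second,
                           int64_t units_per_packet,
                           int64_t units_per_second)
{
    return ticks_per_second * units_per_packet / units_per_second;
}

/* DTS of the n-th packet (n counted from 0) when the stream starts at zero. */
static int64_t dts_of_packet(int64_t n, int64_t step)
{
    return n * step;
}
```

Under those assumptions, packet_step(90000, 1, 25) gives 3600 for video and
packet_step(90000, 1024, 48000) gives 1920 for audio. After about ten seconds
(250 video packets, 468 audio packets) the two DTS values are 900000 and
898560, which is the "roughly the same again" check.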

Kalileo -- hey thanks for taking the time to respond, it is good to hear from 
you again. I think you are probably right on target, but I have a few wrinkles 
to add which have caused me to scratch my head a bit. Check these few tidbits 
out: 

- Another poster mentioned earlier in this thread (if I understood his 
point accurately) that the timing of audio and video streams is handled 
completely independently. While we view these streams as a single rendered 
product, internally they are completely separate entities. There's an issue 
of semantics here, but I'm not sure whether that agrees with or contradicts 
what you say above about the relationship between audio and video pts / dts. 
As best I can determine from mailing list responses, the docs, and my own 
testing, the timestamps set for audio have no material effect on those set 
for video and vice versa, although the output would obviously show sync 
problems if the timings weren't right. This seems to be supported by the 
next two points. 

- Here's an interesting note: it doesn't appear that pts and dts are even 
relevant for audio. I don't know whether that is the case across the board or 
only in some specific circumstances, but I don't have to set either value, 
and the audio is perfect both when I also write video frames and when I 
completely turn off writing of all video frames. I've printed the audio pts 
value when not setting it, and it is complete junk, yet the audio is 
perfect. 

- If I completely turn off the writing of all audio frames, there is absolutely 
no change in video rendering -- it still renders video frames at twice the 
speed. This would seem to support the idea that a) pts might only be 
significant for video packets and not for audio, and b) there's no direct 
relationship between video and audio packet pts. 

So my next questions become the following: 

1. Is setting the audio pts and dts even relevant? I've seen no functional 
indication that it is. 

2. Do the playback codecs do anything directly (other than just rendering at 
the proper time) to relate audio timing to video timing? There's no 
comparison or sequencing being done between the values, is there? 

3. Setting pts and dts is always relative to the time_base configured on 
the codec context. According to the documentation, time_base.num should be 
1 and time_base.den should equal the expected frames per second. I have both 
set accordingly. However, I got to thinking: what if you expect 30 fps but 
receive only 15 fps at runtime? (I'm using round numbers for discussion 
here; I'm actually setting time_base.den to 24.) Will this have any material 
impact on rendering? I think this is where some of the FFmpeg code examples 
may be sidestepping an issue common to many actual use cases. They can 
virtually guarantee the frame rate and proper pts values by simply generating 
X frames and assigning them sequential pts values. But what happens when the 
frames come from an external source and aren't delivered at the expected 
frame rate? Is there some compensation that has to be done in code, or is 
the codec smart enough to render frames at the timings you stamp on them, 
regardless of whether the frame rate matches your time_base.den setting? 
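
To make question 3 concrete, here is a minimal sketch of the kind of
compensation I have in mind: stamping each frame from its arrival time
instead of from a frame counter. Rational and pts_from_elapsed_us are
hypothetical names for illustration (Rational is only a stand-in for
AVRational), and the sketch assumes the container and player honor per-frame
timestamps, which is exactly the part I'm unsure about. In real code
libavutil's av_rescale_q() would normally do this conversion.

```c
#include <stdint.h>

/* Hypothetical stand-in for AVRational: a time base of num/den seconds. */
typedef struct {
    int num;
    int den;
} Rational;

/* Convert microseconds elapsed since the first frame into time_base ticks,
 * rounding to the nearest tick so capture jitter does not truncate a
 * timestamp downward. Assumes elapsed_us >= 0 and time_base.num > 0. */
static int64_t pts_from_elapsed_us(int64_t elapsed_us, Rational time_base)
{
    int64_t us_per_tick_scaled = (int64_t)time_base.num * 1000000;
    return (elapsed_us * time_base.den + us_per_tick_scaled / 2)
           / us_per_tick_scaled;
}
```

With a time_base of 1/30 and frames actually arriving every 1/15 s (about
66667 microseconds apart), this yields pts values of 0, 2, 4, and so on
rather than 0, 1, 2, so each frame is stamped with two ticks of duration.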

Thanks,

Brad
_______________________________________________
Libav-user mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/libav-user