On Mar 31, 2013, at 1:25 AM, Alex Cohn <[email protected]> wrote:
> I am not sure when "duration" is taken into account, but you could
> simply set current->pts = prev->pts+2. Note that this was my original
> proposal.

Alex -- I considered that approach, and essentially ran that same test -- manually 
manipulating the pts values, that is. But the problem with this is twofold: 

- A "+2" pts increase likely will only work when you have an actual frame rate 
that is half that of the expected frame rate which is used to initially set the 
time_base. That's not my use case (the closest approximation is 24 fps 
expected, and on this particular computer / camera I'm testing with, 15fps. And 
I think in theory, the only video source that can feed an encoder with 
fixed-fps video is one that is fully known and controllable prior to encoding 
(like an existing video file, or just generating data like the examples do). 
More about this in a moment. 

- If I'm going to muck with the video pts in this fashion, the audio pts also 
has to be mucked with to keep it in sync, and of course audio samples arrive at 
a different rate than the video frames. So that's an added complexity (a rough 
sketch of stamping pts from the capture timestamps, rather than incrementing by 
a constant, follows below). 
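
For what it's worth, here is roughly what I mean by stamping from capture 
timestamps -- a minimal sketch only, assuming the capture API hands back a 
value / timescale pair; the names are placeholders rather than anything from 
my actual code:

    #include <libavutil/mathematics.h>  /* av_rescale_q() */
    #include <libavutil/rational.h>

    /* Sketch: derive each frame's pts from its capture timestamp instead of
     * doing pts = prev_pts + 2.  capture_value / capture_scale and first_value
     * are placeholders for whatever the capture API reports per frame. */
    static int64_t pts_from_capture(int64_t capture_value, int capture_scale,
                                    int64_t first_value,
                                    AVRational codec_time_base)
    {
        AVRational capture_tb = { 1, capture_scale };
        int64_t elapsed = capture_value - first_value; /* time since first frame */
        return av_rescale_q(elapsed, capture_tb, codec_time_base);
    }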

The bigger picture is the matter of having encoding driven by a frame-rate-based 
time_base. As I alluded to above, unless you are generating audio/video at 
runtime, or reading from a file where the video obviously already exists and all 
the metadata is available up front, I would think that any live-capture video 
source is inherently variable in frame rate, due to almost certain variances in 
hardware, the computer, and so on. 

That said, it raises the question of expressing the time base in terms of frame 
rate rather than in terms of time. One thing I encountered with some frequency in 
my Google journeys was blog posts and examples that set time_base.den to 1000 -- 
not frames per second, but milliseconds per second. That seems to make more sense 
to me, but there must be a reason that time_base was pinned to frame rate (if 
anyone knows, I'd be interested to hear). 
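
To make the 1/1000 idea concrete, something like the following is what those 
posts seem to be describing -- again just a sketch, and I realize some encoders 
may not accept an arbitrarily fine time base (MPEG-4 part 2, I believe, limits 
the timebase resolution), so I'm not claiming this is universally valid:

    #include <libavcodec/avcodec.h>

    /* Sketch of the "milliseconds" time_base mentioned above: one tick = 1 ms,
     * so a variable-rate source can stamp frames with elapsed wall-clock time.
     * 'enc' is an already-allocated AVCodecContext for the video encoder. */
    static void use_millisecond_time_base(AVCodecContext *enc)
    {
        enc->time_base = (AVRational){ 1, 1000 };
        /* each frame's pts would then be the elapsed milliseconds since the
         * first frame, e.g.
         * frame->pts = av_rescale_q(elapsed, capture_tb, enc->time_base); */
    }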

I'm not sure if this was the intended design or not, but it seems peculiar to 
me that while I have sample buffers that provide exact presentation time (with 
scale), decode time (with scale), and duration (with scale) for every single 
frame, this information does not appear to be enough on its own to encode the 
frames with proper timing, simply because the codec context needs a fixed frame 
rate configured up front before encoding begins. Somehow that just seems wrong, 
and I feel like I must be missing something simple -- having all of this 
information should be enough for encoding. 

I'll ask this one again, because I would think it would be the way to iron out 
the discrepancy: what is the net effect of duration on timing? If I set an 
accurate pts and dts (which I am -- I have the exact info for these), why does 
setting the exact duration of the frame not account for the other missing piece 
of the puzzle? Essentially that info amounts to sequence, start time, and length -- 
that would seem complete. Any thoughts?
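
In case it helps frame the question, here is how I understand duration to be 
consumed at the muxing stage -- again only a sketch, following the general 
pattern of the muxing example, with 'enc' and 'st' standing in for my codec 
context and output stream:

    #include <libavformat/avformat.h>
    #include <libavutil/mathematics.h>

    /* Sketch: after the encoder fills 'pkt' (timestamps in the codec
     * time_base), rescale pts/dts/duration into the stream time_base before
     * handing the packet to the muxer -- the duration here is the per-frame
     * length the muxer actually sees. */
    static int write_video_packet(AVFormatContext *oc, AVStream *st,
                                  AVCodecContext *enc, AVPacket *pkt)
    {
        pkt->pts      = av_rescale_q(pkt->pts,      enc->time_base, st->time_base);
        pkt->dts      = av_rescale_q(pkt->dts,      enc->time_base, st->time_base);
        pkt->duration = av_rescale_q(pkt->duration, enc->time_base, st->time_base);
        pkt->stream_index = st->index;
        return av_interleaved_write_frame(oc, pkt);
    }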

Brad



_______________________________________________
Libav-user mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/libav-user
