I develop a Python application used to conduct official timing for speedrunning leaderboards based on automated video analysis. And I've caught a little oversight of my own that leads me to wonder if there isn't an oversight at the core of ffmpeg in the 'fps' reported in the output stream.
This relates to Variable Frame Rate video, so please try to hold your shouts of "*JUST CONVERT IT TO CFR*" until after.

---

So, I'm opening a rawvideo pipe to ffmpeg in Python (using ffmpeg-python 0.2.0 by Karl Kroening as a command-line wrapper) to receive a bitstream of the frames in a video for analysis. It's blazing fast, btw! Used to do this with OpenCV and it was a slog. (Not to mention unable to transit the "read-head" to the ACTUAL frame or time requested without just landing on a near-ish keyframe instead.)

Before accessing the file, I call to ffprobe to get the 'r_frame_rate,' and use those values to identify the frames per second of the footage. And apparently, what 'r_frame_rate' returns is the "Output" stream fps. Which is NOT the "Input" frames per second on VFR video.

Have a look at the Input data in my console for this VFR footage, specifically the fps under Stream#0:0:

> *Input #0*, mov,mp4,m4a,3gp,3g2,mj2, from 'I:/Downloads/Medieval 111
> IGT.mp4':
> Metadata:
> major_brand : isom
> minor_version : 512
> compatible_brands: isomiso2avc1mp41
> encoder : Lavf58.29.100
> Duration: 00:01:23.94, start: 0.000000, bitrate: 2270 kb/s
> *Stream #0:0*(und): Video: h264 (Main) (avc1 / 0x31637661),
> yuv420p(tv, bt709), 1280x720 [SAR 1:1 DAR 16:9], 2137 kb/s, *30.01 fps*,
> 30 tbr, 90k tbn, 60 tbc (default)

So the input is 30.01 fps. Cool. Now look at the Output stream reported.

> *Output #0*, rawvideo, to 'pipe:':
> Metadata:
> major_brand : isom
> minor_version : 512
> compatible_brands: isomiso2avc1mp41
> encoder : Lavf58.45.100
> *Stream #0:0*: Video: rawvideo (BGR[24] / 0x18524742), bgr24, 446x344
> [SAR 1:1 DAR 223:172], q=2-31, 110465 kb/s, *30 fps*, 30 tbn, 30 tbc
> (default)

It's just straight-up 30fps. Okay! Great. So, ffmpeg is automatically converting this VFR footage to CFR and handing it back to me at 30.00fps.

...is what I had erroneously assumed.
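For reference, the ffprobe lookup described above might look roughly like the sketch below. It is an assumption-laden illustration, not the code from the application: it calls ffprobe through `subprocess` rather than through ffmpeg-python, the helper names `parse_rate` and `probe_rates` are hypothetical, and it also requests `avg_frame_rate`, a second rate field ffprobe reports per stream, purely so the two values can be compared side by side.

```python
import json
import subprocess
from fractions import Fraction


def parse_rate(text):
    """Turn an ffprobe rational such as '30000/1001' into a Fraction.

    Returns None when the denominator is zero, since ffprobe prints
    '0/0' when it has no value for a rate.
    """
    num, _, den = text.partition("/")
    den = int(den) if den else 1
    if den == 0:
        return None
    return Fraction(int(num), den)


def probe_rates(path):
    """Ask ffprobe for both rate fields of the first video stream."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=r_frame_rate,avg_frame_rate",
        "-of", "json",
        path,
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    stream = json.loads(out)["streams"][0]
    # Keep the exact rationals; convert to float only when displaying.
    return {key: parse_rate(stream[key]) for key in ("r_frame_rate", "avg_frame_rate")}
```

Calling `probe_rates()` on a file returns both rates as exact fractions. Whether `avg_frame_rate` lines up with the fps figure printed for the input stream of a particular file is something to check rather than assume.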
(lot of "buts" coming)

But my application that times events detected in the frames of the footage counted 2301 frames between two events. And then told me that 2301 / 30fps is 1m 16s 700ms. Which is correct! AT 30FPS ANYWAY.

But when the footage is converted to CFR, or simply mathematically measured at 30.0fps there are NOT 2301 frames between the two events. There are 2300 frames between those events. Because if you account for the .01 of 30.01fps, a frame must be dropped. 2301 frames / 30.01fps = 1m 16s 674ms. And at a flat integer of 30fps, it's not 2301, but 2300 frames / 30fps that gets you a 667ms time. As a matter of approximation, a frame must be discarded to be as accurate as you can while squeezing the footage.

But all this is just backstory, because I now realize that ffmpeg is NOT altering the footage of VFR in any way, which makes the MOST sense. It's just handing me back a bitstream of every frame in the video. Perfectly sensible!

But while it's handing me back every frame of the 30.01fps media in the output stream, it's also identifying that output as a flat 30fps. Which it IS NOT. As ffmpeg clearly reports in the input stream's console log. And yet more confusing, it seems that ffprobe's 'r_frame_rate' is returning the output frame rate of 30 / 1, instead of whatever values it used to come up with the 30.01 it declares for the input at runtime.

To deepen the quagmire, I have pushed VFR video through this application before that averaged 59.96fps. And ffprobe / ffmpeg have reported back to me a frame rate of 59.94fps instead -- snapping into the common NTSC frame base of US-broadcast 60fps TV.

So I think, for the first time, I understand the information ffmpeg is giving me. That's good! But I find myself wondering why this would possibly be the intended behavior?
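The frame-count arithmetic above can be reproduced with a few lines of Python. This is only a sketch: `format_elapsed` is a hypothetical helper, and it treats the footage as if every frame lasted exactly 1/fps seconds, which is the same simplification the calculation in the post makes.

```python
from fractions import Fraction


def format_elapsed(frames, fps):
    """Format frames / fps as 'Xm Ys Zms', rounded to the nearest millisecond.

    fps is passed as a string (e.g. "30.01") so the division is exact.
    """
    total_ms = round(Fraction(frames) / Fraction(fps) * 1000)
    minutes, rem = divmod(total_ms, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{minutes}m {seconds}s {ms}ms"


# The three divisions discussed in the post.
for frames, fps in [(2301, "30"), (2301, "30.01"), (2300, "30")]:
    print(f"{frames} frames @ {fps} fps -> {format_elapsed(frames, fps)}")
```

The gap between the first two results is about 26 ms, a little under one frame at 30fps, which is why dropping a single frame brings the 30fps figure closest to the 30.01fps one.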
The "output" frame rate and the response from ffprobe is automatically snapped to the nearest of an internally stored list of industry-standard framebases and reported as the fps when it is definitely not. There would be no need to "snap" to these values if they were correct. And because the output tends to report round integer values of decisive framerates, it led me to the conclusion that ffmpeg was doing some automatic magic on VFR to present it back to me as CFR. Which it is not doing.

AND AND, because the frame rate ffprobe returns is this willfully incorrect industry-standard framebase data, any calculations done USING that value will be decidedly more wrong than they would be when using the VFR frame rate echoed in the INPUT stream.

So my question is: Why? What is the virtue or benefit of this mis-reporting behavior of the output-frame rate?

If it's a convenience feature to report the nearest industry standard frame rate without another developer having to maintain their own list, that's a good idea! But I can't see why it would be reported as the output frames per second on console, when the output stream's frames are being delivered at a different frame rate.

The console representation led me, as a developer, to draw the most obvious conclusion from an application I trust: That if ffmpeg is telling me this output stream is 30fps, the output frames of this stream must be at 30fps. I assumed there was some automatic transcoding applied to VFR, or internal frame-drop / add happening to conform the input to the figure presented on the console. And I scratched my head for a long time wondering why I never got a float value back from 'r_frame_rate' when handing it VFR.

So, I'm asking for confirmation that I'm understanding this right. And it's not a rhetorical question when I ask why ffmpeg intentionally mis-reports the best known frame rate figure in the output stream? Is there a reason?
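To make the "snapping" idea concrete, here is a toy illustration of picking the nearest entry from a list of common frame rates. It is only a sketch of the behavior as described in this post: the list `STANDARD_RATES` and the nearest-match rule in `snap_to_standard` are assumptions made for illustration, not ffmpeg's actual table or algorithm.

```python
from fractions import Fraction

# Hypothetical list of common rates; NOT the list ffmpeg uses internally.
STANDARD_RATES = [
    Fraction(24000, 1001), Fraction(24), Fraction(25),
    Fraction(30000, 1001), Fraction(30), Fraction(50),
    Fraction(60000, 1001), Fraction(60),
]


def snap_to_standard(measured_fps):
    """Return the entry of STANDARD_RATES closest to measured_fps."""
    # Going through str() keeps decimal inputs like 30.01 exact.
    measured = Fraction(str(measured_fps))
    return min(STANDARD_RATES, key=lambda rate: abs(rate - measured))
```

With this toy rule, 30.01 lands on 30/1 and 59.96 lands on 60000/1001 (about 59.94), which matches the two observations reported above.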
Does it make more sense when you aren't pulling a raw bytestream or something?

Also, I presume I'll find in the documentation a value other than 'r_frame_rate' where I can poll the actual frame rate - the INPUT frame rate - instead of this snapped and conformed one. Feel free to save me the search if you know, though!

Thanks in advance for replies.

-Roninpawn
