Hello,

what I am trying to achieve is the following: I have multiple V4L sources with possibly different time bases (different internal start times, different fps) and I would like to capture them live and in sync to separate files. In my particular case those devices are two HDMI grabbers at 60 FPS and two webcams capturing at ~30 fps.

Ideally, I would also like to preview the sources at the same time, e.g. in an SDL or OpenGL window that shows the inputs combined via overlay or {v,h}stack.

If necessary, frames should be dropped or duplicated in order to stay in real time even when capturing over a long period (say, 2-5 hours). The resulting videos should be constant frame rate (30 and 60 FPS, respectively, or all 60 FPS if that should be necessary).
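To illustrate the drop/duplicate part: as far as I understand, an "fps" filter at the head of each chain forces a constant rate by duplicating or dropping frames as needed, e.g. (a sketch, with sizes elided as in the example below):

    [0:v] fps=60, format=abgr, vflip, split [hdmi0a][hdmi0b];
    [2:v] fps=30, format=abgr, split [cam0a][cam0b];

That alone, however, does not align the streams with each other.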

The main problem I have is that the different video streams are not in sync right from the beginning, and I cannot find a way to synchronize them.

As I will explain, I think the underlying problem (or solution) is quite simple; however, to give you an idea of what a minimal, naive approach could look like, consider the following example:


ffmpeg -y \
    -video_size .. -input_format .. -framerate 60 -i /dev/video0 \
    -video_size .. -input_format .. -framerate 60 -i /dev/video1  \
    -video_size .. -input_format .. -framerate 30 -i /dev/video2 \
    -video_size .. -input_format .. -framerate 30 -i /dev/video3 \
    -filter_complex "
        [0:v] format=abgr, vflip, split [hdmi0a][hdmi0b];
        [1:v] format=abgr, vflip, split [hdmi1a][hdmi1b];
        [2:v] format=abgr, split [cam0a][cam0b];
        [3:v] format=abgr, split [cam1a][cam1b];
        [hdmi0a] scale=.. [tmp0];
        [hdmi1a] scale=.. [tmp1];
        [tmp0][tmp1] hstack [hdmistack];
        [cam0a] scale=.. [tmp2];
        [cam1a] scale=.. [tmp3];
        [tmp2][tmp3] hstack [camstack];
        [hdmistack][camstack] vstack [preview]
    " \
    -map "[preview]" -f opengl - \
    -map "[hdmi0b]" -c:v h264_nvenc -qp 23 /tmp/hdmi0.mkv \
    -map "[hdmi1b]" -c:v h264_nvenc -qp 23 /tmp/hdmi1.mkv \
    -map "[cam0b]" -c:v h264 -qp 23 /tmp/cam0.mkv \
    -map "[cam1b]" -c:v h264 -qp 23 /tmp/cam1.mkv


When the V4L devices are initialized, the start timestamps of the input streams all differ slightly (I suppose because the devices are initialized one after another), and some start times may even be zero, presumably due to a bug in the Magewell HDMI capture boxes:


ffmpeg version 3.4.1 Copyright (c) 2000-2017 the FFmpeg developers
  built with gcc 7.2.1 (GCC) 20171224
  configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-avisynth --enable-avresample --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libass --enable-libbluray --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-shared --enable-version3 --enable-opengl --enable-opencl
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
[video4linux2,v4l2 @ 0x5606648ea320] Dequeued v4l2 buffer contains corrupted data (0 bytes).
Input #0, video4linux2,v4l2, from '/dev/video0':
  Duration: N/A, start: 0.000000, bitrate: 2985984 kb/s
    Stream #0:0: Video: rawvideo (BGR[24] / 0x18524742), bgr24, 1920x1080, 2985984 kb/s, 60 fps, 60 tbr, 1000k tbn, 1000k tbc
Input #1, video4linux2,v4l2, from '/dev/video1':
  Duration: N/A, start: 122510.232758, bitrate: 2985984 kb/s
    Stream #1:0: Video: rawvideo (BGR[24] / 0x18524742), bgr24, 1920x1080, 2985984 kb/s, 60 fps, 60 tbr, 1000k tbn, 1000k tbc
Input #2, video4linux2,v4l2, from '/dev/video2':
  Duration: N/A, start: 122510.619448, bitrate: N/A
    Stream #2:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown), 640x360, 30 fps, 30 tbr, 1000k tbn, 1000k tbc
Input #3, video4linux2,v4l2, from '/dev/video3':
  Duration: N/A, start: 122510.997742, bitrate: N/A
    Stream #3:0: Video: mjpeg, yuvj422p(pc, bt470bg/unknown/unknown), 640x360, 30 fps, 30 tbr, 1000k tbn, 1000k tbc
Stream mapping:
  Stream #0:0 (rawvideo) -> format
  Stream #1:0 (rawvideo) -> format
  Stream #2:0 (mjpeg) -> format
  Stream #3:0 (mjpeg) -> format
  vstack -> Stream #0:0 (rawvideo)
  split:output1 -> Stream #1:0 (h264_nvenc)
  split:output1 -> Stream #2:0 (h264_nvenc)
  split:output1 -> Stream #3:0 (libx264)
  split:output1 -> Stream #4:0 (libx264)


The result is that I do get a window showing a 2x2 grid of the inputs, and the files are being written out to disk. However, the inputs are out of sync by 0.5-1 seconds. I have tried all kinds of combinations of input and output frame rate settings (-r), "vsync" settings, the "fps" filter and the "setpts" filter, but I have never achieved what I was looking for.
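For example, one variation I tried was rebasing every input to a zero start with "setpts" before everything else (a sketch):

    [0:v] setpts=PTS-STARTPTS, format=abgr, vflip, split [hdmi0a][hdmi0b];
    [1:v] setpts=PTS-STARTPTS, format=abgr, vflip, split [hdmi1a][hdmi1b];
    [2:v] setpts=PTS-STARTPTS, format=abgr, split [cam0a][cam0b];
    [3:v] setpts=PTS-STARTPTS, format=abgr, split [cam1a][cam1b];

This makes the first frame of every stream start at timestamp zero, but since the devices do not begin delivering frames at the same real-world moment, the visible offset between the streams remains.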

I sometimes use OBS Studio to record from multiple sources by placing the sources next to each other on a large canvas, recording the resulting video at 60 FPS and later splitting the parts of the frame back out into separate videos for editing. What OBS seemingly does is simply pull the latest frame available from each input device, render the frames together into the canvas, and write the resulting frame out to disk. This kind of synchronization is what I would like to achieve with ffmpeg, recording to separate files directly.

Now to the core of my question: Is there any way to fetch the latest frame from all input devices at a fixed interval (60 fps), dropping any buffered frames as necessary so that only the most recent frame from each device is kept and written out? I would like to ignore all timestamps that the devices give me, fetch data as quickly as possible, and write it out at a fixed rate starting from a common timestamp zero.
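The closest mechanism I have found so far is stamping every input with the wall clock, so that all streams at least share one clock, and then resampling each stream (a sketch; "-use_wallclock_as_timestamps" is a per-input demuxer option and must precede each -i, and the v4l2 demuxer additionally has "-ts abs" / "-ts mono2abs"):

    ffmpeg -y \
        -use_wallclock_as_timestamps 1 -video_size .. -input_format .. -framerate 60 -i /dev/video0 \
        -use_wallclock_as_timestamps 1 -video_size .. -input_format .. -framerate 30 -i /dev/video2 \
        -filter_complex "[0:v] fps=60 [v0]; [1:v] fps=30 [v1]" \
        ...

But even then I am not sure this gives the "always take the latest frame" behavior I describe above.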

I thought the "fps" video filter would do something like that, but no matter what I tried, ffmpeg always seems to do some magic to synchronize things in ways that do not correspond to real time. I would imagine a hypothetical "live sync" filter that takes n inputs, fetches the latest frame available from each device and sends the result (roughly corresponding to the complete rendered image of OBS, just as separate streams) out to disk, or rather to the encoders. Or maybe there is a solution based on separate (ffmpeg?) processes which provide real-time buffers to an aggregating instance of ffmpeg, but I haven't been successful with such an approach yet either; a rough sketch of what I mean follows below.
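The multi-process variant I experimented with looks roughly like the following (a sketch; the ports and the low-latency x264 settings are just placeholders I chose). One capture process per device, stamped with the wall clock:

    ffmpeg -f v4l2 -use_wallclock_as_timestamps 1 \
        -video_size .. -input_format .. -framerate 60 -i /dev/video0 \
        -c:v libx264 -preset ultrafast -tune zerolatency \
        -f mpegts "udp://127.0.0.1:10000?pkt_size=1316"

and one aggregating process that reads all of them:

    ffmpeg -fflags nobuffer \
        -i udp://127.0.0.1:10000 -i udp://127.0.0.1:10001 \
        -filter_complex "[0:v][1:v] hstack [preview]" \
        -map "[preview]" -f opengl -

but so far this has not given me properly synchronized streams either.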

I would be grateful for any advice.

Best regards
Ochi