Package: release.debian.org
Severity: normal
User: release.debian....@packages.debian.org
Usertags: unblock
X-Debbugs-Cc: nag...@packages.debian.org
Control: affects -1 + src:nageru

Please unblock package nageru

Please consider unblocking Nageru 2.2.1-1. It contains a number
of focused fixes, mostly for crash bugs related to video stream
inputs (e.g. from video files, or from the network). As upstream,
I made 2.2.1 specifically as a set of targeted fixes over 2.2.0
to make this feature less broken for bookworm.

Nageru is a leaf package, so there should be fairly low risk.
I've both tested the fixes myself and had user reports verifying
them (in upstream git).

There are no Debian package changes (beyond the changelog),
only upstream changes. debdiff is attached.

[ Checklist ]
  [x] all changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in testing

unblock nageru/2.2.1-1
diff -Nru nageru-2.2.0/debian/changelog nageru-2.2.1/debian/changelog
--- nageru-2.2.0/debian/changelog       2022-11-23 09:26:33.000000000 +0100
+++ nageru-2.2.1/debian/changelog       2023-04-17 18:37:27.000000000 +0200
@@ -1,3 +1,10 @@
+nageru (2.2.1-1) unstable; urgency=medium
+
+  * New upstream release.
+    * Fixes several crash bugs related to video inputs. (Closes: #1034471)
+
+ -- Steinar H. Gunderson <se...@debian.org>  Mon, 17 Apr 2023 18:37:27 +0200
+
 nageru (2.2.0-2) unstable; urgency=medium
 
   * Remove ppc64el from futatabi's architecture list.
diff -Nru nageru-2.2.0/meson.build nageru-2.2.1/meson.build
--- nageru-2.2.0/meson.build    2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/meson.build    2023-04-17 18:35:47.000000000 +0200
@@ -1,4 +1,4 @@
-project('nageru', 'cpp', default_options: ['buildtype=debugoptimized'], version: '2.2.0')
+project('nageru', 'cpp', default_options: ['buildtype=debugoptimized'], version: '2.2.1')
 
 cxx = meson.get_compiler('cpp')
 qt5 = import('qt5')
diff -Nru nageru-2.2.0/nageru/cef_capture.cpp nageru-2.2.1/nageru/cef_capture.cpp
--- nageru-2.2.0/nageru/cef_capture.cpp 2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/nageru/cef_capture.cpp 2023-04-17 18:35:47.000000000 +0200
@@ -165,8 +165,6 @@
        });
 }
 
-#define FRAME_SIZE (8 << 20)  // 8 MB.
-
 void CEFCapture::configure_card()
 {
        if (video_frame_allocator == nullptr) {
diff -Nru nageru-2.2.0/nageru/decklink_capture.cpp nageru-2.2.1/nageru/decklink_capture.cpp
--- nageru-2.2.0/nageru/decklink_capture.cpp    2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/nageru/decklink_capture.cpp    2023-04-17 18:35:47.000000000 +0200
@@ -24,8 +24,6 @@
 #include "shared/memcpy_interleaved.h"
 #include "v210_converter.h"
 
-#define FRAME_SIZE (8 << 20)  // 8 MB.
-
 using namespace std;
 using namespace std::chrono;
 using namespace std::placeholders;
diff -Nru nageru-2.2.0/nageru/defs.h nageru-2.2.1/nageru/defs.h
--- nageru-2.2.0/nageru/defs.h  2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/nageru/defs.h  2023-04-17 18:35:47.000000000 +0200
@@ -3,11 +3,12 @@
 
 #include <libavformat/version.h>
 
-#define MAX_FPS 60
+#define TYPICAL_FPS 60
 #define FAKE_FPS 25  // Must be an integer.
 // #define MAX_VIDEO_CARDS 16  // defined in shared_defs.h.
 #define MAX_ALSA_CARDS 16
 #define MAX_BUSES 256  // Audio buses.
+#define FRAME_SIZE (8 << 20)  // 8 MB. (FIXME: Not enough for a 2160p frame!)
 
 // For deinterlacing. See also comments on InputState.
 #define FRAME_HISTORY_LENGTH 5
diff -Nru nageru-2.2.0/nageru/ffmpeg_capture.cpp nageru-2.2.1/nageru/ffmpeg_capture.cpp
--- nageru-2.2.0/nageru/ffmpeg_capture.cpp      2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/nageru/ffmpeg_capture.cpp      2023-04-17 18:35:47.000000000 +0200
@@ -26,6 +26,7 @@
 #include <cstdint>
 #include <utility>
 #include <vector>
+#include <unordered_set>
 
 #include <Eigen/Core>
 #include <Eigen/LU>
@@ -43,8 +44,6 @@
 #include <srt/srt.h>
 #endif
 
-#define FRAME_SIZE (8 << 20)  // 8 MB.
-
 using namespace std;
 using namespace std::chrono;
 using namespace bmusb;
@@ -454,24 +453,63 @@
        }
 }
 
-AVPixelFormat get_vaapi_hw_format(AVCodecContext *ctx, const AVPixelFormat *fmt)
+template<AVHWDeviceType type>
+AVPixelFormat get_hw_format(AVCodecContext *ctx, const AVPixelFormat *fmt)
 {
-       for (const AVPixelFormat *fmt_ptr = fmt; *fmt_ptr != -1; ++fmt_ptr) {
-               for (int i = 0;; ++i) {  // Termination condition inside loop.
-                       const AVCodecHWConfig *config = avcodec_get_hw_config(ctx->codec, i);
-                       if (config == nullptr) {  // End of list.
-                               fprintf(stderr, "Decoder %s does not support device.\n", ctx->codec->name);
-                               break;
-                       }
-                       if (config->methods & AV_CODEC_HW_CONFIG_METHOD_HW_DEVICE_CTX &&
-                           config->device_type == AV_HWDEVICE_TYPE_VAAPI &&
-                           config->pix_fmt == *fmt_ptr) {
+       bool found_config_of_right_type = false;
+       for (int i = 0;; ++i) {  // Termination condition inside loop.
+               const AVCodecHWConfig *config = avcodec_get_hw_config(ctx->codec, i);
+               if (config == nullptr) {  // End of list.
+                       break;
+               }
+               if (!(config->methods & AV_CODEC_HW_CONFIG_METHOD_HW_DEVICE_CTX) ||
+                   config->device_type != type) {
+                       // Not interesting for us.
+                       continue;
+               }
+
+               // We have a config of the right type, but does it actually support
+               // the pixel format we want? (Seemingly, FFmpeg's way of signaling errors
+               // is to just replace the pixel format with a software-decoded one,
+               // such as yuv420p.)
+               found_config_of_right_type = true;
+               for (const AVPixelFormat *fmt_ptr = fmt; *fmt_ptr != -1; ++fmt_ptr) {
+                       if (config->pix_fmt == *fmt_ptr) {
+                               fprintf(stderr, "Initialized '%s' hardware decoding for codec '%s'.\n",
+                                       av_hwdevice_get_type_name(type), ctx->codec->name);
+                               if (ctx->profile == FF_PROFILE_H264_BASELINE) {
+                                       fprintf(stderr, "WARNING: Stream claims to be H.264 Baseline, which is generally poorly supported in hardware decoders.\n");
+                                       fprintf(stderr, "         Consider encoding it as Constrained Baseline, Main or High instead.\n");
+                                       fprintf(stderr, "         Decoding might fail and fall back to software.\n");
+                               }
                                return config->pix_fmt;
                        }
                }
+               fprintf(stderr, "Decoder '%s' supports only these pixel formats:", ctx->codec->name);
+               unordered_set<AVPixelFormat> seen;
+               for (const AVPixelFormat *fmt_ptr = fmt; *fmt_ptr != -1; ++fmt_ptr) {
+                       if (!seen.count(*fmt_ptr)) {
+                               fprintf(stderr, " %s", av_get_pix_fmt_name(*fmt_ptr));
+                               seen.insert(*fmt_ptr);
+                       }
+               }
+               fprintf(stderr, " (wanted %s for hardware acceleration)\n", av_get_pix_fmt_name(config->pix_fmt));
+
+       }
+
+       if (!found_config_of_right_type) {
+               fprintf(stderr, "Decoder '%s' does not support device type '%s'.\n", ctx->codec->name, av_hwdevice_get_type_name(type));
        }
 
-       // We found no VA-API formats, so take the best software format.
+       // We found no VA-API formats, so take the first software format.
+       for (const AVPixelFormat *fmt_ptr = fmt; *fmt_ptr != -1; ++fmt_ptr) {
+               if ((av_pix_fmt_desc_get(*fmt_ptr)->flags & AV_PIX_FMT_FLAG_HWACCEL) == 0) {
+                       fprintf(stderr, "Falling back to software format %s.\n", av_get_pix_fmt_name(*fmt_ptr));
+                       return *fmt_ptr;
+               }
+       }
+
+       // Fallback: Just return anything. (Should never really happen.)
        return fmt[0];
 }
 
@@ -544,14 +582,24 @@
        }
 
        // Seemingly, it's not too easy to make something that just initializes
-       // “whatever goes”, so we don't get VDPAU or CUDA here without enumerating
-       // through several different types. VA-API will do for now.
+       // “whatever goes”, so we don't get CUDA or VULKAN or whatever here
+       // without enumerating through several different types.
+       // VA-API and VDPAU will do for now. We prioritize VDPAU for the
+       // simple reason that there's a VA-API-via-VDPAU emulation for NVidia
+       // cards that seems to work, but just hangs when trying to transfer the frame.
+       //
+       // Note that we don't actually check codec support beforehand,
+       // so if you have a low-end VDPAU device but a high-end VA-API device,
+       // you lose out on the extra codec support from the latter.
        AVBufferRef *hw_device_ctx = nullptr;
-       if (av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_VAAPI, nullptr, nullptr, 0) < 0) {
-               fprintf(stderr, "Failed to initialize VA-API for FFmpeg acceleration. Decoding video in software.\n");
-       } else {
+       if (av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_VDPAU, nullptr, nullptr, 0) >= 0) {
                video_codec_ctx->hw_device_ctx = av_buffer_ref(hw_device_ctx);
-               video_codec_ctx->get_format = get_vaapi_hw_format;
+               video_codec_ctx->get_format = get_hw_format<AV_HWDEVICE_TYPE_VDPAU>;
+       } else if (av_hwdevice_ctx_create(&hw_device_ctx, AV_HWDEVICE_TYPE_VAAPI, nullptr, nullptr, 0) >= 0) {
+               video_codec_ctx->hw_device_ctx = av_buffer_ref(hw_device_ctx);
+               video_codec_ctx->get_format = get_hw_format<AV_HWDEVICE_TYPE_VAAPI>;
+       } else {
+               fprintf(stderr, "Failed to initialize VA-API or VDPAU for FFmpeg acceleration. Decoding video in software.\n");
        }
 
        if (avcodec_open2(video_codec_ctx.get(), video_codec, nullptr) < 0) {
@@ -591,6 +639,7 @@
 
        // Main loop.
        bool first_frame = true;
+       int consecutive_errors = 0;
        while (!producer_thread_should_quit.should_quit()) {
                if (process_queued_commands(format_ctx.get(), pathname, last_modified, /*rewound=*/nullptr)) {
                        return true;
@@ -607,7 +656,14 @@
                AVFrameWithDeleter frame = decode_frame(format_ctx.get(), video_codec_ctx.get(), audio_codec_ctx.get(),
                        pathname, video_stream_index, audio_stream_index, subtitle_stream_index, audio_frame.get(), &audio_format, &audio_pts, &error);
                if (error) {
-                       return false;
+                       if (++consecutive_errors >= 100) {
+                               fprintf(stderr, "More than 100 consecutive video frames, aborting playback.\n");
+                               return false;
+                       } else {
+                               continue;
+                       }
+               } else {
+                       consecutive_errors = 0;
                }
                if (frame == nullptr) {
                        // EOF. Loop back to the start if we can.
@@ -715,7 +771,7 @@
                                                1e3 * duration<double>(now - next_frame_start).count());
                                        pts_origin = frame->pts;
                                        start = next_frame_start = now;
-                                       timecode += MAX_FPS * 2 + 1;
+                                       timecode += TYPICAL_FPS * 2 + 1;
                                }
                        }
                        bool finished_wakeup;
@@ -733,7 +789,7 @@
                                        // Make sure to get the audio resampler reset. (This is a hack;
                                        // ideally, the frame callback should just accept a way to signal
                                        // audio discontinuity.)
-                                       timecode += MAX_FPS * 2 + 1;
+                                       timecode += TYPICAL_FPS * 2 + 1;
                                }
                                last_neutral_color = get_neutral_color(frame->metadata);
                                if (frame_callback != nullptr) {
@@ -891,7 +947,8 @@
                // Decode video, if we have a frame.
                int err = avcodec_receive_frame(video_codec_ctx, video_avframe.get());
                if (err == 0) {
-                       if (video_avframe->format == AV_PIX_FMT_VAAPI) {
+                       if (video_avframe->format == AV_PIX_FMT_VAAPI ||
+                           video_avframe->format == AV_PIX_FMT_VDPAU) {
                                // Get the frame down to the CPU. (TODO: See if we can keep it
                                // on the GPU all the way, since it will be going up again later.
                                // However, this only works if the OpenGL GPU is the same one.)
@@ -1087,6 +1144,16 @@
 
                current_frame_ycbcr_format = decode_ycbcr_format(desc, frame, is_mjpeg, &last_colorspace, &last_chroma_location);
        }
+
+       // FIXME: Currently, if the video is too high-res for one of the allocated
+       // frames, we simply refuse to scale it here to avoid crashes. It would be better
+       // if we could somehow signal getting larger frames, especially as 4K is a thing now.
+       if (video_frame->len > FRAME_SIZE) {
+               fprintf(stderr, "%s: Decoded frame would be larger than supported FRAME_SIZE (%zu > %u), not decoding.\n", pathname.c_str(), video_frame->len, FRAME_SIZE);
+               *error = true;
+               return video_frame;
+       }
+
        sws_scale(sws_ctx.get(), frame->data, frame->linesize, 0, frame->height, pic_data, linesizes);
 
        return video_frame;
diff -Nru nageru-2.2.0/nageru/main.cpp nageru-2.2.1/nageru/main.cpp
--- nageru-2.2.0/nageru/main.cpp        2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/nageru/main.cpp        2023-04-17 18:35:47.000000000 +0200
@@ -67,6 +67,9 @@
                global_flags.va_display = QuickSyncEncoder::get_usable_va_display();
        }
 
+       // The OpenGL widgets do not work well with the native Wayland integration.
+       setenv("QT_QPA_PLATFORM", "xcb", 0);
+
        if ((global_flags.va_display.empty() ||
             global_flags.va_display[0] != '/') && !global_flags.x264_video_to_disk) {
                // We normally use EGL for zerocopy, but if we use VA against DRM
diff -Nru nageru-2.2.0/nageru/mixer.cpp nageru-2.2.1/nageru/mixer.cpp
--- nageru-2.2.0/nageru/mixer.cpp       2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/nageru/mixer.cpp       2023-04-17 18:35:47.000000000 +0200
@@ -618,7 +618,7 @@
        if (is_active) {
                card->capture->set_frame_callback(bind(&Mixer::bm_frame, this, card_index, _1, _2, _3, _4, _5, _6, _7));
                if (card->frame_allocator == nullptr) {
-                       card->frame_allocator.reset(new PBOFrameAllocator(pixel_format, 8 << 20, global_flags.width, global_flags.height, card_index, mjpeg_encoder.get()));  // 8 MB.
+                       card->frame_allocator.reset(new PBOFrameAllocator(pixel_format, FRAME_SIZE, global_flags.width, global_flags.height, card_index, mjpeg_encoder.get()));
                } else {
                        // The format could have changed, but we cannot reset the allocator
                        // and create a new one from scratch, since there may be allocated
@@ -627,7 +627,7 @@
                        // any old ones as they come back. This takes the mutex while
                        // allocating, but nothing should really be sending frames in there
                        // right now anyway (start_bm_capture() has not been called yet).
-                       card->frame_allocator->reconfigure(pixel_format, 8 << 20, global_flags.width, global_flags.height, card_index, mjpeg_encoder.get());
+                       card->frame_allocator->reconfigure(pixel_format, FRAME_SIZE, global_flags.width, global_flags.height, card_index, mjpeg_encoder.get());
                }
                card->capture->set_video_frame_allocator(card->frame_allocator.get());
                if (card->surface == nullptr) {
@@ -981,7 +981,7 @@
        // (Could be nonintegral, but resampling will save us then.)
        const int silence_samples = OUTPUT_FREQUENCY * video_format.frame_rate_den / video_format.frame_rate_nom;
 
-       if (dropped_frames > MAX_FPS * 2) {
+       if (dropped_frames > TYPICAL_FPS * 2) {
                fprintf(stderr, "%s lost more than two seconds (or time code jumping around; from 0x%04x to 0x%04x), resetting resampler\n",
                        description_for_card(card_index).c_str(), card->last_timecode, timecode);
                audio_mixer->reset_resampler(device);
diff -Nru nageru-2.2.0/nageru/quicksync_encoder.cpp nageru-2.2.1/nageru/quicksync_encoder.cpp
--- nageru-2.2.0/nageru/quicksync_encoder.cpp   2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/nageru/quicksync_encoder.cpp   2023-04-17 18:35:47.000000000 +0200
@@ -592,51 +592,6 @@
                                            {IDR(PBB)(PBB)}.
 */
 
-// General pts/dts strategy:
-//
-// Getting pts and dts right with variable frame rate (VFR) and B-frames can be a
-// bit tricky. We assume first of all that the frame rate never goes _above_
-// MAX_FPS, which gives us a frame period N. The decoder can always decode
-// in at least this speed, as long at dts <= pts (the frame is not attempted
-// presented before it is decoded). Furthermore, we never have longer chains of
-// B-frames than a fixed constant C. (In a B-frame chain, we say that the base
-// I/P-frame has order O=0, the B-frame depending on it directly has order O=1,
-// etc. The last frame in the chain, which no B-frames depend on, is the “tip”
-// frame, with an order O <= C.)
-//
-// Many strategies are possible, but we establish these rules:
-//
-//  - Tip frames have dts = pts - (C-O)*N.
-//  - Non-tip frames have dts = dts_last + N.
-//
-// An example, with C=2 and N=10 and the data flow showed with arrows:
-//
-//        I  B  P  B  B  P
-//   pts: 30 40 50 60 70 80
-//        ↓  ↓     ↓
-//   dts: 10 30 20 60 50←40
-//         |  |  ↑        ↑
-//         `--|--'        |
-//             `----------'
-//
-// To show that this works fine also with irregular spacings, let's say that
-// the third frame is delayed a bit (something earlier was dropped). Now the
-// situation looks like this:
-//
-//        I  B  P  B  B   P
-//   pts: 30 40 80 90 100 110
-//        ↓  ↓     ↓
-//   dts: 10 30 20 90 50←40
-//         |  |  ↑        ↑
-//         `--|--'        |
-//             `----------'
-//
-// The resetting on every tip frame makes sure dts never ends up lagging a lot
-// behind pts, and the subtraction of (C-O)*N makes sure pts <= dts.
-//
-// In the output of this function, if <dts_lag> is >= 0, it means to reset the
-// dts from the current pts minus <dts_lag>, while if it's -1, the frame is not
-// a tip frame and should be given a dts based on the previous one.
 #define FRAME_P 0
 #define FRAME_B 1
 #define FRAME_I 2
@@ -645,12 +600,10 @@
     int encoding_order, int intra_period,
     int intra_idr_period, int ip_period,
     int *displaying_order,
-    int *frame_type, int *pts_lag)
+    int *frame_type)
 {
     int encoding_order_gop = 0;
 
-    *pts_lag = 0;
-
     if (intra_period == 1) { /* all are I/IDR frames */
         *displaying_order = encoding_order;
         if (intra_idr_period == 0)
@@ -682,20 +635,13 @@
 
     // We have B-frames. Sequence is like IDR (PBB)(PBB)(IBB)(PBB).
     encoding_order_gop = (intra_idr_period == 0) ? encoding_order : (encoding_order % (intra_idr_period + 1));
-    *pts_lag = -1;  // Most frames are not tip frames.
          
     if (encoding_order_gop == 0) { /* the first frame */
         *frame_type = FRAME_IDR;
         *displaying_order = encoding_order;
-        // IDR frames are a special case; I honestly can't find the logic behind
-        // why this is the right thing, but it seems to line up nicely in practice :-)
-        *pts_lag = TIMEBASE / MAX_FPS;
     } else if (((encoding_order_gop - 1) % ip_period) != 0) { /* B frames */
         *frame_type = FRAME_B;
         *displaying_order = encoding_order - 1;
-        if ((encoding_order_gop % ip_period) == 0) {
-            *pts_lag = 0;  // Last B-frame.
-        }
     } else if (intra_period != 0 && /* have I frames */
                encoding_order_gop >= 2 &&
                ((encoding_order_gop - 1) / ip_period % (intra_period / ip_period)) == 0) {
@@ -707,6 +653,72 @@
     }
 }
 
+// General pts/dts strategy:
+//
+// Getting pts and dts right with variable frame rate (VFR) and B-frames can be a
+// bit tricky. This strategy roughly matches what x264 seems to do: We take in
+// the pts as the frames are encoded, and reuse that as dts in the same order,
+// slightly offset.
+//
+// If we don't have B-frames (only I and P), this means pts == dts always.
+// This is the simple case. Now consider the case with a single B-frame:
+//
+//        I  B  P  B  P
+//   pts: 30 40 50 60 70
+//
+// Since we always inherently encode P-frames before B-frames, this means that
+// we see them in this order, which we can _almost_ use for dts:
+//
+//   dts: 30 50 40 70 60
+//
+// the only problem here is that for the B-frames, pts < dts. We solve this by
+// priming the queue at the very start with some made-up dts:
+//
+//        I  B  P  B  P
+//   pts: 30 40 50 60 70
+//   dts: xx 30 50 40 70 60
+//
+// Now we have all the desirable properties: pts >= dts, successive dts delta
+// is never larger than the decoder can figure out (assuming, of course,
+// the pts has that property), and there's minimal lag between pts and dts.
+// For the made-up dts, we assume 1/60 sec per frame, which should generally
+// be reasonable. dts can go negative, but this is corrected using global_delay()
+// by delaying both pts and dts (although we probably don't need to).
+//
+// If there's more than one B-frame possible, we simply insert more of them
+// (here shown with some irregular spacing, assuming B-frames don't depend
+// on each other and simply go back-to-front):
+//
+//        I  B  B  B  P  B  B  B  P
+//   pts: 30 40 55 60 65 66 67 68 80
+//   dts: xx yy zz 30 65 60 55 40 80 68 67 66
+class DTSReorderer {
+public:
+       DTSReorderer(int num_b_frames) : num_b_frames(num_b_frames) {}
+
+       void push_pts(int64_t pts)
+       {
+               if (buf.empty() && num_b_frames > 0) {  // First frame.
+                       int64_t base_dts = pts - num_b_frames * (TIMEBASE / TYPICAL_FPS);
+                       for (int i = 0; i < num_b_frames; ++i) {
+                               buf.push(base_dts + i * (TIMEBASE / TYPICAL_FPS));
+                       }
+               }
+               buf.push(pts);
+       }
+
+       int64_t pop_dts()
+       {
+               assert(!buf.empty());
+               int64_t dts = buf.front();
+               buf.pop();
+               return dts;
+       }
+
+private:
+       const int num_b_frames;
+       queue<int64_t> buf;
+};
 
 void QuickSyncEncoderImpl::enable_zerocopy_if_possible()
 {
@@ -1349,6 +1361,7 @@
 // this is weird. but it seems to put a new frame onto the queue
 void QuickSyncEncoderImpl::storage_task_enqueue(storage_task task)
 {
+       assert(task.pts >= task.dts);
        lock_guard<mutex> lock(storage_task_queue_mutex);
        storage_task_queue.push(move(task));
        storage_task_queue_changed.notify_all();
@@ -1756,6 +1769,8 @@
 {
        pthread_setname_np(pthread_self(), "QS_Encode");
 
+       DTSReorderer dts_reorder_buf(ip_period - 1);
+
        int64_t last_dts = -1;
        int gop_start_display_frame_num = 0;
        for (int display_frame_num = 0; ; ++display_frame_num) {
@@ -1782,6 +1797,8 @@
                        }
                }
 
+               dts_reorder_buf.push_pts(frame.pts);
+
                // Pass the frame on to x264 (or uncompressed to HTTP) as needed.
                // Note that this implicitly waits for the frame to be done rendering.
                pass_frame(frame, display_frame_num, frame.pts, frame.duration);
@@ -1797,10 +1814,9 @@
                // Now encode as many QuickSync frames as we can using the frames we have available.
                // (It could be zero, or it could be multiple.) FIXME: make a function.
                for ( ;; ) {
-                       int pts_lag;
                        int frame_type, quicksync_display_frame_num;
                        encoding2display_order(quicksync_encoding_frame_num, intra_period, intra_idr_period, ip_period,
-                                              &quicksync_display_frame_num, &frame_type, &pts_lag);
+                                              &quicksync_display_frame_num, &frame_type);
                        if (!reorder_buffer.count(quicksync_display_frame_num)) {
                                break;
                        }
@@ -1820,14 +1836,7 @@
                                gop_start_display_frame_num = quicksync_display_frame_num;
                        }
 
-                       // Determine the dts of this frame.
-                       int64_t dts;
-                       if (pts_lag == -1) {
-                               assert(last_dts != -1);
-                               dts = last_dts + (TIMEBASE / MAX_FPS);
-                       } else {
-                               dts = frame.pts - pts_lag;
-                       }
+                       const int64_t dts = dts_reorder_buf.pop_dts();
                        last_dts = dts;
 
                        encode_frame(frame, quicksync_encoding_frame_num, quicksync_display_frame_num, gop_start_display_frame_num, frame_type, frame.pts, dts, frame.duration, frame.ycbcr_coefficients);
@@ -1846,7 +1855,7 @@
                int display_frame_num = pending_frame.first;
                assert(display_frame_num > 0);
                PendingFrame frame = move(pending_frame.second);
-               int64_t dts = last_dts + (TIMEBASE / MAX_FPS);
+               int64_t dts = last_dts + (TIMEBASE / TYPICAL_FPS);
                printf("Finalizing encode: Encoding leftover frame %d as P-frame instead of B-frame.\n", display_frame_num);
                encode_frame(frame, encoding_frame_num++, display_frame_num, gop_start_display_frame_num, FRAME_P, frame.pts, dts, frame.duration, frame.ycbcr_coefficients);
                last_dts = dts;
diff -Nru nageru-2.2.0/nageru/quicksync_encoder_impl.h nageru-2.2.1/nageru/quicksync_encoder_impl.h
--- nageru-2.2.0/nageru/quicksync_encoder_impl.h        2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/nageru/quicksync_encoder_impl.h        2023-04-17 18:35:47.000000000 +0200
@@ -59,7 +59,7 @@
 
        // So we never get negative dts.
        int64_t global_delay() const {
-               return int64_t(ip_period - 1) * (TIMEBASE / MAX_FPS);
+               return int64_t(ip_period - 1) * (TIMEBASE / TYPICAL_FPS);
        }
 
 private:
@@ -209,7 +209,7 @@
        static constexpr int initial_qp = 15;
        static constexpr int minimal_qp = 0;
        static constexpr int intra_period = 30;
-       static constexpr int intra_idr_period = MAX_FPS;  // About a second; more at lower frame rates. Not ideal.
+       static constexpr int intra_idr_period = TYPICAL_FPS;  // About a second; more at lower frame rates. Not ideal.
 
        // Quality settings that are meant to be static, but might be overridden
        // by the profile.
diff -Nru nageru-2.2.0/NEWS nageru-2.2.1/NEWS
--- nageru-2.2.0/NEWS   2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/NEWS   2023-04-17 18:35:47.000000000 +0200
@@ -1,3 +1,20 @@
+Nageru and Futatabi 2.2.1, April 17th, 2023
+
+  - Work around an issue with OpenGL on Wayland, causing all
+    displays to be blank.
+
+  - Several fixes related to video inputs; in particular:
+    - Fix crashes when the master clock goes faster than 60 Hz
+      (which could happen primarily if an SRT input is the master).
+    - Be more resilient to errors in hardware video decoding
+      when the stream starts out broken (e.g., not on a key frame)
+      but recovers.
+    - Multiple fixes related to hardware acceleration on nVidia.
+    - Incoming frames of too high resolution (larger than 8 MB)
+      will be refused instead of crashing. Such videos may be
+      supported better in the future.
+
+
 Nageru and Futatabi 2.2.0, November 15th, 2022
 
   - Support AV1 output, via SVT-AV1. Note that this is still somewhat
diff -Nru nageru-2.2.0/README nageru-2.2.1/README
--- nageru-2.2.0/README 2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/README 2023-04-17 18:35:47.000000000 +0200
@@ -73,8 +73,6 @@
  - libjpeg, for encoding MJPEG streams when VA-API JPEG support is not
    available.
 
- - Zita-resampler, for adjusting audio to be in sync with video.
-
  - Protocol Buffers (protobuf), for storing various forms of settings and
    state.
 
@@ -109,7 +107,7 @@
  - SQLite, for storing state.
 
 
-If on Debian bullsey or something similar, you can install everything you need
+If on Debian bullseye or something similar, you can install everything you need
 with:
 
   apt install qtbase5-dev libqt5opengl5-dev qt5-default \
diff -Nru nageru-2.2.0/shared/mux.cpp nageru-2.2.1/shared/mux.cpp
--- nageru-2.2.0/shared/mux.cpp 2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/shared/mux.cpp 2023-04-17 18:35:47.000000000 +0200
@@ -164,6 +164,8 @@
 
 void Mux::add_packet(const AVPacket &pkt, int64_t pts, int64_t dts, AVRational timebase, int stream_index_override)
 {
+       assert(pts >= dts);
+
        AVPacket pkt_copy;
        av_init_packet(&pkt_copy);
        if (av_packet_ref(&pkt_copy, &pkt) < 0) {
diff -Nru nageru-2.2.0/shared/va_display.cpp nageru-2.2.1/shared/va_display.cpp
--- nageru-2.2.0/shared/va_display.cpp  2022-11-15 00:25:44.000000000 +0100
+++ nageru-2.2.1/shared/va_display.cpp  2023-04-17 18:35:47.000000000 +0200
@@ -104,7 +104,7 @@
                        break;
                }
        }
-       if (!found_profile) {
+       if (found_profile == VAProfileNone) {
                if (error != nullptr) *error = "Can't find entry points for suitable codec profile";
                return nullptr;
        }