This filter accepts 8-bit frames (RGB24) and outputs 10-bit or float frames, and there is no reference image, so it is not feasible to use criteria such as PSNR or SSIM.
I use the same method as the paper to demonstrate the filter's effect: the frames before and after the filter are both reduced by 3 stops. The native video (test.native.mp4) is created from the 7 PNG files at https://github.com/gabrieleilertsen/hdrcnn/tree/master/data (each image is enlarged to 1920*1080, with the extra area filled with white) using the command line:

ffmpeg -f image2 -i ./img_%03d.png -c:v libx264 -preset veryslow -crf 1 test.native.mp4

Two rgb24 videos, before and after the filter, are then generated at -3 stops by modifying the code a little; see the video folder at https://drive.google.com/drive/folders/1URsRY5g-VdE-kHlP5vQoLoimMIZ-SX00?usp=sharing

For your convenience, I also dumped PNG files from the generated videos and combined each before/after pair into one file; see the png folder on the same Google Drive.

> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of Liu Steven
> Sent: Monday, November 05, 2018 3:57 PM
> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> Cc: Liu Steven <l...@chinaffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V4] Add a filter implementing HDR image generation from a single exposure using deep CNNs
>
> On Nov 5, 2018, at 3:42 PM, Guo, Yejun <yejun....@intel.com> wrote:
> >
> > ask for comment or merge, thanks.
> Will push after 24 hours if there are no objections.
> >
> >> -----Original Message-----
> >> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of Guo, Yejun
> >> Sent: Monday, October 29, 2018 11:19 AM
> >> To: ffmpeg-devel@ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH V4] Add a filter implementing HDR image generation from a single exposure using deep CNNs
> >>
> >> any more comments? thanks.
> >>
> >>> -----Original Message-----
> >>> From: Guo, Yejun
> >>> Sent: Tuesday, October 23, 2018 6:46 AM
> >>> To: ffmpeg-devel@ffmpeg.org
> >>> Cc: Guo, Yejun <yejun....@intel.com>; Guo
> >>> Subject: [PATCH V4] Add a filter implementing HDR image generation from a single exposure using deep CNNs
> >>>
> >>> see the algorithm's paper and code below.
> >>>
> >>> the filter's parameter looks like:
> >>> sdr2hdr=model_filename=/path_to_tensorflow_graph.pb:out_fmt=gbrp10le
> >>>
> >>> The input of the deep CNN model is RGB24 while the output is float
> >>> for each color channel. By default the filter outputs gbrpf32le.
> >>> gbrp10le is also supported as the output, so we can see the
> >>> rendering result in a player, as a reference.
> >>>
> >>> To generate the model file, we need to modify the original script a little.
> >>> - set name='y' for y_final within the script at
> >>>   https://github.com/gabrieleilertsen/hdrcnn/blob/master/network.py
> >>> - add the following code to the script at
> >>>   https://github.com/gabrieleilertsen/hdrcnn/blob/master/hdrcnn_predict.py
> >>>
> >>> graph = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ["y"])
> >>> tf.train.write_graph(graph, '.', 'graph.pb', as_text=False)
> >>>
> >>> The filter only works when the tensorflow C api is supported in the
> >>> system. The native backend is not supported since there are some
> >>> different types of layers in the deep CNN model, besides CONV and
> >>> DEPTH_TO_SPACE.
> >>>
> >>> https://arxiv.org/pdf/1710.07480.pdf:
> >>> author = "Eilertsen, Gabriel and Kronander, Joel, and Denes, Gyorgy and Mantiuk, Rafał and Unger, Jonas",
> >>> title = "HDR image reconstruction from a single exposure using deep CNNs",
> >>> journal = "ACM Transactions on Graphics (TOG)",
> >>> number = "6",
> >>> volume = "36",
> >>> articleno = "178",
> >>> year = "2017"
> >>>
> >>> https://github.com/gabrieleilertsen/hdrcnn
> >>>
> >>> btw, as a whole solution, metadata should also be generated from the
> >>> sdr video, so that it can be encoded as an HDR video. Not supported yet.
> >>> This patch just focuses on this paper.
> >>>
> >>> Signed-off-by: Guo, Yejun <yejun....@intel.com>
> >>> ---
> >>>  configure                |   1 +
> >>>  doc/filters.texi         |  35 +++++++
> >>>  libavfilter/Makefile     |   1 +
> >>>  libavfilter/allfilters.c |   1 +
> >>>  libavfilter/vf_sdr2hdr.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++
> >>>  5 files changed, 306 insertions(+)
> >>>  create mode 100644 libavfilter/vf_sdr2hdr.c
> >>>
> >>> diff --git a/configure b/configure
> >>> index 85d5dd5..5e2efba 100755
> >>> --- a/configure
> >>> +++ b/configure
> >>> @@ -3438,6 +3438,7 @@ scale2ref_filter_deps="swscale"
> >>>  scale_filter_deps="swscale"
> >>>  scale_qsv_filter_deps="libmfx"
> >>>  select_filter_select="pixelutils"
> >>> +sdr2hdr_filter_deps="libtensorflow"
> >>>  sharpness_vaapi_filter_deps="vaapi"
> >>>  showcqt_filter_deps="avcodec avformat swscale"
> >>>  showcqt_filter_suggest="libfontconfig libfreetype"
> >>> diff --git a/doc/filters.texi b/doc/filters.texi
> >>> index 17e2549..bba9f87 100644
> >>> --- a/doc/filters.texi
> >>> +++ b/doc/filters.texi
> >>> @@ -14672,6 +14672,41 @@ Scale a subtitle stream (b) to match the main video (a) in size before overlayin
> >>>  @end example
> >>>  @end itemize
> >>>
> >>> +@section sdr2hdr
> >>> +
> >>> +HDR image generation from a single exposure using deep CNNs with TensorFlow C library.
> >>> +
> >>> +@itemize
> >>> +@item
> >>> +paper: see @url{https://arxiv.org/pdf/1710.07480.pdf}
> >>> +
> >>> +@item
> >>> +code with model and trained parameters: see @url{https://github.com/gabrieleilertsen/hdrcnn}
> >>> +@end itemize
> >>> +
> >>> +The filter accepts the following options:
> >>> +
> >>> +@table @option
> >>> +
> >>> +@item model_filename
> >>> +Set path to model file specifying network architecture and its parameters.
> >>> +
> >>> +@item out_fmt
> >>> +the data format of the filter's output.
> >>> +
> >>> +It accepts the following values:
> >>> +@table @samp
> >>> +@item gbrpf32le
> >>> +force gbrpf32le output
> >>> +
> >>> +@item gbrp10le
> >>> +force gbrp10le output
> >>> +@end table
> >>> +
> >>> +Default value is @samp{gbrpf32le}.
> >>> +
> >>> +@end table
> >>> +
> >>>  @anchor{selectivecolor}
> >>>  @section selectivecolor
> >>>
> >>> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> >>> index 62cc2f5..88e7da6 100644
> >>> --- a/libavfilter/Makefile
> >>> +++ b/libavfilter/Makefile
> >>> @@ -360,6 +360,7 @@ OBJS-$(CONFIG_SOBEL_OPENCL_FILTER) += vf_convolution_opencl.o opencl.o
> >>>  OBJS-$(CONFIG_SPLIT_FILTER) += split.o
> >>>  OBJS-$(CONFIG_SPP_FILTER) += vf_spp.o
> >>>  OBJS-$(CONFIG_SR_FILTER) += vf_sr.o
> >>> +OBJS-$(CONFIG_SDR2HDR_FILTER) += vf_sdr2hdr.o
> >>>  OBJS-$(CONFIG_SSIM_FILTER) += vf_ssim.o framesync.o
> >>>  OBJS-$(CONFIG_STEREO3D_FILTER) += vf_stereo3d.o
> >>>  OBJS-$(CONFIG_STREAMSELECT_FILTER) += f_streamselect.o framesync.o
> >>> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> >>> index 5e72803..1645c0f 100644
> >>> --- a/libavfilter/allfilters.c
> >>> +++ b/libavfilter/allfilters.c
> >>> @@ -319,6 +319,7 @@ extern AVFilter ff_vf_scale_npp;
> >>>  extern AVFilter ff_vf_scale_qsv;
> >>>  extern AVFilter ff_vf_scale_vaapi;
> >>>  extern AVFilter ff_vf_scale2ref;
> >>> +extern AVFilter ff_vf_sdr2hdr;
> >>>  extern AVFilter ff_vf_select;
> >>>  extern AVFilter ff_vf_selectivecolor;
> >>>  extern AVFilter ff_vf_sendcmd;
> >>> diff --git a/libavfilter/vf_sdr2hdr.c b/libavfilter/vf_sdr2hdr.c
> >>> new file mode 100644
> >>> index 0000000..109b907
> >>> --- /dev/null
> >>> +++ b/libavfilter/vf_sdr2hdr.c
> >>> @@ -0,0 +1,268 @@
> >>> +/*
> >>> + * Copyright (c) 2018 Guo Yejun
> >>> + *
> >>> + * This file is part of FFmpeg.
> >>> + *
> >>> + * FFmpeg is free software; you can redistribute it and/or
> >>> + * modify it under the terms of the GNU Lesser General Public
> >>> + * License as published by the Free Software Foundation; either
> >>> + * version 2.1 of the License, or (at your option) any later version.
> >>> + *
> >>> + * FFmpeg is distributed in the hope that it will be useful,
> >>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> >>> + * Lesser General Public License for more details.
> >>> + *
> >>> + * You should have received a copy of the GNU Lesser General Public
> >>> + * License along with FFmpeg; if not, write to the Free Software
> >>> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> >>> + */
> >>> +
> >>> +/**
> >>> + * @file
> >>> + * Filter implementing HDR image generation from a single exposure using deep CNNs.
> >>> + * https://arxiv.org/pdf/1710.07480.pdf
> >>> + */
> >>> +
> >>> +#include "avfilter.h"
> >>> +#include "formats.h"
> >>> +#include "internal.h"
> >>> +#include "libavutil/opt.h"
> >>> +#include "libavutil/qsort.h"
> >>> +#include "libavformat/avio.h"
> >>> +#include "libswscale/swscale.h"
> >>> +#include "dnn_interface.h"
> >>> +#include <math.h>
> >>> +
> >>> +typedef struct SDR2HDRContext {
> >>> +    const AVClass *class;
> >>> +
> >>> +    char* model_filename;
> >>> +    enum AVPixelFormat out_fmt;
> >>> +    DNNModule* dnn_module;
> >>> +    DNNModel* model;
> >>> +    DNNData input, output;
> >>> +} SDR2HDRContext;
> >>> +
> >>> +#define OFFSET(x) offsetof(SDR2HDRContext, x)
> >>> +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
> >>> +static const AVOption sdr2hdr_options[] = {
> >>> +    { "model_filename", "path to model file specifying network architecture and its parameters", OFFSET(model_filename), AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, FLAGS },
> >>> +    { "out_fmt", "the data format of the filter's output, it could be gbrpf32le [default] or gbrp10le", OFFSET(out_fmt), AV_OPT_TYPE_PIXEL_FMT, {.i64=AV_PIX_FMT_GBRPF32LE}, AV_PIX_FMT_NONE, AV_PIX_FMT_NB, FLAGS },
> >>> +    { NULL }
> >>> +};
> >>> +
> >>> +AVFILTER_DEFINE_CLASS(sdr2hdr);
> >>> +
> >>> +static av_cold int init(AVFilterContext* context)
> >>> +{
> >>> +    SDR2HDRContext* ctx = context->priv;
> >>> +
> >>> +    if (ctx->out_fmt != AV_PIX_FMT_GBRPF32LE && ctx->out_fmt != AV_PIX_FMT_GBRP10LE) {
> >>> +        av_log(context, AV_LOG_ERROR, "could not support the output format\n");
> >>> +        return AVERROR(ENOSYS);
> >>> +    }
> >>> +
> >>> +    ctx->dnn_module = ff_get_dnn_module(DNN_TF);
> >>> +    if (!ctx->dnn_module){
> >>> +        av_log(context, AV_LOG_ERROR, "could not create DNN module for tensorflow backend\n");
> >>> +        return AVERROR(ENOMEM);
> >>> +    }
> >>> +    if (!ctx->model_filename){
> >>> +        av_log(context, AV_LOG_ERROR, "model file for network was not specified\n");
> >>> +        return AVERROR(EIO);
> >>> +    }
> >>> +    if (!ctx->dnn_module->load_model) {
> >>> +        av_log(context, AV_LOG_ERROR, "load_model for network was not specified\n");
> >>> +        return AVERROR(EIO);
> >>> +    }
> >>> +    ctx->model = (ctx->dnn_module->load_model)(ctx->model_filename);
> >>> +    if (!ctx->model){
> >>> +        av_log(context, AV_LOG_ERROR, "could not load DNN model\n");
> >>> +        return AVERROR(EIO);
> >>> +    }
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +static int query_formats(AVFilterContext* context)
> >>> +{
> >>> +    const enum AVPixelFormat in_formats[] = {AV_PIX_FMT_RGB24, AV_PIX_FMT_NONE};
> >>> +    enum AVPixelFormat out_formats[2];
> >>> +    SDR2HDRContext* ctx = context->priv;
> >>> +    AVFilterFormats* formats_list;
> >>> +    int ret = 0;
> >>> +
> >>> +    formats_list = ff_make_format_list(in_formats);
> >>> +    if ((ret = ff_formats_ref(formats_list, &context->inputs[0]->out_formats)) < 0)
> >>> +        return ret;
> >>> +
> >>> +    out_formats[0] = ctx->out_fmt;
> >>> +    out_formats[1] = AV_PIX_FMT_NONE;
> >>> +    formats_list = ff_make_format_list(out_formats);
> >>> +    if ((ret = ff_formats_ref(formats_list, &context->outputs[0]->in_formats)) < 0)
> >>> +        return ret;
> >>> +
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +static int config_props(AVFilterLink* inlink)
> >>> +{
> >>> +    AVFilterContext* context = inlink->dst;
> >>> +    SDR2HDRContext* ctx = context->priv;
> >>> +    AVFilterLink* outlink = context->outputs[0];
> >>> +    DNNReturnType result;
> >>> +
> >>> +    // the dnn model is tied with resolution due to deconv layer of tensorflow
> >>> +    // now just support 1920*1080 and so the magic numbers within this file
> >>> +    if (inlink->w != 1920 || inlink->h != 1080) {
> >>> +        av_log(context, AV_LOG_ERROR, "only support frame size with 1920*1080\n");
> >>> +        return AVERROR(ENOSYS);
> >>> +    }
> >>> +
> >>> +    ctx->input.width = 1920;
> >>> +    ctx->input.height = 1088; //the model requires height is a multiple of 32,
> >>> +    ctx->input.channels = 3;
> >>> +
> >>> +    result = (ctx->model->set_input_output)(ctx->model->model, &ctx->input, &ctx->output);
> >>> +    if (result != DNN_SUCCESS){
> >>> +        av_log(context, AV_LOG_ERROR, "could not set input and output for the model\n");
> >>> +        return AVERROR(EIO);
> >>> +    }
> >>> +
> >>> +    memset(ctx->input.data, 0, ctx->input.channels * ctx->input.width * ctx->input.height * sizeof(float));
> >>> +    outlink->h = 1080;
> >>> +    outlink->w = 1920;
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +static float qsort_comparison_function_float(const void *a, const void *b)
> >>> +{
> >>> +    return *(const float *)a - *(const float *)b;
> >>> +}
> >>> +
> >>> +static int filter_frame(AVFilterLink* inlink, AVFrame* in)
> >>> +{
> >>> +    DNNReturnType dnn_result = DNN_SUCCESS;
> >>> +    AVFilterContext* context = inlink->dst;
> >>> +    SDR2HDRContext* ctx = context->priv;
> >>> +    AVFilterLink* outlink = context->outputs[0];
> >>> +    AVFrame* out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
> >>> +    int total_pixels = in->height * in->width;
> >>> +
> >>> +    if (!out){
> >>> +        av_log(context, AV_LOG_ERROR, "could not allocate memory for output frame\n");
> >>> +        av_frame_free(&in);
> >>> +        return AVERROR(ENOMEM);
> >>> +    }
> >>> +
> >>> +    av_frame_copy_props(out, in);
> >>> +
> >>> +    for (int i = 0; i < in->linesize[0] * in->height; ++i) {
> >>> +        ctx->input.data[i] = in->data[0][i] / 255.0f;
> >>> +    }
> >>> +
> >>> +    dnn_result = (ctx->dnn_module->execute_model)(ctx->model);
> >>> +    if (dnn_result != DNN_SUCCESS){
> >>> +        av_log(context, AV_LOG_ERROR, "failed to execute loaded model\n");
> >>> +        return AVERROR(EIO);
> >>> +    }
> >>> +
> >>> +    if (ctx->out_fmt == AV_PIX_FMT_GBRPF32LE) {
> >>> +        float* outg = (float*)out->data[0];
> >>> +        float* outb = (float*)out->data[1];
> >>> +        float* outr = (float*)out->data[2];
> >>> +        for (int i = 0; i < total_pixels; ++i) {
> >>> +            float r = ctx->output.data[i*3];
> >>> +            float g = ctx->output.data[i*3+1];
> >>> +            float b = ctx->output.data[i*3+2];
> >>> +            outr[i] = r;
> >>> +            outg[i] = g;
> >>> +            outb[i] = b;
> >>> +        }
> >>> +    } else {
> >>> +        // here, we just use a rough mapping to the 10bit contents
> >>> +        // meta data generation for HDR video encoding is not supported yet
> >>> +        float* converted_data = (float*)av_malloc(total_pixels * 3 * sizeof(float));
> >>> +        int16_t* outg = (int16_t*)out->data[0];
> >>> +        int16_t* outb = (int16_t*)out->data[1];
> >>> +        int16_t* outr = (int16_t*)out->data[2];
> >>> +
> >>> +        float max = 1.0f;
> >>> +        for (int i = 0; i < total_pixels * 3; ++i) {
> >>> +            float d = ctx->output.data[i];
> >>> +            d = sqrt(d);
> >>> +            converted_data[i] = d;
> >>> +            max = FFMAX(d, max);
> >>> +        }
> >>> +
> >>> +        if (max > 1.0f) {
> >>> +            AV_QSORT(converted_data, total_pixels * 3, float, qsort_comparison_function_float);
> >>> +            // 0.5% pixels are clipped
> >>> +            max = converted_data[(int)(total_pixels * 3 * 0.995)];
> >>> +            max = FFMAX(max, 1.0f);
> >>> +
> >>> +            for (int i = 0; i < total_pixels * 3; ++i) {
> >>> +                float d = ctx->output.data[i];
> >>> +                d = sqrt(d);
> >>> +                d = FFMIN(d, max);
> >>> +                converted_data[i] = d;
> >>> +            }
> >>> +        }
> >>> +
> >>> +        for (int i = 0; i < total_pixels; ++i) {
> >>> +            float r = converted_data[i*3];
> >>> +            float g = converted_data[i*3+1];
> >>> +            float b = converted_data[i*3+2];
> >>> +            outr[i] = r / max * 1023;
> >>> +            outg[i] = g / max * 1023;
> >>> +            outb[i] = b / max * 1023;
> >>> +        }
> >>> +
> >>> +        av_free(converted_data);
> >>> +    }
> >>> +
> >>> +    av_frame_free(&in);
> >>> +    return ff_filter_frame(outlink, out);
> >>> +}
> >>> +
> >>> +static av_cold void uninit(AVFilterContext* context)
> >>> +{
> >>> +    SDR2HDRContext* ctx = context->priv;
> >>> +
> >>> +    if (ctx->dnn_module){
> >>> +        (ctx->dnn_module->free_model)(&ctx->model);
> >>> +        av_freep(&ctx->dnn_module);
> >>> +    }
> >>> +}
> >>> +
> >>> +static const AVFilterPad sdr2hdr_inputs[] = {
> >>> +    {
> >>> +        .name = "default",
> >>> +        .type = AVMEDIA_TYPE_VIDEO,
> >>> +        .config_props = config_props,
> >>> +        .filter_frame = filter_frame,
> >>> +    },
> >>> +    { NULL }
> >>> +};
> >>> +
> >>> +static const AVFilterPad sdr2hdr_outputs[] = {
> >>> +    {
> >>> +        .name = "default",
> >>> +        .type = AVMEDIA_TYPE_VIDEO,
> >>> +    },
> >>> +    { NULL }
> >>> +};
> >>> +
> >>> +AVFilter ff_vf_sdr2hdr = {
> >>> +    .name = "sdr2hdr",
> >>> +    .description = NULL_IF_CONFIG_SMALL("HDR image generation from a single exposure using deep CNNs."),
> >>> +    .priv_size = sizeof(SDR2HDRContext),
> >>> +    .init = init,
> >>> +    .uninit = uninit,
> >>> +    .query_formats = query_formats,
> >>> +    .inputs = sdr2hdr_inputs,
> >>> +    .outputs = sdr2hdr_outputs,
> >>> +    .priv_class = &sdr2hdr_class,
> >>> +    .flags = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC,
> >>> +};
> >>> --
> >>> 2.7.4
> >>
> >> _______________________________________________
> >> ffmpeg-devel mailing list
> >> ffmpeg-devel@ffmpeg.org
> >> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel