Re: [FFmpeg-devel] [PATCH v2] avfilter/vf_bwdif: add x86 SIMD

2016-03-10 Thread Thomas Mundt
>>> Thomas Mundt wrote:
>> This new patch adds x86 SIMD support up to 12 bit. Please comment.
> 
> Not much use I guess, but on sse2 8 bit content it tests OK = faster +
> md5sum the same as without the patch.
> 
> Are you considering going further with this?
>
> Being sharper than yadif/preserving weave is nice, but for some SD
> (viewing scaled up) yadif wins on moving low angle diagonals, which end
> up stepped.
> 
> From memory when testing intel h/w deint it did a nice motion adaptive
> deint of the same scene without steps. The possible difference being
> that its "bob" also did some sort of edge detection/interpolation.
>
We tested deinterlacers at work last year: professional hardware and software,
and also ffmpeg.
All my colleagues preferred w3fdif over yadif because it looks more homogeneous
and is somewhat reminiscent of the sharpness and visual impression of CRT
monitors.
Yadif definitely reproduces moving diagonals and stills better, but it harms the
picture too much in detailed areas and slightly blurs it.

So I tried to combine the best of both. I was successful with stills, but
didn't find a useful edge detection that helps more than it harms.
There are some working edge detection models, but they are very complicated,
and speed was also very important.
The most promising fast method was comparing the vertical and diagonal
differences around the interpolated pixel and the horizontal differences near
it.
But I always found samples where it leads to heavy artefacts, which are
especially visible on our new OLED monitors.
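
The comparison described above could be sketched roughly as follows. This is
only an illustration, not the code that was actually tried: the name
edge_interp, the one-pixel diagonal reach and the omission of the horizontal
comparison are all assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the patch: interpolate the missing pixel
 * at column x from the line above and the line below, along the direction
 * (vertical or one of two diagonals) with the smallest absolute difference.
 * The caller must guarantee 1 <= x <= w - 2. */
static uint8_t edge_interp(const uint8_t *above, const uint8_t *below, int x)
{
    int d_vert  = abs(above[x]     - below[x]);
    int d_diag1 = abs(above[x - 1] - below[x + 1]); /* "\" direction */
    int d_diag2 = abs(above[x + 1] - below[x - 1]); /* "/" direction */

    int best = d_vert;
    int sum  = above[x] + below[x];

    if (d_diag1 < best) {
        best = d_diag1;
        sum  = above[x - 1] + below[x + 1];
    }
    if (d_diag2 < best)
        sum  = above[x + 1] + below[x - 1];

    /* Rounded average of the two pixels along the chosen direction. */
    return (uint8_t)((sum + 1) >> 1);
}
```

Thin lines and periodic patterns can make a diagonal match by accident, which
is one plausible way such a check ends up producing artefacts.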

The resulting bwdif was the best compromise between quality (subjective, of
course) and speed for general-purpose content.
With this SIMD patch it is faster than yadif.
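
For readers unfamiliar with how such a SIMD patch hooks in: the context in
bwdif.h carries function pointers, and an init function such as
ff_bwdif_init_x86() can replace the C routines with optimized ones. Below is a
minimal sketch of that pattern. SketchContext, sketch_init, have_simd and the
stub are made up for illustration, and the filter body is only an assumed form
built from the coef_sp values in vf_bwdif.c, not a copy of the real routine.

```c
#include <stdint.h>

/* Coefficients as they appear in vf_bwdif.c; 2*5077 - 2*981 == 1 << 13. */
static const uint16_t coef_sp[2] = { 5077, 981 };

/* Hypothetical, reduced context holding just one function pointer. */
typedef struct SketchContext {
    void (*filter_intra)(uint16_t *dst, const uint16_t *cur, int w,
                         int prefs, int mrefs, int prefs3, int mrefs3,
                         int clip_max);
} SketchContext;

/* Plain C fallback (assumed form): 4-tap vertical filter over the lines
 * 1 and 3 rows above (mrefs, mrefs3) and below (prefs, prefs3).
 * clip_max is (1 << bit_depth) - 1, e.g. 4095 for 12 bit. */
static void filter_intra_c(uint16_t *dst, const uint16_t *cur, int w,
                           int prefs, int mrefs, int prefs3, int mrefs3,
                           int clip_max)
{
    for (int x = 0; x < w; x++) {
        int acc = coef_sp[0] * (cur[x + mrefs]  + cur[x + prefs]) -
                  coef_sp[1] * (cur[x + mrefs3] + cur[x + prefs3]);
        if (acc < 0)
            acc = 0;                 /* clamp before shifting */
        acc >>= 13;
        dst[x] = (uint16_t)(acc > clip_max ? clip_max : acc);
    }
}

/* Stand-in for an optimized routine; here it just forwards to the C code. */
static void filter_intra_simd_stub(uint16_t *dst, const uint16_t *cur, int w,
                                   int prefs, int mrefs, int prefs3,
                                   int mrefs3, int clip_max)
{
    filter_intra_c(dst, cur, w, prefs, mrefs, prefs3, mrefs3, clip_max);
}

/* Hypothetical init: start with the C version, override if SIMD is usable. */
static void sketch_init(SketchContext *s, int have_simd)
{
    s->filter_intra = filter_intra_c;
    if (have_simd)
        s->filter_intra = filter_intra_simd_stub;
}
```

Because callers only go through the pointer, checking that the md5sum of the
output is unchanged with and without the optimized routine is a natural test.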

With the snooker sample yadif performs better, mostly because there is not much
detail.
But you can see yadif's artefacts in the player's face at the end. Also, bwdif
performs slightly better than w3fdif here.

Hardware scalers these days seem to have mature deinterlacers.
In our test AJA, Lynx and other professional scalers, and also Nvidia and Intel
GPUs, outperformed all software except AmberFin iCR.
Unfortunately I didn't find any further hints in the linked Intel
documentation.
Maybe direct access to the GPU scaler would be the best deinterlacing filter in
ffmpeg, but I have no idea how this could work.
So for now I'm finished with this patch. I hope it will be applied.
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH v2] avfilter/vf_bwdif: add x86 SIMD

2016-03-10 Thread Andy Furniss

Thomas Mundt wrote:
> This new patch adds x86 SIMD support up to 12 bit. Please comment.

Not much use I guess, but on sse2 8 bit content it tests OK = faster +
md5sum the same as without the patch.

Are you considering going further with this?

Being sharper than yadif/preserving weave is nice, but for some SD
(viewing scaled up) yadif wins on moving low angle diagonals, which end
up stepped.

From memory when testing intel h/w deint it did a nice motion adaptive
deint of the same scene without steps. The possible difference being
that its "bob" also did some sort of edge detection/interpolation.

https://drive.google.com/file/d/0BxP5-S1t9VEEUUR0QnVYRU8yczQ/view?usp=sharing

That is the scene in question: bwdif makes the cue jagged when it is at a
shallow angle.

The algorithm used by the h/w I tested is described on p. 259 of:

https://01.org/sites/default/files/documentation/intel_os_gfx_prm_vol7_-_3d-media-gpgpu_0.pdf


[FFmpeg-devel] [PATCH v2] avfilter/vf_bwdif: add x86 SIMD

2016-03-05 Thread Thomas Mundt
This new patch adds x86 SIMD support up to 12 bit.
Please comment.

Signed-off-by: Thomas Mundt 
---
 libavfilter/bwdif.h |  72 +++
 libavfilter/vf_bwdif.c  |  69 +++
 libavfilter/x86/Makefile|   2 +
 libavfilter/x86/vf_bwdif.asm| 266 
 libavfilter/x86/vf_bwdif_init.c |  78 
 5 files changed, 432 insertions(+), 55 deletions(-)
 create mode 100644 libavfilter/bwdif.h
 create mode 100644 libavfilter/x86/vf_bwdif.asm
 create mode 100644 libavfilter/x86/vf_bwdif_init.c

diff --git a/libavfilter/bwdif.h b/libavfilter/bwdif.h
new file mode 100644
index 000..8b42c76
--- /dev/null
+++ b/libavfilter/bwdif.h
@@ -0,0 +1,72 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVFILTER_BWDIF_H
+#define AVFILTER_BWDIF_H
+
+#include "libavutil/pixdesc.h"
+#include "avfilter.h"
+
+enum BWDIFMode {
+BWDIF_MODE_SEND_FRAME = 0, ///< send 1 frame for each frame
+BWDIF_MODE_SEND_FIELD = 1, ///< send 1 frame for each field
+};
+
+enum BWDIFParity {
+BWDIF_PARITY_TFF  =  0, ///< top field first
+BWDIF_PARITY_BFF  =  1, ///< bottom field first
+BWDIF_PARITY_AUTO = -1, ///< auto detection
+};
+
+enum BWDIFDeint {
+BWDIF_DEINT_ALL= 0, ///< deinterlace all frames
+BWDIF_DEINT_INTERLACED = 1, ///< only deinterlace frames marked as interlaced
+};
+
+typedef struct BWDIFContext {
+const AVClass *class;
+
+int mode;   ///< BWDIFMode
+int parity; ///< BWDIFParity
+int deint;  ///< BWDIFDeint
+
+int frame_pending;
+
+AVFrame *cur;
+AVFrame *next;
+AVFrame *prev;
+AVFrame *out;
+
+void (*filter_intra)(void *dst1, void *cur1, int w, int prefs, int mrefs,
+ int prefs3, int mrefs3, int parity, int clip_max);
+void (*filter_line)(void *dst, void *prev, void *cur, void *next,
+int w, int prefs, int mrefs, int prefs2, int mrefs2,
+int prefs3, int mrefs3, int prefs4, int mrefs4,
+int parity, int clip_max);
+void (*filter_edge)(void *dst, void *prev, void *cur, void *next,
+int w, int prefs, int mrefs, int prefs2, int mrefs2,
+int parity, int clip_max, int spat);
+
+const AVPixFmtDescriptor *csp;
+int inter_field;
+int eof;
+} BWDIFContext;
+
+void ff_bwdif_init_x86(BWDIFContext *bwdif);
+
+#endif /* AVFILTER_BWDIF_H */
diff --git a/libavfilter/vf_bwdif.c b/libavfilter/vf_bwdif.c
index 7985054..d402aa4 100644
--- a/libavfilter/vf_bwdif.c
+++ b/libavfilter/vf_bwdif.c
@@ -37,6 +37,7 @@
 #include "formats.h"
 #include "internal.h"
 #include "video.h"
+#include "bwdif.h"
 
 /*
  * Filter coefficients coef_lf and coef_hf taken from BBC PH-2071 (Weston 3 Field Deinterlacer).
@@ -48,51 +49,6 @@ static const uint16_t coef_lf[2] = { 4309, 213 };
 static const uint16_t coef_hf[3] = { 5570, 3801, 1016 };
 static const uint16_t coef_sp[2] = { 5077, 981 };
 
-enum BWDIFMode {
-BWDIF_MODE_SEND_FRAME = 0, ///< send 1 frame for each frame
-BWDIF_MODE_SEND_FIELD = 1, ///< send 1 frame for each field
-};
-
-enum BWDIFParity {
-BWDIF_PARITY_TFF  =  0, ///< top field first
-BWDIF_PARITY_BFF  =  1, ///< bottom field first
-BWDIF_PARITY_AUTO = -1, ///< auto detection
-};
-
-enum BWDIFDeint {
-BWDIF_DEINT_ALL= 0, ///< deinterlace all frames
-BWDIF_DEINT_INTERLACED = 1, ///< only deinterlace frames marked as interlaced
-};
-
-typedef struct BWDIFContext {
-const AVClass *class;
-
-int mode;   ///< BWDIFMode
-int parity; ///< BWDIFParity
-int deint;  ///< BWDIFDeint
-
-int frame_pending;
-
-AVFrame *cur;
-AVFrame *next;
-AVFrame *prev;
-AVFrame *out;
-
-void (*filter_intra)(void *dst1, void *cur1, int w, int prefs, int mrefs,
- int prefs3, int mrefs3, int parity, int clip_max);
-void (*filter_line)(void *dst, void *prev, void *cur, void *next,
-int w, int prefs, int mrefs, int prefs2, int mrefs2,
-int prefs3, int mrefs3, int prefs4, int mrefs4,
-int parity, int