Re: [libav-devel] [PATCH 8/8] hevcdsp: add x86 SIMD for MC

2015-08-19 Thread Anton Khirnov
Quoting James Almer (2015-08-20 00:34:58)
> On 19/08/15 4:43 PM, Anton Khirnov wrote:
> > ---
> >  libavcodec/hevc.c |   6 +-
> >  libavcodec/hevc.h |   2 +-
> >  libavcodec/hevcdsp.c  |  24 +-
> >  libavcodec/hevcdsp.h  |   5 +-
> >  libavcodec/hevcdsp_template.c |   8 +-
> >  libavcodec/x86/Makefile   |   3 +-
> >  libavcodec/x86/hevc_mc.asm| 816 ++
> >  libavcodec/x86/hevcdsp_init.c | 405 +
> >  8 files changed, 1258 insertions(+), 11 deletions(-)
> >  create mode 100644 libavcodec/x86/hevc_mc.asm
> 
> I'm getting segmentation faults with quite a few samples.
> For example http://www.elecard.com/assets/files/other/clips/bbb_1080p_c.ts

Cannot reproduce here. Can you give me more details (system, where
exactly does it crash, etc.)?

-- 
Anton Khirnov
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 8/8] hevcdsp: add x86 SIMD for MC

2015-08-19 Thread James Almer
On 19/08/15 8:23 PM, Ronald S. Bultje wrote:
> Hi,
> 
> On Wed, Aug 19, 2015 at 6:34 PM, James Almer  wrote:
> 
>> On 19/08/15 4:43 PM, Anton Khirnov wrote:
>>> ---
>>>  libavcodec/hevc.c |   6 +-
>>>  libavcodec/hevc.h |   2 +-
>>>  libavcodec/hevcdsp.c  |  24 +-
>>>  libavcodec/hevcdsp.h  |   5 +-
>>>  libavcodec/hevcdsp_template.c |   8 +-
>>>  libavcodec/x86/Makefile   |   3 +-
>>>  libavcodec/x86/hevc_mc.asm| 816 ++
>>>  libavcodec/x86/hevcdsp_init.c | 405 +
>>>  8 files changed, 1258 insertions(+), 11 deletions(-)
>>>  create mode 100644 libavcodec/x86/hevc_mc.asm
>>
>> I'm getting segmentation faults with quite a few samples.
>> For example http://www.elecard.com/assets/files/other/clips/bbb_1080p_c.ts
> 
> 
> So, at the risk of godwin, why was this reimplemented from scratch, rather
> than basing it on what ffmpeg has? How could this possibly be an advantage
> to our users?

Or OpenHEVC for that matter, which is the source of almost every hevc asm
optimization, x86 or otherwise, and a project that afaik branched off libav.

> 
> Ronald


Re: [libav-devel] [PATCH 8/8] hevcdsp: add x86 SIMD for MC

2015-08-19 Thread Ronald S. Bultje
Hi,

On Wed, Aug 19, 2015 at 6:34 PM, James Almer  wrote:

> On 19/08/15 4:43 PM, Anton Khirnov wrote:
> > ---
> >  libavcodec/hevc.c |   6 +-
> >  libavcodec/hevc.h |   2 +-
> >  libavcodec/hevcdsp.c  |  24 +-
> >  libavcodec/hevcdsp.h  |   5 +-
> >  libavcodec/hevcdsp_template.c |   8 +-
> >  libavcodec/x86/Makefile   |   3 +-
> >  libavcodec/x86/hevc_mc.asm| 816 ++
> >  libavcodec/x86/hevcdsp_init.c | 405 +
> >  8 files changed, 1258 insertions(+), 11 deletions(-)
> >  create mode 100644 libavcodec/x86/hevc_mc.asm
>
> I'm getting segmentation faults with quite a few samples.
> For example http://www.elecard.com/assets/files/other/clips/bbb_1080p_c.ts


So, at the risk of godwin, why was this reimplemented from scratch, rather
than basing it on what ffmpeg has? How could this possibly be an advantage
to our users?

Ronald


Re: [libav-devel] [PATCH 8/8] hevcdsp: add x86 SIMD for MC

2015-08-19 Thread James Almer
On 19/08/15 4:43 PM, Anton Khirnov wrote:
> ---
>  libavcodec/hevc.c |   6 +-
>  libavcodec/hevc.h |   2 +-
>  libavcodec/hevcdsp.c  |  24 +-
>  libavcodec/hevcdsp.h  |   5 +-
>  libavcodec/hevcdsp_template.c |   8 +-
>  libavcodec/x86/Makefile   |   3 +-
>  libavcodec/x86/hevc_mc.asm| 816 ++
>  libavcodec/x86/hevcdsp_init.c | 405 +
>  8 files changed, 1258 insertions(+), 11 deletions(-)
>  create mode 100644 libavcodec/x86/hevc_mc.asm

I'm getting segmentation faults with quite a few samples.
For example http://www.elecard.com/assets/files/other/clips/bbb_1080p_c.ts


Re: [libav-devel] [PATCH 2/2] imgutils: Fix a typo in avcodec_get_pix_fmt_loss

2015-08-19 Thread Anton Khirnov
Quoting Luca Barbato (2015-08-18 16:22:20)
> If the candidate does not have alpha and the source does have alpha
> report the loss of alpha.
> 
> CC: libav-sta...@libav.org
> ---
>  libavcodec/imgconvert.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/imgconvert.c b/libavcodec/imgconvert.c
> index 1f6d587..6b389ac 100644
> --- a/libavcodec/imgconvert.c
> +++ b/libavcodec/imgconvert.c
> @@ -76,7 +76,7 @@ int avcodec_get_pix_fmt_loss(enum AVPixelFormat dst_pix_fmt,
>  loss |= FF_LOSS_COLORSPACE;
>  
>  if (has_alpha && !(dst_desc->flags & AV_PIX_FMT_FLAG_ALPHA) &&
> - (dst_desc->flags & AV_PIX_FMT_FLAG_ALPHA))
> + (src_desc->flags & AV_PIX_FMT_FLAG_ALPHA))
>  loss |= FF_LOSS_ALPHA;
>  
>  if (dst_pix_fmt == AV_PIX_FMT_PAL8 && !is_gray(src_desc))
> -- 
> 2.5.0
> 

Ok

-- 
Anton Khirnov
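
For readers following the one-line fix above, here is a minimal sketch of why the old condition could never fire. The names `DESC_FLAG_ALPHA`, `LOSS_ALPHA` and `struct desc` are hypothetical stand-ins for the libav flags and descriptor type, not the real definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for AV_PIX_FMT_FLAG_ALPHA and FF_LOSS_ALPHA. */
#define DESC_FLAG_ALPHA 0x1u
#define LOSS_ALPHA      0x8

struct desc { unsigned flags; };

/* Old check: requires dst to both lack and have alpha, so it is never true. */
static int alpha_loss_old(const struct desc *src, const struct desc *dst,
                          bool has_alpha)
{
    (void)src;
    if (has_alpha && !(dst->flags & DESC_FLAG_ALPHA) &&
        (dst->flags & DESC_FLAG_ALPHA))
        return LOSS_ALPHA;
    return 0;
}

/* Fixed check: dst lacks alpha while src carries it. */
static int alpha_loss_new(const struct desc *src, const struct desc *dst,
                          bool has_alpha)
{
    if (has_alpha && !(dst->flags & DESC_FLAG_ALPHA) &&
        (src->flags & DESC_FLAG_ALPHA))
        return LOSS_ALPHA;
    return 0;
}
```

With an alpha source and a non-alpha destination, only the fixed variant reports the loss.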


Re: [libav-devel] [PATCH 1/2] imgutils: Use av_pix_fmt_get_chroma_sub_sample in avcodec_get_chroma_sub_sample

2015-08-19 Thread Anton Khirnov
Quoting Luca Barbato (2015-08-18 16:22:19)
> Avoid a NULL dereference.
> 
> CC: libav-sta...@libav.org
> ---
>  libavcodec/imgconvert.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/libavcodec/imgconvert.c b/libavcodec/imgconvert.c
> index 2d32602..1f6d587 100644
> --- a/libavcodec/imgconvert.c
> +++ b/libavcodec/imgconvert.c
> @@ -41,9 +41,7 @@
>  
>  void avcodec_get_chroma_sub_sample(enum AVPixelFormat pix_fmt, int *h_shift, 
> int *v_shift)
>  {
> -const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
> -*h_shift = desc->log2_chroma_w;
> -*v_shift = desc->log2_chroma_h;
> +av_pix_fmt_get_chroma_sub_sample(pix_fmt, h_shift, v_shift);
>  }

Since this function has no way to signal errors, I would argue that a
segfault would be better than leaving random possibly uninitialized data
around.

-- 
Anton Khirnov
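
The concern above can be illustrated with a small sketch. It assumes the checked helper returns an error without writing its outputs when the format is unknown; `desc_get`, `get_chroma_sub_sample` and the two-entry table are hypothetical and only mimic the shape of the libav functions.

```c
#include <stddef.h>

struct fmt_desc { int log2_chroma_w, log2_chroma_h; };

/* Hypothetical two-entry descriptor table: format 0 is 4:2:0, format 1 is 4:4:4. */
static const struct fmt_desc desc_table[] = { { 1, 1 }, { 0, 0 } };

static const struct fmt_desc *desc_get(int fmt)
{
    if (fmt < 0 || fmt >= (int)(sizeof(desc_table) / sizeof(desc_table[0])))
        return NULL;
    return &desc_table[fmt];
}

/* Checked variant: reports failure and leaves the outputs untouched. */
static int get_chroma_sub_sample(int fmt, int *h_shift, int *v_shift)
{
    const struct fmt_desc *desc = desc_get(fmt);
    if (!desc)
        return -1;
    *h_shift = desc->log2_chroma_w;
    *v_shift = desc->log2_chroma_h;
    return 0;
}

/* void wrapper: the error is dropped, so callers may read stale values. */
static void legacy_get_chroma_sub_sample(int fmt, int *h_shift, int *v_shift)
{
    get_chroma_sub_sample(fmt, h_shift, v_shift);
}
```

A caller that passes an invalid format to the wrapper gets no crash and no error; whatever was in `h_shift` and `v_shift` before the call is still there afterwards.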


[libav-devel] [PATCH 2/8] hevcdsp: fix a function name

2015-08-19 Thread Anton Khirnov
put_weighted_pred_avg should be put_unweighted_pred_avg, there is no
weighting there.
---
 libavcodec/hevc.c | 8 
 libavcodec/hevcdsp.c  | 2 +-
 libavcodec/hevcdsp.h  | 6 +++---
 libavcodec/hevcdsp_template.c | 8 
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/libavcodec/hevc.c b/libavcodec/hevc.c
index 6395563..f17c313 100644
--- a/libavcodec/hevc.c
+++ b/libavcodec/hevc.c
@@ -1806,8 +1806,8 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
  dst0, s->frame->linesize[0],
  tmp, tmp2, tmpstride, nPbW, nPbH);
 } else {
-s->hevcdsp.put_weighted_pred_avg(dst0, s->frame->linesize[0],
- tmp, tmp2, tmpstride, nPbW, nPbH);
+s->hevcdsp.put_unweighted_pred_avg(dst0, s->frame->linesize[0],
+   tmp, tmp2, tmpstride, nPbW, nPbH);
 }
 
 chroma_mc(s, tmp, tmp2, tmpstride, ref0->frame,
@@ -1832,8 +1832,8 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
  dst2, s->frame->linesize[2], tmp2, tmp4,
  tmpstride, nPbW / 2, nPbH / 2);
 } else {
-s->hevcdsp.put_weighted_pred_avg(dst1, s->frame->linesize[1], tmp, tmp3, tmpstride, nPbW/2, nPbH/2);
-s->hevcdsp.put_weighted_pred_avg(dst2, s->frame->linesize[2], tmp2, tmp4, tmpstride, nPbW/2, nPbH/2);
+s->hevcdsp.put_unweighted_pred_avg(dst1, s->frame->linesize[1], tmp, tmp3, tmpstride, nPbW/2, nPbH/2);
+s->hevcdsp.put_unweighted_pred_avg(dst2, s->frame->linesize[2], tmp2, tmp4, tmpstride, nPbW/2, nPbH/2);
 }
 }
 }
diff --git a/libavcodec/hevcdsp.c b/libavcodec/hevcdsp.c
index 0abee9b..216101a 100644
--- a/libavcodec/hevcdsp.c
+++ b/libavcodec/hevcdsp.c
@@ -162,7 +162,7 @@ void ff_hevc_dsp_init(HEVCDSPContext *hevcdsp, int bit_depth)
 hevcdsp->put_hevc_epel[1][1] = FUNC(put_hevc_epel_hv, depth);   \
 \
 hevcdsp->put_unweighted_pred   = FUNC(put_unweighted_pred, depth);  \
-hevcdsp->put_weighted_pred_avg = FUNC(put_weighted_pred_avg, depth);\
+hevcdsp->put_unweighted_pred_avg = FUNC(put_unweighted_pred_avg, depth);   \
 \
 hevcdsp->weighted_pred = FUNC(weighted_pred, depth);\
 hevcdsp->weighted_pred_avg = FUNC(weighted_pred_avg, depth);\
diff --git a/libavcodec/hevcdsp.h b/libavcodec/hevcdsp.h
index aad96db..7278464 100644
--- a/libavcodec/hevcdsp.h
+++ b/libavcodec/hevcdsp.h
@@ -67,9 +67,9 @@ typedef struct HEVCDSPContext {
 
 void (*put_unweighted_pred)(uint8_t *dst, ptrdiff_t dststride, int16_t *src,
 ptrdiff_t srcstride, int width, int height);
-void (*put_weighted_pred_avg)(uint8_t *dst, ptrdiff_t dststride,
-  int16_t *src1, int16_t *src2,
-  ptrdiff_t srcstride, int width, int height);
+void (*put_unweighted_pred_avg)(uint8_t *dst, ptrdiff_t dststride,
+int16_t *src1, int16_t *src2,
+ptrdiff_t srcstride, int width, int height);
 void (*weighted_pred)(uint8_t denom, int16_t wlxFlag, int16_t olxFlag,
   uint8_t *dst, ptrdiff_t dststride, int16_t *src,
   ptrdiff_t srcstride, int width, int height);
diff --git a/libavcodec/hevcdsp_template.c b/libavcodec/hevcdsp_template.c
index ae7e021..390f683 100644
--- a/libavcodec/hevcdsp_template.c
+++ b/libavcodec/hevcdsp_template.c
@@ -1033,10 +1033,10 @@ static void FUNC(put_unweighted_pred)(uint8_t *_dst, ptrdiff_t _dststride,
 }
 }
 
-static void FUNC(put_weighted_pred_avg)(uint8_t *_dst, ptrdiff_t _dststride,
-int16_t *src1, int16_t *src2,
-ptrdiff_t srcstride,
-int width, int height)
+static void FUNC(put_unweighted_pred_avg)(uint8_t *_dst, ptrdiff_t _dststride,
+  int16_t *src1, int16_t *src2,
+  ptrdiff_t srcstride,
+  int width, int height)
 {
 int x, y;
 pixel *dst  = (pixel *)_dst;
-- 
2.0.0
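
To illustrate why "unweighted" is the right name, here is a minimal sketch of the bi-prediction average for the 8-bit case. It assumes 14-bit intermediates (matching the `<< (14 - BIT_DEPTH)` seen elsewhere in this series), strides counted in elements, and an arithmetic right shift; `clip_u8` and `unweighted_pred_avg8` are illustrative names, not the functions from the patch.

```c
#include <stddef.h>
#include <stdint.h>

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Average two 14-bit intermediate predictions into 8-bit pixels.
 * No per-reference weight or offset is applied, only rounding. */
static void unweighted_pred_avg8(uint8_t *dst, ptrdiff_t dststride,
                                 const int16_t *src1, const int16_t *src2,
                                 ptrdiff_t srcstride, int width, int height)
{
    const int shift  = 14 + 1 - 8;       /* drop precision and divide by 2 */
    const int offset = 1 << (shift - 1); /* rounding term */

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = clip_u8((src1[x] + src2[x] + offset) >> shift);
        dst  += dststride;
        src1 += srcstride;
        src2 += srcstride;
    }
}
```

Feeding it the intermediates for pixel values 100 and 50 (each shifted left by 6) yields 75, the plain average.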



[libav-devel] [PATCH 5/8] hevcdsp: split the pred functions by width

2015-08-19 Thread Anton Khirnov
This should allow for more efficient SIMD.
---
 libavcodec/hevc.c | 118 +-
 libavcodec/hevcdsp.c  |  33 ++--
 libavcodec/hevcdsp.h  |  36 -
 libavcodec/hevcdsp_template.c |  81 ++---
 4 files changed, 174 insertions(+), 94 deletions(-)

diff --git a/libavcodec/hevc.c b/libavcodec/hevc.c
index 3dc510d..5da8249 100644
--- a/libavcodec/hevc.c
+++ b/libavcodec/hevc.c
@@ -1725,32 +1725,32 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
-s->hevcdsp.weighted_pred(s->sh.luma_log2_weight_denom,
- s->sh.luma_weight_l0[current_mv.ref_idx[0]],
- s->sh.luma_offset_l0[current_mv.ref_idx[0]],
- dst0, s->frame->linesize[0], tmp,
- tmpstride, nPbW, nPbH);
+s->hevcdsp.weighted_pred[pred_idx](s->sh.luma_log2_weight_denom,
+   s->sh.luma_weight_l0[current_mv.ref_idx[0]],
+   s->sh.luma_offset_l0[current_mv.ref_idx[0]],
+   dst0, s->frame->linesize[0], tmp,
+   tmpstride, nPbH);
 } else {
-s->hevcdsp.put_unweighted_pred(dst0, s->frame->linesize[0], tmp, tmpstride, nPbW, nPbH);
+s->hevcdsp.put_unweighted_pred[pred_idx](dst0, s->frame->linesize[0], tmp, tmpstride, nPbH);
 }
 chroma_mc(s, tmp, tmp2, tmpstride, ref0->frame,
   &current_mv.mv[0], x0 / 2, y0 / 2, nPbW / 2, nPbH / 2, pred_idx);
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
-s->hevcdsp.weighted_pred(s->sh.chroma_log2_weight_denom,
- s->sh.chroma_weight_l0[current_mv.ref_idx[0]][0],
- s->sh.chroma_offset_l0[current_mv.ref_idx[0]][0],
- dst1, s->frame->linesize[1], tmp, tmpstride,
- nPbW / 2, nPbH / 2);
-s->hevcdsp.weighted_pred(s->sh.chroma_log2_weight_denom,
- s->sh.chroma_weight_l0[current_mv.ref_idx[0]][1],
- s->sh.chroma_offset_l0[current_mv.ref_idx[0]][1],
- dst2, s->frame->linesize[2], tmp2, tmpstride,
- nPbW / 2, nPbH / 2);
+s->hevcdsp.weighted_pred_chroma[pred_idx](s->sh.chroma_log2_weight_denom,
+  s->sh.chroma_weight_l0[current_mv.ref_idx[0]][0],
+  s->sh.chroma_offset_l0[current_mv.ref_idx[0]][0],
+  dst1, s->frame->linesize[1], tmp, tmpstride,
+  nPbH / 2);
+s->hevcdsp.weighted_pred_chroma[pred_idx](s->sh.chroma_log2_weight_denom,
+  s->sh.chroma_weight_l0[current_mv.ref_idx[0]][1],
+  s->sh.chroma_offset_l0[current_mv.ref_idx[0]][1],
+  dst2, s->frame->linesize[2], tmp2, tmpstride,
+  nPbH / 2);
 } else {
-s->hevcdsp.put_unweighted_pred(dst1, s->frame->linesize[1], tmp, tmpstride, nPbW/2, nPbH/2);
-s->hevcdsp.put_unweighted_pred(dst2, s->frame->linesize[2], tmp2, tmpstride, nPbW/2, nPbH/2);
+s->hevcdsp.put_unweighted_pred_chroma[pred_idx](dst1, s->frame->linesize[1], tmp,  tmpstride, nPbH/2);
+s->hevcdsp.put_unweighted_pred_chroma[pred_idx](dst2, s->frame->linesize[2], tmp2, tmpstride, nPbH/2);
 }
 } else if (!current_mv.pred_flag[0] && current_mv.pred_flag[1]) {
 DECLARE_ALIGNED(16, int16_t, tmp [MAX_PB_SIZE * MAX_PB_SIZE]);
@@ -1761,13 +1761,13 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
-s->hevcdsp.weighted_pred(s->sh.luma_log2_weight_denom,
-  s->sh.luma_weight_l1[current_mv.ref_idx[1]],
-  s->sh.luma_offset_l1[current_mv.ref_idx[1]],
-  dst0, s->frame->linesize[0], tmp, tmpstride,
- 

[libav-devel] [PATCH 7/8] checkasm: add HEVC MC tests

2015-08-19 Thread Anton Khirnov
---
 tests/checkasm/Makefile   |   1 +
 tests/checkasm/checkasm.c |   3 +
 tests/checkasm/checkasm.h |   1 +
 tests/checkasm/hevc_mc.c  | 294 ++
 4 files changed, 299 insertions(+)
 create mode 100644 tests/checkasm/hevc_mc.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index 9498ebf..4024137 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -2,6 +2,7 @@
 AVCODECOBJS-$(CONFIG_BSWAPDSP) += bswapdsp.o
 AVCODECOBJS-$(CONFIG_H264PRED) += h264pred.o
 AVCODECOBJS-$(CONFIG_H264QPEL) += h264qpel.o
+AVCODECOBJS-$(CONFIG_HEVC_DECODER) += hevc_mc.o
 
 CHECKASMOBJS-$(CONFIG_AVCODEC) += $(AVCODECOBJS-yes)
 
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index b564e7e..1984bca 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -66,6 +66,9 @@ static const struct {
 #if CONFIG_H264QPEL
 { "h264qpel", checkasm_check_h264qpel },
 #endif
+#if CONFIG_HEVC_DECODER
+{ "hevc_mc", checkasm_check_hevc_mc },
+#endif
 { NULL }
 };
 
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 443546a..8ef250e 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -32,6 +32,7 @@
 void checkasm_check_bswapdsp(void);
 void checkasm_check_h264pred(void);
 void checkasm_check_h264qpel(void);
+void checkasm_check_hevc_mc(void);
 
 intptr_t (*checkasm_check_func(intptr_t (*func)(), const char *name, ...))() av_printf_format(2, 3);
 int checkasm_bench_func(void);
diff --git a/tests/checkasm/hevc_mc.c b/tests/checkasm/hevc_mc.c
new file mode 100644
index 0000000..b1b1dd6
--- /dev/null
+++ b/tests/checkasm/hevc_mc.c
@@ -0,0 +1,294 @@
+/*
+ * Copyright (c) 2015 Anton Khirnov
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * Libav is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with Libav; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <string.h>
+
+#include "checkasm.h"
+
+#include "libavcodec/avcodec.h"
+#include "libavcodec/hevcdsp.h"
+
+#include "libavutil/common.h"
+#include "libavutil/intreadwrite.h"
+
+// max PU size + interpolation stencil
+#define BUF_SIZE (FFALIGN(64 + 7, 16) * (64 + 7) * 2)
+
+#define PIXEL_SIZE(depth) ((depth + 7) / 8)
+
+#define randomize_buffers(buf, size, depth) \
+do {\
+uint32_t mask = pixel_mask[depth - 8];  \
+int i;  \
+for (i = 0; i < size; i += 4) { \
+uint32_t r = rnd() & mask;  \
+AV_WN32A(buf + i, r);   \
+}   \
+} while (0)
+
+static const uint32_t pixel_mask[3] = { 0xffffffff, 0x01ff01ff, 0x03ff03ff };
+
+static const int pred_heights[][7] = {
+[2]  = {  8, 4, 2, 0 },
+[4]  = { 16, 8, 4, 2, 0 },
+[6]  = {  8, 0 },
+[8]  = { 32, 16, 8, 4, 2, 0 },
+[12] = { 16, 0 },
+[16] = { 64, 32, 16, 12, 8, 4, 0 },
+[24] = { 32, 0 },
+[32] = { 64, 32, 24, 16, 8, 0 },
+[48] = { 64, 0 },
+[64] = { 64, 48, 32, 16, 0 },
+};
+
+static const int pred_widths[] = { 4, 8, 12, 16, 24, 32, 48, 64 };
+
+static const char *interp_names[2][2] = { { "pixels", "h" }, { "v", "hv" } };
+
+static void unweighted_pred(uint8_t *dst0, uint8_t *dst1, int16_t *src0, int16_t *src1,
+int width, int bit_depth)
+{
+const int srcstride = FFALIGN(width, 16) * sizeof(*src0);
+const int dststride = FFALIGN(width, 16) * PIXEL_SIZE(bit_depth);
+int i;
+
+randomize_buffers(src0, BUF_SIZE * sizeof(*src0), 8);
+if (src1)
+randomize_buffers(src1, BUF_SIZE * sizeof(*src1), 8);
+
+memset(dst0, 0, BUF_SIZE * sizeof(*dst0));
+memset(dst1, 0, BUF_SIZE * sizeof(*dst1));
+
+for (i = 0; i < FF_ARRAY_ELEMS(pred_heights[i]); i++) {
+int height = pred_heights[width][i];
+
+if (!height)
+break;
+
+if (!src1) {
+call_ref(dst0, dststride, src0,   srcstride, height);
+call_new(dst1, dststride, src0,   srcstride, height);
+} else {
+call_ref(dst0, dststride, src0, src1, srcstride, height);
+call_new(dst1, dststride, src0, src1, srcstride, height);
+}
+if (memcmp(dst0, dst1, dststride * height))
+fail();
+
+if (!src1)
+bench_new(dst1, dststride, src0,   srcstride, h
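
The buffer randomization in this test can be sketched on its own as follows. This is an illustration under stated assumptions, not the checkasm code: `lcg_next` is a hypothetical stand-in for checkasm's `rnd()`, `memcpy` replaces `AV_WN32A`, and the first mask entry is assumed to be `0xffffffff`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One mask per bit depth 8, 9, 10: every pixel lane in a 32-bit word is
 * limited to the valid sample range for that depth. */
static const uint32_t pixel_mask[3] = { 0xffffffff, 0x01ff01ff, 0x03ff03ff };

/* Hypothetical stand-in for rnd(): a small linear congruential generator. */
static uint32_t lcg_state = 1;
static uint32_t lcg_next(void)
{
    lcg_state = lcg_state * 1664525u + 1013904223u;
    return lcg_state;
}

/* Fill size bytes (rounded down to a multiple of 4) with random pixels
 * that are valid for the given bit depth (8, 9 or 10). */
static void fill_random(uint8_t *buf, size_t size, int depth)
{
    const uint32_t mask = pixel_mask[depth - 8];

    for (size_t i = 0; i + 4 <= size; i += 4) {
        uint32_t r = lcg_next() & mask;
        memcpy(buf + i, &r, sizeof(r));
    }
}
```

Because both 16-bit halves of the mask are identical, the result is independent of byte order: every 16-bit sample read back from a 10-bit buffer is at most 0x3ff.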

[libav-devel] [PATCH 1/8] hevc: avoid invalid shifts of negative values

2015-08-19 Thread Anton Khirnov
---
 libavcodec/hevc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libavcodec/hevc.c b/libavcodec/hevc.c
index 0dfe7a2..6395563 100644
--- a/libavcodec/hevc.c
+++ b/libavcodec/hevc.c
@@ -1494,7 +1494,7 @@ static void luma_mc(HEVCContext *s, int16_t *dst, ptrdiff_t dststride,
 
 x_off += mv->x >> 2;
 y_off += mv->y >> 2;
-src   += y_off * srcstride + (x_off << s->ps.sps->pixel_shift);
+src   += y_off * srcstride + (x_off * (1 << s->ps.sps->pixel_shift));
 
 if (x_off < extra_left || y_off < extra_top ||
 x_off >= pic_width - block_w - ff_hevc_qpel_extra_after[mx] ||
@@ -1548,8 +1548,8 @@ static void chroma_mc(HEVCContext *s, int16_t *dst1, int16_t *dst2,
 
 x_off += mv->x >> 3;
 y_off += mv->y >> 3;
-src1  += y_off * src1stride + (x_off << s->ps.sps->pixel_shift);
-src2  += y_off * src2stride + (x_off << s->ps.sps->pixel_shift);
+src1  += y_off * src1stride + (x_off * (1 << s->ps.sps->pixel_shift));
+src2  += y_off * src2stride + (x_off * (1 << s->ps.sps->pixel_shift));
 
 if (x_off < EPEL_EXTRA_BEFORE || y_off < EPEL_EXTRA_AFTER ||
 x_off >= pic_width - block_w - EPEL_EXTRA_AFTER ||
-- 
2.0.0
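
As background for this change: left-shifting a negative signed value is undefined behaviour in C, while multiplying by a power of two is well defined as long as the result fits. A minimal sketch of the offset computation, with `byte_offset` as a hypothetical helper name:

```c
#include <stddef.h>

/* Byte offset of pixel (x_off, y_off) in a plane.
 * x_off may be negative when the motion vector points left of the frame,
 * so scale it by multiplication rather than by a left shift. */
static ptrdiff_t byte_offset(int x_off, int y_off, ptrdiff_t stride,
                             int pixel_shift)
{
    return y_off * stride + x_off * (1 << pixel_shift);
}
```

For example, with 16-bit pixels (`pixel_shift` of 1), a stride of 64 bytes and position (-3, 2), the offset is 2 * 64 - 6 = 122 bytes.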



[libav-devel] [PATCH 4/8] hevcdsp: split the epel functions by width

2015-08-19 Thread Anton Khirnov
This should allow for more efficient SIMD.
---
 libavcodec/hevc.c | 29 --
 libavcodec/hevcdsp.c  | 18 ---
 libavcodec/hevcdsp.h  |  6 ++--
 libavcodec/hevcdsp_template.c | 71 +--
 4 files changed, 85 insertions(+), 39 deletions(-)

diff --git a/libavcodec/hevc.c b/libavcodec/hevc.c
index 7070106..3dc510d 100644
--- a/libavcodec/hevc.c
+++ b/libavcodec/hevc.c
@@ -1533,7 +1533,7 @@ static void luma_mc(HEVCContext *s, int16_t *dst, ptrdiff_t dststride,
  */
 static void chroma_mc(HEVCContext *s, int16_t *dst1, int16_t *dst2,
   ptrdiff_t dststride, AVFrame *ref, const Mv *mv,
-  int x_off, int y_off, int block_w, int block_h)
+  int x_off, int y_off, int block_w, int block_h, int pred_idx)
 {
 HEVCLocalContext *lc = &s->HEVClc;
 uint8_t *src1= ref->data[1];
@@ -1571,8 +1571,8 @@ static void chroma_mc(HEVCContext *s, int16_t *dst1, int16_t *dst2,
 
 src1 = lc->edge_emu_buffer + buf_offset1;
 src1stride = edge_emu_stride;
-s->hevcdsp.put_hevc_epel[!!my][!!mx](dst1, dststride, src1, src1stride,
- block_w, block_h, mx, my, lc->mc_buffer);
+s->hevcdsp.put_hevc_epel[!!my][!!mx][pred_idx](dst1, dststride, src1, src1stride,
+   block_h, mx, my, lc->mc_buffer);
 
 s->vdsp.emulated_edge_mc(lc->edge_emu_buffer, src2 - offset2,
  edge_emu_stride, src2stride,
@@ -1583,16 +1583,13 @@ static void chroma_mc(HEVCContext *s, int16_t *dst1, int16_t *dst2,
 src2 = lc->edge_emu_buffer + buf_offset2;
 src2stride = edge_emu_stride;
 
-s->hevcdsp.put_hevc_epel[!!my][!!mx](dst2, dststride, src2, src2stride,
- block_w, block_h, mx, my,
- lc->mc_buffer);
+s->hevcdsp.put_hevc_epel[!!my][!!mx][pred_idx](dst2, dststride, src2, src2stride,
+   block_h, mx, my, lc->mc_buffer);
 } else {
-s->hevcdsp.put_hevc_epel[!!my][!!mx](dst1, dststride, src1, src1stride,
- block_w, block_h, mx, my,
- lc->mc_buffer);
-s->hevcdsp.put_hevc_epel[!!my][!!mx](dst2, dststride, src2, src2stride,
- block_w, block_h, mx, my,
- lc->mc_buffer);
+s->hevcdsp.put_hevc_epel[!!my][!!mx][pred_idx](dst1, dststride, src1, src1stride,
+   block_h, mx, my, lc->mc_buffer);
+s->hevcdsp.put_hevc_epel[!!my][!!mx][pred_idx](dst2, dststride, src2, src2stride,
+   block_h, mx, my, lc->mc_buffer);
 }
 }
 
@@ -1737,7 +1734,7 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 s->hevcdsp.put_unweighted_pred(dst0, s->frame->linesize[0], tmp, tmpstride, nPbW, nPbH);
 }
 chroma_mc(s, tmp, tmp2, tmpstride, ref0->frame,
-  &current_mv.mv[0], x0 / 2, y0 / 2, nPbW / 2, nPbH / 2);
+  &current_mv.mv[0], x0 / 2, y0 / 2, nPbW / 2, nPbH / 2, pred_idx);
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
@@ -1774,7 +1771,7 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 }
 
 chroma_mc(s, tmp, tmp2, tmpstride, ref1->frame,
-  &current_mv.mv[1], x0/2, y0/2, nPbW/2, nPbH/2);
+  &current_mv.mv[1], x0/2, y0/2, nPbW/2, nPbH/2, pred_idx);
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
@@ -1816,9 +1813,9 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 }
 
 chroma_mc(s, tmp, tmp2, tmpstride, ref0->frame,
-  &current_mv.mv[0], x0 / 2, y0 / 2, nPbW / 2, nPbH / 2);
+  &current_mv.mv[0], x0 / 2, y0 / 2, nPbW / 2, nPbH / 2, pred_idx);
 chroma_mc(s, tmp3, tmp4, tmpstride, ref1->frame,
-  &current_mv.mv[1], x0 / 2, y0 / 2, nPbW / 2, nPbH / 2);
+  &current_mv.mv[1], x0 / 2, y0 / 2, nPbW / 2, nPbH / 2, pred_idx);
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
diff --git a/libavcodec/hevcdsp.c b/libavcodec/hevcdsp.c
index 4e311a6..7cb273b 100644
--- a/libavcodec/hevcdsp.c
+++ b/libavcodec/hevcdsp.c
@@ -122,6 +122,12 @@ void ff_hevc_dsp_init(HEVCDSPContext *hevcdsp, int bit_depth)
 hevcdsp->put_hevc_qpel[1][0][i] = FUNC(put_hevc_qpel_v_  ## wi
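
The `[!!my][!!mx]` indexing used throughout this patch collapses the fractional motion-vector components to 0 or 1, so a 2x2 table is enough to pick between plain copy, horizontal, vertical and combined filtering. A minimal sketch, reusing the `interp_names` table from the checkasm patch in this series (`select_interp` is a hypothetical helper):

```c
/* Row index: is there a vertical fraction? Column: a horizontal one? */
static const char *const interp_names[2][2] = {
    { "pixels", "h"  },
    { "v",      "hv" },
};

/* mx and my are the fractional parts of the motion vector. Any nonzero
 * fraction selects the filtered variant; the exact fraction is passed to
 * the chosen function separately. */
static const char *select_interp(int mx, int my)
{
    return interp_names[!!my][!!mx];
}
```

The actual fraction still has to reach the filter, which is why the calls above now pass `mx` and `my` as arguments.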

[libav-devel] [PATCH 8/8] hevcdsp: add x86 SIMD for MC

2015-08-19 Thread Anton Khirnov
---
 libavcodec/hevc.c |   6 +-
 libavcodec/hevc.h |   2 +-
 libavcodec/hevcdsp.c  |  24 +-
 libavcodec/hevcdsp.h  |   5 +-
 libavcodec/hevcdsp_template.c |   8 +-
 libavcodec/x86/Makefile   |   3 +-
 libavcodec/x86/hevc_mc.asm| 816 ++
 libavcodec/x86/hevcdsp_init.c | 405 +
 8 files changed, 1258 insertions(+), 11 deletions(-)
 create mode 100644 libavcodec/x86/hevc_mc.asm

diff --git a/libavcodec/hevc.c b/libavcodec/hevc.c
index dd54525..9cae92c 100644
--- a/libavcodec/hevc.c
+++ b/libavcodec/hevc.c
@@ -38,9 +38,9 @@
 #include "golomb.h"
 #include "hevc.h"
 
-const uint8_t ff_hevc_qpel_extra_before[4] = { 0, 3, 3, 2 };
-const uint8_t ff_hevc_qpel_extra_after[4]  = { 0, 3, 4, 4 };
-const uint8_t ff_hevc_qpel_extra[4]= { 0, 6, 7, 6 };
+const uint8_t ff_hevc_qpel_extra_before[4] = { 0, 3, 3, 3 };
+const uint8_t ff_hevc_qpel_extra_after[4]  = { 0, 4, 4, 4 };
+const uint8_t ff_hevc_qpel_extra[4]= { 0, 7, 7, 7 };
 
 static const uint8_t scan_1x1[1] = { 0 };
 
diff --git a/libavcodec/hevc.h b/libavcodec/hevc.h
index c6e05bc..7c87a55 100644
--- a/libavcodec/hevc.h
+++ b/libavcodec/hevc.h
@@ -740,7 +740,7 @@ typedef struct HEVCPredContext {
 } HEVCPredContext;
 
 typedef struct HEVCLocalContext {
-DECLARE_ALIGNED(16, int16_t, mc_buffer[(MAX_PB_SIZE + 7) * MAX_PB_SIZE]);
+DECLARE_ALIGNED(16, int16_t, mc_buffer[(MAX_PB_SIZE + 24) * MAX_PB_SIZE]);
 uint8_t cabac_state[HEVC_CONTEXTS];
 
 uint8_t first_qp_group;
diff --git a/libavcodec/hevcdsp.c b/libavcodec/hevcdsp.c
index ab9ba3b..2b29b19 100644
--- a/libavcodec/hevcdsp.c
+++ b/libavcodec/hevcdsp.c
@@ -89,7 +89,7 @@ static const int8_t transform[32][32] = {
   90, -90,  88, -85,  82, -78,  73, -67,  61, -54,  46, -38,  31, -22,  13,  -4 },
 };
 
-DECLARE_ALIGNED(16, const int8_t, ff_hevc_epel_filters[7][16]) = {
+DECLARE_ALIGNED(16, const int16_t, ff_hevc_epel_coeffs[7][16]) = {
 { -2, 58, 10, -2, -2, 58, 10, -2, -2, 58, 10, -2, -2, 58, 10, -2 },
 { -4, 54, 16, -2, -4, 54, 16, -2, -4, 54, 16, -2, -4, 54, 16, -2 },
 { -6, 46, 28, -4, -6, 46, 28, -4, -6, 46, 28, -4, -6, 46, 28, -4 },
@@ -99,6 +99,28 @@ DECLARE_ALIGNED(16, const int8_t, ff_hevc_epel_filters[7][16]) = {
 { -2, 10, 58, -2, -2, 10, 58, -2, -2, 10, 58, -2, -2, 10, 58, -2 },
 };
 
+DECLARE_ALIGNED(16, const int8_t, ff_hevc_epel_coeffs8[7][16]) = {
+{ -2, 58, 10, -2, -2, 58, 10, -2, -2, 58, 10, -2, -2, 58, 10, -2 },
+{ -4, 54, 16, -2, -4, 54, 16, -2, -4, 54, 16, -2, -4, 54, 16, -2 },
+{ -6, 46, 28, -4, -6, 46, 28, -4, -6, 46, 28, -4, -6, 46, 28, -4 },
+{ -4, 36, 36, -4, -4, 36, 36, -4, -4, 36, 36, -4, -4, 36, 36, -4 },
+{ -4, 28, 46, -6, -4, 28, 46, -6, -4, 28, 46, -6, -4, 28, 46, -6 },
+{ -2, 16, 54, -4, -2, 16, 54, -4, -2, 16, 54, -4, -2, 16, 54, -4 },
+{ -2, 10, 58, -2, -2, 10, 58, -2, -2, 10, 58, -2, -2, 10, 58, -2 },
+};
+
+DECLARE_ALIGNED(16, const int16_t, ff_hevc_qpel_coeffs[3][8]) = {
+{ -1, 4, -10, 58, 17, -5,  1,  0 },
+{ -1, 4, -11, 40, 40, -11, 4, -1 },
+{  0, 1,  -5, 17, 58, -10, 4, -1 },
+};
+
+DECLARE_ALIGNED(16, const int8_t, ff_hevc_qpel_coeffs8[3][16]) = {
+{ -1, 4, -10, 58, 17, -5,  1,  0, -1, 4, -10, 58, 17, -5,  1,  0 },
+{ -1, 4, -11, 40, 40, -11, 4, -1, -1, 4, -11, 40, 40, -11, 4, -1 },
+{  0, 1,  -5, 17, 58, -10, 4, -1,  0, 1,  -5, 17, 58, -10, 4, -1 },
+};
+
 #define BIT_DEPTH 8
 #include "hevcdsp_template.c"
 #undef BIT_DEPTH
diff --git a/libavcodec/hevcdsp.h b/libavcodec/hevcdsp.h
index ee3aa70..4daa2e5 100644
--- a/libavcodec/hevcdsp.h
+++ b/libavcodec/hevcdsp.h
@@ -118,6 +118,9 @@ void ff_hevc_dsp_init(HEVCDSPContext *hpc, int bit_depth);
 
 void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth);
 
-extern const int8_t ff_hevc_epel_filters[7][16];
+extern const int16_t ff_hevc_epel_coeffs[7][16];
+extern const int8_t ff_hevc_epel_coeffs8[7][16];
+extern const int16_t ff_hevc_qpel_coeffs[3][8];
+extern const int8_t ff_hevc_qpel_coeffs8[3][16];
 
 #endif /* AVCODEC_HEVCDSP_H */
diff --git a/libavcodec/hevcdsp_template.c b/libavcodec/hevcdsp_template.c
index 8bb0a57..1f9adee 100644
--- a/libavcodec/hevcdsp_template.c
+++ b/libavcodec/hevcdsp_template.c
@@ -999,7 +999,7 @@ static inline void FUNC(put_hevc_epel_h)(int16_t *dst, ptrdiff_t dststride,
 int x, y;
 pixel *src = (pixel *)_src;
 ptrdiff_t srcstride  = _srcstride / sizeof(pixel);
-const int8_t *filter = ff_hevc_epel_filters[mx - 1];
+const int16_t *filter = ff_hevc_epel_coeffs[mx - 1];
 int8_t filter_0 = filter[0];
 int8_t filter_1 = filter[1];
 int8_t filter_2 = filter[2];
@@ -1021,7 +1021,7 @@ static inline void FUNC(put_hevc_epel_v)(int16_t *dst, ptrdiff_t dststride,
 int x, y;
 pixel *src = (pixel *)_src;
 ptrdiff_t srcstride = _srcstride / sizeof(pixel);
-const int8_t *filter = ff_hevc_epel_filters[my - 1];
+const int16_t *filter = ff_hevc_epel_co
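
As a scalar reference for what the coefficient tables above encode, here is a minimal sketch of one horizontal 8-tap luma sample. The table is copied from `ff_hevc_qpel_coeffs` in this patch; `qpel_h8` is a hypothetical helper and assumes the taps cover `src[-3]` through `src[4]`, matching the 3 samples before and 4 after in `ff_hevc_qpel_extra_before` and `ff_hevc_qpel_extra_after`.

```c
#include <stdint.h>

/* Same values as ff_hevc_qpel_coeffs: one row per quarter-pel fraction 1..3. */
static const int16_t qpel_coeffs[3][8] = {
    { -1, 4, -10, 58, 17,  -5, 1,  0 },
    { -1, 4, -11, 40, 40, -11, 4, -1 },
    {  0, 1,  -5, 17, 58, -10, 4, -1 },
};

/* Filter one 8-bit sample horizontally. mx must be 1, 2 or 3, and src
 * must have 3 readable samples before it and 4 after it. */
static int qpel_h8(const uint8_t *src, int mx)
{
    const int16_t *c = qpel_coeffs[mx - 1];
    int sum = 0;

    for (int i = 0; i < 8; i++)
        sum += c[i] * src[i - 3];
    return sum;
}
```

Each row sums to 64, so a flat area of value 100 filters to 6400 for every fraction, which is the same scale as the `<< 6` used for unfiltered 8-bit pixels.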

[libav-devel] [PATCH 6/8] hevc: change the stride of the MC buffer to be in bytes instead of elements

2015-08-19 Thread Anton Khirnov
Currently, the frame stride is passed in bytes, while the MC buffer stride
is in int16_t elements. This can be confusing, so pass both strides in
bytes.
---
 libavcodec/hevc.c |  2 +-
 libavcodec/hevcdsp_template.c | 12 
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/libavcodec/hevc.c b/libavcodec/hevc.c
index 5da8249..dd54525 100644
--- a/libavcodec/hevc.c
+++ b/libavcodec/hevc.c
@@ -1666,7 +1666,7 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 RefPicList  *refPicList = s->ref->refPicList;
 HEVCFrame *ref0, *ref1;
 
-int tmpstride = MAX_PB_SIZE;
+int tmpstride = MAX_PB_SIZE * sizeof(int16_t);
 
 uint8_t *dst0 = POS(0, x0, y0);
 uint8_t *dst1 = POS(1, x0, y0);
diff --git a/libavcodec/hevcdsp_template.c b/libavcodec/hevcdsp_template.c
index df570f3..8bb0a57 100644
--- a/libavcodec/hevcdsp_template.c
+++ b/libavcodec/hevcdsp_template.c
@@ -785,6 +785,7 @@ FUNC(put_hevc_qpel_pixels)(int16_t *dst, ptrdiff_t dststride,
 pixel *src  = (pixel *)_src;
 ptrdiff_t srcstride = _srcstride / sizeof(pixel);
 
+dststride /= sizeof(*dst);
 for (y = 0; y < height; y++) {
 for (x = 0; x < width; x++)
 dst[x] = src[x] << (14 - BIT_DEPTH);
@@ -832,6 +833,7 @@ static void FUNC(put_hevc_qpel_h ## H)(int16_t *dst,  ptrdiff_t dststride, \
 pixel *src = (pixel*)_src; \
 ptrdiff_t srcstride = _srcstride / sizeof(pixel);  \
 \
+dststride /= sizeof(*dst); \
 for (y = 0; y < height; y++) { \
 for (x = 0; x < width; x++) \
 dst[x] = QPEL_FILTER_ ## H(src, 1) >> (BIT_DEPTH - 8); \
@@ -850,6 +852,7 @@ static void FUNC(put_hevc_qpel_v ## V)(int16_t *dst,  ptrdiff_t dststride, \
 pixel *src = (pixel*)_src; \
 ptrdiff_t srcstride = _srcstride / sizeof(pixel);  \
 \
+dststride /= sizeof(*dst); \
 for (y = 0; y < height; y++)  { \
 for (x = 0; x < width; x++) \
 dst[x] = QPEL_FILTER_ ## V(src, srcstride) >> (BIT_DEPTH - 8); \
@@ -873,6 +876,7 @@ static void FUNC(put_hevc_qpel_h ## H ## v ## V)(int16_t *dst, \
 int16_t tmp_array[(MAX_PB_SIZE + 7) * MAX_PB_SIZE]; \
 int16_t *tmp = tmp_array;  \
 \
+dststride /= sizeof(*dst); \
 src -= ff_hevc_qpel_extra_before[V] * srcstride;   \
 \
 for (y = 0; y < height + ff_hevc_qpel_extra[V]; y++) { \
@@ -972,6 +976,7 @@ static inline void FUNC(put_hevc_epel_pixels)(int16_t *dst, ptrdiff_t dststride,
 pixel *src  = (pixel *)_src;
 ptrdiff_t srcstride = _srcstride / sizeof(pixel);
 
+dststride /= sizeof(*dst);
 for (y = 0; y < height; y++) {
 for (x = 0; x < width; x++)
 dst[x] = src[x] << (14 - BIT_DEPTH);
@@ -999,6 +1004,7 @@ static inline void FUNC(put_hevc_epel_h)(int16_t *dst, ptrdiff_t dststride,
 int8_t filter_1 = filter[1];
 int8_t filter_2 = filter[2];
 int8_t filter_3 = filter[3];
+dststride /= sizeof(*dst);
 for (y = 0; y < height; y++) {
 for (x = 0; x < width; x++)
 dst[x] = EPEL_FILTER(src, 1) >> (BIT_DEPTH - 8);
@@ -1021,6 +1027,7 @@ static inline void FUNC(put_hevc_epel_v)(int16_t *dst, ptrdiff_t dststride,
 int8_t filter_2 = filter[2];
 int8_t filter_3 = filter[3];
 
+dststride /= sizeof(*dst);
 for (y = 0; y < height; y++) {
 for (x = 0; x < width; x++)
 dst[x] = EPEL_FILTER(src, srcstride) >> (BIT_DEPTH - 8);
@@ -1046,6 +1053,7 @@ static inline void FUNC(put_hevc_epel_hv)(int16_t *dst, ptrdiff_t dststride,
 int16_t tmp_array[(MAX_PB_SIZE + 3) * MAX_PB_SIZE];
 int16_t *tmp = tmp_array;
 
+dststride /= sizeof(*dst);
 src -= EPEL_EXTRA_BEFORE * srcstride;
 
 for (y = 0; y < height + EPEL_EXTRA; y++) {
@@ -1122,6 +1130,7 @@ FUNC(put_unweighted_pred)(uint8_t *_dst, ptrdiff_t _dststride,
 #else
 int offset = 0;
 #endif
+srcstride /= sizeof(*src);
 for (y = 0; y < height; y++) {
 for (x = 0; x < width; x++)
 dst[x] = av_clip_pixel((src[x] + offset) >> shift);
@@ -114

[libav-devel] [PATCH 3/8] hevcdsp: split the qpel functions by width instead of by the subpixel fraction

2015-08-19 Thread Anton Khirnov
This should allow for more efficient SIMD.

Keep the C versions as they are now, to allow the compiler to inline the
interpolation coefficients.
---
 libavcodec/hevc.c | 19 -
 libavcodec/hevcdsp.c  | 30 ++---
 libavcodec/hevcdsp.h  |  6 ++---
 libavcodec/hevcdsp_template.c | 63 ---
 4 files changed, 89 insertions(+), 29 deletions(-)

diff --git a/libavcodec/hevc.c b/libavcodec/hevc.c
index f17c313..7070106 100644
--- a/libavcodec/hevc.c
+++ b/libavcodec/hevc.c
@@ -1479,7 +1479,7 @@ static void hls_mvd_coding(HEVCContext *s, int x0, int y0, int log2_cb_size)
  */
 static void luma_mc(HEVCContext *s, int16_t *dst, ptrdiff_t dststride,
 AVFrame *ref, const Mv *mv, int x_off, int y_off,
-int block_w, int block_h)
+int block_w, int block_h, int pred_idx)
 {
 HEVCLocalContext *lc = &s->HEVClc;
 uint8_t *src = ref->data[0];
@@ -1513,8 +1513,8 @@ static void luma_mc(HEVCContext *s, int16_t *dst, ptrdiff_t dststride,
 src = lc->edge_emu_buffer + buf_offset;
 srcstride = edge_emu_stride;
 }
-s->hevcdsp.put_hevc_qpel[my][mx](dst, dststride, src, srcstride, block_w,
- block_h, lc->mc_buffer);
+s->hevcdsp.put_hevc_qpel[!!my][!!mx][pred_idx](dst, dststride, src, srcstride,
+   block_h, mx, my, lc->mc_buffer);
 }
 
 /**
@@ -1651,6 +1651,11 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 int nPbW, int nPbH,
 int log2_cb_size, int partIdx)
 {
+static const int pred_indices[] = {
+[4] = 0, [8] = 1, [12] = 2, [16] = 3, [24] = 4, [32] = 5, [48] = 6, [64] = 7,
+};
+const int pred_idx = pred_indices[nPbW];
+
 #define POS(c_idx, x, y)   \
 &s->frame->data[c_idx][((y) >> s->ps.sps->vshift[c_idx]) * s->frame->linesize[c_idx] + \
 (((x) >> s->ps.sps->hshift[c_idx]) << s->ps.sps->pixel_shift)]
@@ -1719,7 +1724,7 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 DECLARE_ALIGNED(16, int16_t, tmp2[MAX_PB_SIZE * MAX_PB_SIZE]);
 
 luma_mc(s, tmp, tmpstride, ref0->frame,
-&current_mv.mv[0], x0, y0, nPbW, nPbH);
+&current_mv.mv[0], x0, y0, nPbW, nPbH, pred_idx);
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
@@ -1755,7 +1760,7 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 DECLARE_ALIGNED(16, int16_t, tmp2[MAX_PB_SIZE * MAX_PB_SIZE]);
 
 luma_mc(s, tmp, tmpstride, ref1->frame,
-&current_mv.mv[1], x0, y0, nPbW, nPbH);
+&current_mv.mv[1], x0, y0, nPbW, nPbH, pred_idx);
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
@@ -1792,9 +1797,9 @@ static void hls_prediction_unit(HEVCContext *s, int x0, int y0,
 DECLARE_ALIGNED(16, int16_t, tmp4[MAX_PB_SIZE * MAX_PB_SIZE]);
 
 luma_mc(s, tmp, tmpstride, ref0->frame,
-&current_mv.mv[0], x0, y0, nPbW, nPbH);
+&current_mv.mv[0], x0, y0, nPbW, nPbH, pred_idx);
 luma_mc(s, tmp2, tmpstride, ref1->frame,
-&current_mv.mv[1], x0, y0, nPbW, nPbH);
+&current_mv.mv[1], x0, y0, nPbW, nPbH, pred_idx);
 
 if ((s->sh.slice_type == P_SLICE && s->ps.pps->weighted_pred_flag) ||
 (s->sh.slice_type == B_SLICE && s->ps.pps->weighted_bipred_flag)) {
diff --git a/libavcodec/hevcdsp.c b/libavcodec/hevcdsp.c
index 216101a..4e311a6 100644
--- a/libavcodec/hevcdsp.c
+++ b/libavcodec/hevcdsp.c
@@ -116,6 +116,12 @@ void ff_hevc_dsp_init(HEVCDSPContext *hevcdsp, int bit_depth)
 #undef FUNC
 #define FUNC(a, depth) a ## _ ## depth
 
+#define QPEL_FUNC(i, width, depth)\
+hevcdsp->put_hevc_qpel[0][0][i] = FUNC(put_hevc_qpel_pixels_ ## width, depth);  \
+hevcdsp->put_hevc_qpel[0][1][i] = FUNC(put_hevc_qpel_h_  ## width, depth);  \
+hevcdsp->put_hevc_qpel[1][0][i] = FUNC(put_hevc_qpel_v_  ## width, depth);  \
+hevcdsp->put_hevc_qpel[1][1][i] = FUNC(put_hevc_qpel_hv_ ## width, depth);  \
+
 #define HEVC_DSP(depth) \
 hevcdsp->put_pcm= FUNC(put_pcm, depth); \
 hevcdsp->transquant_bypass[0]   = FUNC(transquant_bypass4x4, depth);\
@@ -139,22 +145,14 @@ void ff_hevc_dsp_init(HEVCDSPContext *hevcdsp, int bit_depth)
 hevcdsp->sao_edge_filter[2] = FUNC(sao_edge_filter_2, depth);   \
 hevcdsp->sao_edge_filter[3] = FUNC(sao_edge_filter_3, depth);   \
   

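The dispatch scheme in this patch (a `[!!my][!!mx][pred_idx]` lookup, with the block width mapped to an index through `pred_indices`) can be sketched in isolation. This is only a rough illustration under stated assumptions: every `sketch_`/`Sketch` name is hypothetical, and each table entry is a small tag struct standing in for the real per-width function pointers, so the selection logic can be checked without the decoder.

```c
#include <assert.h>

/* Width -> index, same values as pred_indices in the patch.
 * Widths that are not listed silently map to 0, as they would there. */
static const int sketch_pred_indices[65] = {
    [4] = 0, [8] = 1, [12] = 2, [16] = 3, [24] = 4, [32] = 5, [48] = 6, [64] = 7,
};

enum SketchKind { SKETCH_PIXELS, SKETCH_H, SKETCH_V, SKETCH_HV };

/* Hypothetical stand-in for a function pointer: records which variant
 * (copy, horizontal, vertical, both) and which width was selected. */
typedef struct SketchEntry {
    enum SketchKind kind;
    int width;
} SketchEntry;

typedef struct SketchDSP {
    SketchEntry put_qpel[2][2][8]; /* [vertical?][horizontal?][width index] */
} SketchDSP;

/* Mirrors the QPEL_FUNC(i, width, depth) macro: four entries per width. */
static void sketch_dsp_init(SketchDSP *dsp)
{
    static const int widths[8] = { 4, 8, 12, 16, 24, 32, 48, 64 };

    for (int i = 0; i < 8; i++) {
        dsp->put_qpel[0][0][i] = (SketchEntry){ SKETCH_PIXELS, widths[i] };
        dsp->put_qpel[0][1][i] = (SketchEntry){ SKETCH_H,      widths[i] };
        dsp->put_qpel[1][0][i] = (SketchEntry){ SKETCH_V,      widths[i] };
        dsp->put_qpel[1][1][i] = (SketchEntry){ SKETCH_HV,     widths[i] };
    }
}

/* mx/my are the quarter-pel fractions (0..3); only "zero or not" picks
 * the variant, the fraction itself would be passed on as an argument. */
static const SketchEntry *sketch_select(const SketchDSP *dsp,
                                        int width, int mx, int my)
{
    assert(width >= 0 && width <= 64);
    return &dsp->put_qpel[!!my][!!mx][sketch_pred_indices[width]];
}
```

With this layout, a 16-wide block with `mx = 2, my = 0` selects the horizontal-only entry for width 16, whatever the exact fraction is.
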
Re: [libav-devel] [PATCH 1/2] imgutils: Use av_pix_fmt_get_chroma_sub_sample in avcodec_get_chroma_sub_sample

2015-08-19 Thread Federico Tomassetti
Two newbie questions:
1) Is this function still needed? Could we just call
av_pix_fmt_get_chroma_sub_sample directly?
2) What about the return code? Isn't it lost? What happens in case of error?
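
To make question 2 concrete, here is a minimal sketch with stand-in types. Every `sketch_`/`Sketch` name is hypothetical and none of this is the libav code. It assumes the checked helper returns 0 on success and a negative error code when the format has no descriptor, which is my reading of av_pix_fmt_get_chroma_sub_sample; a void wrapper then drops that code and leaves the outputs untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pixel format enum and its descriptor. */
enum SketchPixFmt {
    SKETCH_FMT_NONE = -1,
    SKETCH_FMT_YUV420P,
    SKETCH_FMT_YUV444P,
    SKETCH_FMT_NB
};

typedef struct SketchDesc {
    int log2_chroma_w;
    int log2_chroma_h;
} SketchDesc;

static const SketchDesc sketch_descs[SKETCH_FMT_NB] = {
    [SKETCH_FMT_YUV420P] = { 1, 1 },
    [SKETCH_FMT_YUV444P] = { 0, 0 },
};

/* Returns NULL for an unknown format, like a descriptor lookup would. */
static const SketchDesc *sketch_desc_get(int fmt)
{
    if (fmt < 0 || fmt >= SKETCH_FMT_NB)
        return NULL;
    return &sketch_descs[fmt];
}

/* Checked helper: 0 on success, negative error code on failure.
 * On failure the output parameters are not written. */
static int sketch_get_chroma_sub_sample(int fmt, int *h_shift, int *v_shift)
{
    const SketchDesc *desc = sketch_desc_get(fmt);

    assert(h_shift && v_shift);
    if (!desc)
        return -ENOSYS;
    *h_shift = desc->log2_chroma_w;
    *v_shift = desc->log2_chroma_h;
    return 0;
}

/* void wrapper in the style of the patch: no NULL dereference any more,
 * but the caller cannot tell whether the shifts were filled in. */
static void sketch_legacy_get_chroma_sub_sample(int fmt, int *h_shift,
                                                int *v_shift)
{
    (void)sketch_get_chroma_sub_sample(fmt, h_shift, v_shift);
}
```

So in this sketch an invalid format no longer crashes, but a caller of the void wrapper keeps whatever values it had in h_shift and v_shift before the call.
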

Federico

On Tue, Aug 18, 2015 at 4:22 PM, Luca Barbato  wrote:
> Avoid a NULL dereference.
>
> CC: libav-sta...@libav.org
> ---
>  libavcodec/imgconvert.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/libavcodec/imgconvert.c b/libavcodec/imgconvert.c
> index 2d32602..1f6d587 100644
> --- a/libavcodec/imgconvert.c
> +++ b/libavcodec/imgconvert.c
> @@ -41,9 +41,7 @@
>
>  void avcodec_get_chroma_sub_sample(enum AVPixelFormat pix_fmt, int *h_shift, int *v_shift)
>  {
> -const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(pix_fmt);
> -*h_shift = desc->log2_chroma_w;
> -*v_shift = desc->log2_chroma_h;
> +av_pix_fmt_get_chroma_sub_sample(pix_fmt, h_shift, v_shift);
>  }
>
>  static int is_gray(const AVPixFmtDescriptor *desc)
> --
> 2.5.0
>



-- 
Website at http://tomassetti.me
GitHub https://github.com/ftomassetti


Re: [libav-devel] [PATCH 2/2] imgutils: Fix a typo in avcodec_get_pix_fmt_loss

2015-08-19 Thread Federico Tomassetti
It seems pretty straightforward to me.

Federico
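
For reference, the effect of the one-word change can be checked with a small stand-alone sketch. The `SKETCH_` flag values and function names are hypothetical, not the libav definitions; only the shape of the condition is taken from the quoted patch.

```c
#include <stdbool.h>

/* Hypothetical flag values, standing in for AV_PIX_FMT_FLAG_ALPHA
 * and FF_LOSS_ALPHA. */
#define SKETCH_FLAG_ALPHA (1u << 0)
#define SKETCH_LOSS_ALPHA (1 << 3)

/* Condition before the patch: it requires the destination both to lack
 * and to have alpha, so it can never be true. */
static int sketch_alpha_loss_old(unsigned src_flags, unsigned dst_flags,
                                 bool has_alpha)
{
    int loss = 0;

    (void)src_flags; /* the old condition never looked at the source */
    if (has_alpha && !(dst_flags & SKETCH_FLAG_ALPHA) &&
        (dst_flags & SKETCH_FLAG_ALPHA))
        loss |= SKETCH_LOSS_ALPHA;
    return loss;
}

/* Condition after the patch: alpha loss is reported when the source has
 * alpha and the destination does not. */
static int sketch_alpha_loss_new(unsigned src_flags, unsigned dst_flags,
                                 bool has_alpha)
{
    int loss = 0;

    if (has_alpha && !(dst_flags & SKETCH_FLAG_ALPHA) &&
        (src_flags & SKETCH_FLAG_ALPHA))
        loss |= SKETCH_LOSS_ALPHA;
    return loss;
}
```

With a source that has alpha and a destination that does not, the old condition reports no loss while the new one sets the alpha-loss bit.
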

On Tue, Aug 18, 2015 at 4:22 PM, Luca Barbato  wrote:
> If the candidate does not have alpha and the source does have alpha
> report the loss of alpha.
>
> CC: libav-sta...@libav.org
> ---
>  libavcodec/imgconvert.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/imgconvert.c b/libavcodec/imgconvert.c
> index 1f6d587..6b389ac 100644
> --- a/libavcodec/imgconvert.c
> +++ b/libavcodec/imgconvert.c
> @@ -76,7 +76,7 @@ int avcodec_get_pix_fmt_loss(enum AVPixelFormat dst_pix_fmt,
>  loss |= FF_LOSS_COLORSPACE;
>
>  if (has_alpha && !(dst_desc->flags & AV_PIX_FMT_FLAG_ALPHA) &&
> - (dst_desc->flags & AV_PIX_FMT_FLAG_ALPHA))
> + (src_desc->flags & AV_PIX_FMT_FLAG_ALPHA))
>  loss |= FF_LOSS_ALPHA;
>
>  if (dst_pix_fmt == AV_PIX_FMT_PAL8 && !is_gray(src_desc))
> --
> 2.5.0
>



-- 
Website at http://tomassetti.me
GitHub https://github.com/ftomassetti