Re: [libav-devel] [PATCH 18/24] lavu: OpenCL hwcontext implementation

2017-02-21 Thread wm4
On Tue, 21 Feb 2017 22:43:05 +
Mark Thompson  wrote:

> On 21/02/17 14:21, wm4 wrote:
> > On Tue, 21 Feb 2017 13:18:55 +
> > Mark Thompson  wrote:
> >   
> >> On 21/02/17 08:00, wm4 wrote:  
> >>> On Mon, 20 Feb 2017 23:06:21 +
> >>> Mark Thompson  wrote:
> >>> 
> >> +#ifdef CL_ABGR
> >> +if (0)  
> >
> > wut  
> 
>  CL_ABGR didn't exist in OpenCL 1.2, which is really the target.  It's 
>  present in later versions, though, so it's included here if possible for 
>  symmetry.
> >>>
> >>> I meant what do the "if (0)"s do here? They're highly confusing at
> >>> best.
> >>
> >> Avoiding more gotos?  This function should really be a table, but it's 
> >> just tricky enough to make that annoying.  I'll see if I can improve it.  
> > 
> > Yeah, I think that would be better. Also you could say you're using
> > switch-case as some sort of computed goto, which it really is, but ugly
> > and hard to read.  
> 
> Right, taking this function to its (perhaps il)logical conclusion I can 
> replace it with:
> 
> static int opencl_get_plane_format(enum AVPixelFormat pixfmt,
>int plane, int width, int height,
>cl_image_format *image_format,
>cl_image_desc *image_desc)
> {
> const AVPixFmtDescriptor *desc;
> const AVComponentDescriptor *comp;
> int channels = 0, order = 0, depth = 0, step = 0;
> int wsub, hsub, alpha;
> int c;
> 
> desc = av_pix_fmt_desc_get(pixfmt);
> 
> // Only normal images are allowed.
> if (desc->flags & (AV_PIX_FMT_FLAG_BITSTREAM |
>AV_PIX_FMT_FLAG_HWACCEL   |
>AV_PIX_FMT_FLAG_PAL))
> return AVERROR(EINVAL);
> 
> wsub = 1 << desc->log2_chroma_w;
> hsub = 1 << desc->log2_chroma_h;
> // Subsampled components must be exact.
> if (width & wsub - 1 || height & hsub - 1)
> return AVERROR(EINVAL);
> 
> for (c = 0; c < desc->nb_components; c++) {
> comp = >comp[c];
> if (comp->plane != plane)
> continue;
> // The step size must be a power of two.
> if (comp->step != 1 && comp->step != 2 &&
> comp->step != 4 && comp->step != 8)
> return AVERROR(EINVAL);
> // The bits in each component must be packed in the
> // most-significant-bits of the relevant bytes.
> if (comp->shift + comp->depth != 8 &&
> comp->shift + comp->depth != 16)
> return AVERROR(EINVAL);
> // The depth must not vary between components.
> if (depth && comp->depth != depth)
> return AVERROR(EINVAL);
> // If a single data element crosses multiple bytes then
> // it must match the native endianness.
> if (comp->depth > 8 &&
> HAVE_BIGENDIAN == !(desc->flags & AV_PIX_FMT_FLAG_BE))
> return AVERROR(EINVAL);
> // A single data element must not contain multiple samples
> // from the same component.
> if (step && comp->step != step)
> return AVERROR(EINVAL);
> order = order * 10 + c + 1;
> depth = comp->depth;
> step  = comp->step;
> alpha = (desc->flags & AV_PIX_FMT_FLAG_ALPHA &&
>  c == desc->nb_components - 1);
> ++channels;
> }
> if (channels == 0)
> return AVERROR(ENOENT);
> 
> memset(image_format, 0, sizeof(*image_format));
> memset(image_desc,   0, sizeof(*image_desc));
> image_desc->image_type = CL_MEM_OBJECT_IMAGE2D;
> 
> if (plane == 0 || alpha) {
> image_desc->image_width = width;
> image_desc->image_height= height;
> image_desc->image_row_pitch = step * width;
> } else {
> image_desc->image_width = width  / wsub;
> image_desc->image_height= height / hsub;
> image_desc->image_row_pitch = step * width / wsub;
> }
> 
> if (depth <= 8) {
> image_format->image_channel_data_type = CL_UNORM_INT8;
> } else {
> if (depth <= 16)
> image_format->image_channel_data_type = CL_UNORM_INT16;
> else
> return AVERROR(EINVAL);
> }
> 
> #define CHANNEL_ORDER(order, type) \
> case order: image_format->image_channel_order = type; break;
> switch (order) {
> CHANNEL_ORDER(1,CL_R);
> CHANNEL_ORDER(2,CL_R);
> CHANNEL_ORDER(3,CL_R);
> CHANNEL_ORDER(4,CL_R);
> CHANNEL_ORDER(12,   CL_RG);
> CHANNEL_ORDER(23,   CL_RG);
> CHANNEL_ORDER(1234, CL_RGBA);
> CHANNEL_ORDER(3214, CL_BGRA);
> CHANNEL_ORDER(4123, CL_ARGB);
> #ifdef CL_ABGR
> CHANNEL_ORDER(4321, CL_ABGR);
> #endif
> default:
> return AVERROR(EINVAL);
> }
> #undef CHANNEL_ORDER
> 
> return 0;
> }
> 
> ...which gives 

Re: [libav-devel] [PATCH 18/24] lavu: OpenCL hwcontext implementation

2017-02-21 Thread Luca Barbato
On 21/02/2017 23:43, Mark Thompson wrote:
> On 21/02/17 14:21, wm4 wrote:
>> On Tue, 21 Feb 2017 13:18:55 +
>> Mark Thompson  wrote:
>>
>>> On 21/02/17 08:00, wm4 wrote:
 On Mon, 20 Feb 2017 23:06:21 +
 Mark Thompson  wrote:
   
>>> +#ifdef CL_ABGR
>>> +if (0)
>>
>> wut
>
> CL_ABGR didn't exist in OpenCL 1.2, which is really the target.  It's 
> present in later versions, though, so it's included here if possible for 
> symmetry.  

 I meant what do the "if (0)"s do here? They're highly confusing at
 best.  
>>>
>>> Avoiding more gotos?  This function should really be a table, but it's just 
>>> tricky enough to make that annoying.  I'll see if I can improve it.
>>
>> Yeah, I think that would be better. Also you could say you're using
>> switch-case as some sort of computed goto, which it really is, but ugly
>> and hard to read.
> 
> Right, taking this function to its (perhaps il)logical conclusion I can 
> replace it with:
> 

Looks cuter :)

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

Re: [libav-devel] [PATCH 18/24] lavu: OpenCL hwcontext implementation

2017-02-21 Thread Mark Thompson
On 21/02/17 14:21, wm4 wrote:
> On Tue, 21 Feb 2017 13:18:55 +
> Mark Thompson  wrote:
> 
>> On 21/02/17 08:00, wm4 wrote:
>>> On Mon, 20 Feb 2017 23:06:21 +
>>> Mark Thompson  wrote:
>>>   
>> +#ifdef CL_ABGR
>> +if (0)
>
> wut

 CL_ABGR didn't exist in OpenCL 1.2, which is really the target.  It's 
 present in later versions, though, so it's included here if possible for 
 symmetry.  
>>>
>>> I meant what do the "if (0)"s do here? They're highly confusing at
>>> best.  
>>
>> Avoiding more gotos?  This function should really be a table, but it's just 
>> tricky enough to make that annoying.  I'll see if I can improve it.
> 
> Yeah, I think that would be better. Also you could say you're using
> switch-case as some sort of computed goto, which it really is, but ugly
> and hard to read.

Right, taking this function to its (perhaps il)logical conclusion I can replace 
it with:

static int opencl_get_plane_format(enum AVPixelFormat pixfmt,
   int plane, int width, int height,
   cl_image_format *image_format,
   cl_image_desc *image_desc)
{
const AVPixFmtDescriptor *desc;
const AVComponentDescriptor *comp;
int channels = 0, order = 0, depth = 0, step = 0;
int wsub, hsub, alpha;
int c;

desc = av_pix_fmt_desc_get(pixfmt);

// Only normal images are allowed.
if (desc->flags & (AV_PIX_FMT_FLAG_BITSTREAM |
   AV_PIX_FMT_FLAG_HWACCEL   |
   AV_PIX_FMT_FLAG_PAL))
return AVERROR(EINVAL);

wsub = 1 << desc->log2_chroma_w;
hsub = 1 << desc->log2_chroma_h;
// Subsampled components must be exact.
if (width & wsub - 1 || height & hsub - 1)
return AVERROR(EINVAL);

for (c = 0; c < desc->nb_components; c++) {
comp = >comp[c];
if (comp->plane != plane)
continue;
// The step size must be a power of two.
if (comp->step != 1 && comp->step != 2 &&
comp->step != 4 && comp->step != 8)
return AVERROR(EINVAL);
// The bits in each component must be packed in the
// most-significant-bits of the relevant bytes.
if (comp->shift + comp->depth != 8 &&
comp->shift + comp->depth != 16)
return AVERROR(EINVAL);
// The depth must not vary between components.
if (depth && comp->depth != depth)
return AVERROR(EINVAL);
// If a single data element crosses multiple bytes then
// it must match the native endianness.
if (comp->depth > 8 &&
HAVE_BIGENDIAN == !(desc->flags & AV_PIX_FMT_FLAG_BE))
return AVERROR(EINVAL);
// A single data element must not contain multiple samples
// from the same component.
if (step && comp->step != step)
return AVERROR(EINVAL);
order = order * 10 + c + 1;
depth = comp->depth;
step  = comp->step;
alpha = (desc->flags & AV_PIX_FMT_FLAG_ALPHA &&
 c == desc->nb_components - 1);
++channels;
}
if (channels == 0)
return AVERROR(ENOENT);

memset(image_format, 0, sizeof(*image_format));
memset(image_desc,   0, sizeof(*image_desc));
image_desc->image_type = CL_MEM_OBJECT_IMAGE2D;

if (plane == 0 || alpha) {
image_desc->image_width = width;
image_desc->image_height= height;
image_desc->image_row_pitch = step * width;
} else {
image_desc->image_width = width  / wsub;
image_desc->image_height= height / hsub;
image_desc->image_row_pitch = step * width / wsub;
}

if (depth <= 8) {
image_format->image_channel_data_type = CL_UNORM_INT8;
} else {
if (depth <= 16)
image_format->image_channel_data_type = CL_UNORM_INT16;
else
return AVERROR(EINVAL);
}

#define CHANNEL_ORDER(order, type) \
case order: image_format->image_channel_order = type; break;
switch (order) {
CHANNEL_ORDER(1,CL_R);
CHANNEL_ORDER(2,CL_R);
CHANNEL_ORDER(3,CL_R);
CHANNEL_ORDER(4,CL_R);
CHANNEL_ORDER(12,   CL_RG);
CHANNEL_ORDER(23,   CL_RG);
CHANNEL_ORDER(1234, CL_RGBA);
CHANNEL_ORDER(3214, CL_BGRA);
CHANNEL_ORDER(4123, CL_ARGB);
#ifdef CL_ABGR
CHANNEL_ORDER(4321, CL_ABGR);
#endif
default:
return AVERROR(EINVAL);
}
#undef CHANNEL_ORDER

return 0;
}

...which gives the supported formats of Beignet as:

yuv420p
yuv422p
yuv444p
yuv410p
yuv411p
gray
yuvj420p
yuvj422p
yuvj444p
nv12
nv21
argb
rgba
abgr
bgra
gray16le
yuv440p
yuvj440p
yuva420p
yuv420p16le
yuv422p16le
yuv444p16le
ya8
gbrp
gbrp16le
yuva422p
yuva444p
yuva420p16le
yuva422p16le
yuva444p16le
nv16
rgba64le
bgra64le
ya16le
gbrap
gbrap16le
p010le

:)

- Mark

Re: [libav-devel] [PATCH 18/24] lavu: OpenCL hwcontext implementation

2017-02-21 Thread wm4
On Tue, 21 Feb 2017 13:18:55 +
Mark Thompson  wrote:

> On 21/02/17 08:00, wm4 wrote:
> > On Mon, 20 Feb 2017 23:06:21 +
> > Mark Thompson  wrote:
> >   
>  +#ifdef CL_ABGR
>  +if (0)
> >>>
> >>> wut
> >>
> >> CL_ABGR didn't exist in OpenCL 1.2, which is really the target.  It's 
> >> present in later versions, though, so it's included here if possible for 
> >> symmetry.  
> > 
> > I meant what do the "if (0)"s do here? They're highly confusing at
> > best.  
> 
> Avoiding more gotos?  This function should really be a table, but it's just 
> tricky enough to make that annoying.  I'll see if I can improve it.

Yeah, I think that would be better. Also you could say you're using
switch-case as some sort of computed goto, which it really is, but ugly
and hard to read.

> >>> Those complex command queue operations might need a lock around them to
> >>> make them thread-safe. (I sure as hell don't want to need fragile
> >>> external locking around this API just because I might do
> >>> decoding/filtering/rendering on different threads to some degree.)
> >>
> >> OpenCL is meant to be thread-safe throughout - I don't think there is 
> >> anything more to do?  (Assuming correctness of drivers, of course.)
> >>
> >> There may be surprises wrt ordering/blocking if you let everything operate 
> >> on the same command queue, but that is why the user-supplied command 
> >> queues exist.  
> > 
> > I don't know much about OpenCL thread-safety (which is why I'm
> > inquiring). But the command queue that's used here is per hw device, so
> > it's widely shared.
> > 
> > Is it possible that multiple threads call av_hwframe_transfer_data()
> > concurrently? (Of course only as long as there are no conflicting write
> > accessed.)  
> 
> Yes, this is possible and fully allowed - all of the relevant OpenCL APIs are 
> explicitly thread-safe.
> 
> 
> From OpenCL 1.2:
> 
> (Section 2)
> """
> Thread-safe: An OpenCL API call is considered to be thread-safe if the 
> internal state as
> managed by OpenCL remains consistent when called simultaneously by multiple 
> host threads.
> OpenCL API calls that are thread-safe allow an application to call these 
> functions in multiple
> host threads without having to implement mutual exclusion across these host 
> threads i.e. they are
> also re-entrant-safe.
> """
> 
> (Appendix A)
> """
> All OpenCL API calls are thread-safe except clSetKernelArg. clSetKernelArg is 
> safe to call
> from any host thread, and is safe to call re-entrantly so long as concurrent 
> calls operate on
> different cl_kernel objects. However, the behavior of the cl_kernel object is 
> undefined if
> clSetKernelArg is called from multiple host threads on the same cl_kernel 
> object at the same
> time. Please note that there are additional limitations as to which OpenCL 
> APIs may be called
> from OpenCL callback functions -- please see section 5.9.
> """
> 
> (We don't do anything with event callbacks, and any user who does must be 
> aware of any possible issues themselves.)

OK, that's awesome. Sorry for making you dig out the OpenCL spec.
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

Re: [libav-devel] [PATCH 18/24] lavu: OpenCL hwcontext implementation

2017-02-21 Thread wm4
On Mon, 20 Feb 2017 23:06:21 +
Mark Thompson  wrote:

> >> +#ifdef CL_ABGR
> >> +if (0)  
> > 
> > wut  
> 
> CL_ABGR didn't exist in OpenCL 1.2, which is really the target.  It's present 
> in later versions, though, so it's included here if possible for symmetry.

I meant what do the "if (0)"s do here? They're highly confusing at
best.

> > Multiple API calls, looping over _all_ libavutil pixfmts... seems to do
> > a lot of work for a call you'd expect to be fast and often called.
> > Maybe not a problem.  
> 
> Often?  I'd expect this to be called once in some setup function for each 
> component and never thereafter unless something changes.  Also, nothing in 
> the loop calls anything external, except possibly realloc for valid formats.  
> (The opencl_get_plane_format() function is trying to be the one place where 
> you need to add new formats.)

OK, never mind then.

> >> +++pix_fmts_found;  
> > 
> > (Don't mind me, just stylistically objecting to pre-decrement.)  
> 
> (Well, don't mind me not caring, then.)

(OK, but still demonstratively frowning.)

> > Can't comment much on the actual code here.
> > 
> > Isn't it a bit strange that we also have hwcontext_cuda, and they don't
> > interact at all?  
> 
> Someone should add mapping between them...

That was just a brainfart of mine confusing those APIs (which I always
do, I'll just blame nVidia).

> > Those complex command queue operations might need a lock around them to
> > make them thread-safe. (I sure as hell don't want to need fragile
> > external locking around this API just because I might do
> > decoding/filtering/rendering on different threads to some degree.)  
> 
> OpenCL is meant to be thread-safe throughout - I don't think there is 
> anything more to do?  (Assuming correctness of drivers, of course.)
> 
> There may be surprises wrt ordering/blocking if you let everything operate on 
> the same command queue, but that is why the user-supplied command queues 
> exist.

I don't know much about OpenCL thread-safety (which is why I'm
inquiring). But the command queue that's used here is per hw device, so
it's widely shared.

Is it possible that multiple threads call av_hwframe_transfer_data()
concurrently? (Of course only as long as there are no conflicting write
accessed.)
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

Re: [libav-devel] [PATCH 18/24] lavu: OpenCL hwcontext implementation

2017-02-20 Thread Mark Thompson
On 20/02/17 06:39, wm4 wrote:
> On Sun, 19 Feb 2017 18:46:40 +
> Mark Thompson  wrote:
> 
>> ---
>>  configure  |   3 +
>>  doc/APIchanges |   4 +
>>  libavutil/Makefile |   2 +
>>  libavutil/hwcontext.c  |   3 +
>>  libavutil/hwcontext.h  |   1 +
>>  libavutil/hwcontext_internal.h |   1 +
>>  libavutil/hwcontext_opencl.c   | 983 
>> +
>>  libavutil/hwcontext_opencl.h   |  78 
>>  libavutil/version.h|   2 +-
>>  9 files changed, 1076 insertions(+), 1 deletion(-)
>>  create mode 100644 libavutil/hwcontext_opencl.c
>>  create mode 100644 libavutil/hwcontext_opencl.h
>>
>> ...
>> +
>> +static int opencl_get_plane_format(enum AVPixelFormat pixfmt,
>> +   int plane, int width, int height,
>> +   cl_image_format *image_format,
>> +   cl_image_desc *image_desc)
>> +{
>> +av_assert0(image_format && image_desc);
>> +
>> +memset(image_format, 0, sizeof(*image_format));
>> +memset(image_desc,   0, sizeof(*image_desc));
>> +image_desc->image_type = CL_MEM_OBJECT_IMAGE2D;
>> +
>> +switch (pixfmt) {
>> +case AV_PIX_FMT_YUV420P:
>> +if (width % 2)
>> +return AVERROR(EINVAL);
>> +if (plane > 2)
>> +return AVERROR(ENOENT);
>> +image_format->image_channel_order = CL_R;
>> +image_format->image_channel_data_type = CL_UNORM_INT8;
>> +image_desc->image_width = width  / (1 + (plane > 0));
>> +image_desc->image_height= height / (1 + (plane > 0));
>> +image_desc->image_row_pitch = image_desc->image_width;
>> +break;
>> +
>> +case AV_PIX_FMT_NV12:
>> +if (width % 2)
>> +return AVERROR(EINVAL);
>> +if (plane > 1)
>> +return AVERROR(ENOENT);
>> +image_format->image_channel_order = plane ? CL_RG : CL_R;
>> +image_format->image_channel_data_type = CL_UNORM_INT8;
>> +image_desc->image_width = width  / (1 + plane);
>> +image_desc->image_height= height / (1 + plane);
>> +image_desc->image_row_pitch = width;
>> +break;
>> +
>> +case AV_PIX_FMT_P010:
>> +if (width % 2)
>> +return AVERROR(EINVAL);
>> +if (plane > 1)
>> +return AVERROR(ENOENT);
>> +image_format->image_channel_order = plane ? CL_RG : CL_R;
>> +image_format->image_channel_data_type = CL_UNORM_INT16;
>> +image_desc->image_width = width  / (1 + plane);
>> +image_desc->image_height= height / (1 + plane);
>> +image_desc->image_row_pitch = 2 * width;
>> +break;
>> +
>> +case AV_PIX_FMT_RGBA:
>> +image_format->image_channel_order = CL_RGBA;
>> +if (0)
>> +case AV_PIX_FMT_BGRA:
>> +image_format->image_channel_order = CL_BGRA;
>> +if (0)
>> +case AV_PIX_FMT_ARGB:
>> +image_format->image_channel_order = CL_ARGB;
>> +#ifdef CL_ABGR
>> +if (0)
> 
> wut

CL_ABGR didn't exist in OpenCL 1.2, which is really the target.  It's present 
in later versions, though, so it's included here if possible for symmetry.

>> +case AV_PIX_FMT_ABGR:
>> +image_format->image_channel_order = CL_ABGR;
>> +#endif
>> +if (plane > 0)
>> +return AVERROR(ENOENT);
>> +image_format->image_channel_data_type = CL_UNORM_INT8;
>> +image_desc->image_width = width;
>> +image_desc->image_height= height;
>> +image_desc->image_row_pitch = 4 * width;
>> +break;
>> +
>> +default:
>> +return AVERROR(EINVAL);
>> +}
>> +
>> +return 0;
>> +}
>> +
>> +static int opencl_frames_get_constraints(AVHWDeviceContext *hwdev,
>> + const void *hwconfig,
>> + AVHWFramesConstraints *constraints)
>> +{
>> +AVOpenCLDeviceContext *hwctx = hwdev->hwctx;
>> +cl_uint nb_image_formats;
>> +cl_image_format *image_formats = NULL;
>> +cl_int cle;
>> +enum AVPixelFormat pix_fmt;
>> +int err, pix_fmts_found;
>> +size_t max_width, max_height;
>> +
>> +cle = clGetDeviceInfo(hwctx->device_id, CL_DEVICE_IMAGE2D_MAX_WIDTH,
>> +  sizeof(max_width), _width, NULL);
>> +if (cle != CL_SUCCESS) {
>> +av_log(hwdev, AV_LOG_ERROR, "Failed to query maximum "
>> +   "supported image width: %d.\n", cle);
>> +} else {
>> +constraints->max_width = max_width;
>> +}
>> +cle = clGetDeviceInfo(hwctx->device_id, CL_DEVICE_IMAGE2D_MAX_HEIGHT,
>> +  sizeof(max_height), _height, NULL);
>> +if (cle != CL_SUCCESS) {
>> +av_log(hwdev, AV_LOG_ERROR, "Failed to query maximum "
>> +   "supported image height: %d.\n", cle);
>> +} else {
>> +

Re: [libav-devel] [PATCH 18/24] lavu: OpenCL hwcontext implementation

2017-02-19 Thread wm4
On Sun, 19 Feb 2017 18:46:40 +
Mark Thompson  wrote:

> ---
>  configure  |   3 +
>  doc/APIchanges |   4 +
>  libavutil/Makefile |   2 +
>  libavutil/hwcontext.c  |   3 +
>  libavutil/hwcontext.h  |   1 +
>  libavutil/hwcontext_internal.h |   1 +
>  libavutil/hwcontext_opencl.c   | 983 
> +
>  libavutil/hwcontext_opencl.h   |  78 
>  libavutil/version.h|   2 +-
>  9 files changed, 1076 insertions(+), 1 deletion(-)
>  create mode 100644 libavutil/hwcontext_opencl.c
>  create mode 100644 libavutil/hwcontext_opencl.h
> 
> diff --git a/configure b/configure
> index 4635b73b0..5b78131ab 100755
> --- a/configure
> +++ b/configure
> @@ -244,6 +244,7 @@ External library support:
>--enable-nvenc   Nvidia video encoding
>--enable-omx OpenMAX IL
>--enable-omx-rpi OpenMAX IL for Raspberry Pi
> +  --enable-opencl  OpenCL processing
>--enable-vaapi   Video Acceleration API (mainly Unix/Intel)
>--enable-vda Apple Video Decode Acceleration [auto]
>--enable-vdpau   Nvidia Video Decode and Presentation API for Unix [auto]
> @@ -1267,6 +1268,7 @@ HWACCEL_LIBRARY_LIST="
>  mmal
>  nvenc
>  omx
> +opencl
>  vaapi
>  vda
>  vdpau
> @@ -4733,6 +4735,7 @@ enabled omx_rpi   && { check_header OMX_Core.h 
> ||
> { ! enabled cross_compile && add_cflags 
> -isystem/opt/vc/include/IL && check_header OMX_Core.h ; } ||
> die "ERROR: OpenMAX IL headers not found"; }
>  enabled omx   && require_header OMX_Core.h
> +enabled opencl&& require OpenCL CL/cl.h clGetPlatformIDs -lOpenCL
>  enabled openssl   && { { check_pkg_config openssl openssl/ssl.h 
> OPENSSL_init_ssl ||
>   check_pkg_config openssl openssl/ssl.h 
> SSL_library_init; } && {
> add_cflags $openssl_cflags && add_extralibs 
> $openssl_extralibs; } ||
> diff --git a/doc/APIchanges b/doc/APIchanges
> index 253454358..208ddd318 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -13,6 +13,10 @@ libavutil: 2015-08-28
>  
>  API changes, most recent first:
>  
> +2017-xx-xx - xxx - lavu 55.34.0 - hwcontext.h hwcontext_opencl.h
> +  Add AV_HWDEVICE_TYPE_OPENCL and a new installed header with
> +  OpenCL-specific hwcontext definitions.
> +
>  2017-xx-xx - xxx - lavu 55.33.0 - pixfmt.h
>Add AV_PIX_FMT_OPENCL.
>  
> diff --git a/libavutil/Makefile b/libavutil/Makefile
> index 60e180c79..49e84854a 100644
> --- a/libavutil/Makefile
> +++ b/libavutil/Makefile
> @@ -115,6 +115,7 @@ OBJS-$(CONFIG_CUDA) += 
> hwcontext_cuda.o
>  OBJS-$(CONFIG_DXVA2)+= hwcontext_dxva2.o
>  OBJS-$(CONFIG_LIBMFX)   += hwcontext_qsv.o
>  OBJS-$(CONFIG_LZO)  += lzo.o
> +OBJS-$(CONFIG_OPENCL)   += hwcontext_opencl.o
>  OBJS-$(CONFIG_VAAPI)+= hwcontext_vaapi.o
>  OBJS-$(CONFIG_VDPAU)+= hwcontext_vdpau.o
>  
> @@ -123,6 +124,7 @@ OBJS += $(COMPAT_OBJS:%=../compat/%)
>  SKIPHEADERS-$(CONFIG_CUDA) += hwcontext_cuda.h
>  SKIPHEADERS-$(CONFIG_DXVA2)+= hwcontext_dxva2.h
>  SKIPHEADERS-$(CONFIG_LIBMFX)   += hwcontext_qsv.h
> +SKIPHEADERS-$(CONFIG_OPENCL)   += hwcontext_opencl.h
>  SKIPHEADERS-$(CONFIG_VAAPI)+= hwcontext_vaapi.h
>  SKIPHEADERS-$(CONFIG_VDPAU)+= hwcontext_vdpau.h
>  
> diff --git a/libavutil/hwcontext.c b/libavutil/hwcontext.c
> index e4087a15c..27aab3086 100644
> --- a/libavutil/hwcontext.c
> +++ b/libavutil/hwcontext.c
> @@ -45,6 +45,9 @@ static const HWContextType * const hw_table[] = {
>  #if CONFIG_VDPAU
>  _hwcontext_type_vdpau,
>  #endif
> +#if CONFIG_OPENCL
> +_hwcontext_type_opencl,
> +#endif
>  NULL,
>  };
>  
> diff --git a/libavutil/hwcontext.h b/libavutil/hwcontext.h
> index b81a833b2..752c09e6f 100644
> --- a/libavutil/hwcontext.h
> +++ b/libavutil/hwcontext.h
> @@ -30,6 +30,7 @@ enum AVHWDeviceType {
>  AV_HWDEVICE_TYPE_VAAPI,
>  AV_HWDEVICE_TYPE_DXVA2,
>  AV_HWDEVICE_TYPE_QSV,
> +AV_HWDEVICE_TYPE_OPENCL,
>  };
>  
>  typedef struct AVHWDeviceInternal AVHWDeviceInternal;
> diff --git a/libavutil/hwcontext_internal.h b/libavutil/hwcontext_internal.h
> index 1ac0f1f95..ecda70558 100644
> --- a/libavutil/hwcontext_internal.h
> +++ b/libavutil/hwcontext_internal.h
> @@ -159,6 +159,7 @@ int ff_hwframe_map_create(AVBufferRef *hwframe_ref,
>  
>  extern const HWContextType ff_hwcontext_type_cuda;
>  extern const HWContextType ff_hwcontext_type_dxva2;
> +extern const HWContextType ff_hwcontext_type_opencl;
>  extern const HWContextType ff_hwcontext_type_qsv;
>  extern const HWContextType ff_hwcontext_type_vaapi;
>  extern const HWContextType ff_hwcontext_type_vdpau;
> diff --git 

Re: [libav-devel] [PATCH 18/24] lavu: OpenCL hwcontext implementation

2017-02-19 Thread Luca Barbato
On 19/02/2017 19:46, Mark Thompson wrote:
> ---
>  configure  |   3 +
>  doc/APIchanges |   4 +
>  libavutil/Makefile |   2 +
>  libavutil/hwcontext.c  |   3 +
>  libavutil/hwcontext.h  |   1 +
>  libavutil/hwcontext_internal.h |   1 +
>  libavutil/hwcontext_opencl.c   | 983 
> +
>  libavutil/hwcontext_opencl.h   |  78 
>  libavutil/version.h|   2 +-
>  9 files changed, 1076 insertions(+), 1 deletion(-)
>  create mode 100644 libavutil/hwcontext_opencl.c
>  create mode 100644 libavutil/hwcontext_opencl.h

> +
> +av_log(hwdev, AV_LOG_DEBUG, "Format %s supported.\n",
> +   av_get_pix_fmt_name(pix_fmt));
> +
> +constraints->valid_sw_formats =
> +av_realloc_array(constraints->valid_sw_formats,
> + pix_fmts_found + 2,
> + sizeof(*constraints->valid_sw_formats));

Use av_reallocp_array maybe?

> +if (!constraints->valid_sw_formats) {
> +err = AVERROR(ENOMEM);
> +goto fail;
> +}
> +constraints->valid_sw_formats[pix_fmts_found] = pix_fmt;
> +constraints->valid_sw_formats[pix_fmts_found + 1] =
> +AV_PIX_FMT_NONE;

maybe that could be set outside the loop only once.

> +++pix_fmts_found;
> +}
> +

The rest seems ok.

lu
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel