[Beignet] [PATCH OCL2.0 4/4] Utest: Add pipe related test

2016-02-24 Thread Xiuli Pan
From: Pan Xiuli Add test cases for builtins with a user struct type and int type, and a runtime test for clCreatePipe and pipe query. Signed-off-by: Pan Xiuli --- kernels/compiler_pipe_builtin.cl | 117 +++ utests/CMakeLists.txt | 4 +- utests/compiler_pi

[Beignet] [PATCH OCL2.0 3/4] Add pipe packet size check

2016-02-24 Thread Xiuli Pan
From: Pan Xiuli Get the pipe packet type from metadata and pass the type size to the kernel; check that the type size fits in clSetKernelArg. Signed-off-by: Pan Xiuli --- backend/src/backend/program.cpp | 2 ++ backend/src/backend/program.h | 1 + backend/src/ir/function.hpp | 1 + backe

[Beignet] [PATCH OCL2.0 1/4] Runtime: Add pipe related APIs

2016-02-24 Thread Xiuli Pan
From: Pan Xiuli Add clCreatePipe and clGetPipeInfo Signed-off-by: Pan Xiuli --- src/cl_api.c | 70 + src/cl_device_id.c | 3 ++ src/cl_device_id.h | 3 ++ src/cl_gt_device.h | 3 ++ src/cl_khr_icd.c | 4 +-- src/cl_mem.c | 100

[Beignet] [PATCH OCL2.0 2/4] Backend: Add Pipe Builtin support

2016-02-24 Thread Xiuli Pan
From: Pan Xiuli Add pipe builtin functions Signed-off-by: Pan Xiuli --- backend/src/backend/context.cpp| 5 + backend/src/backend/gen_reg_allocation.cpp | 3 +- backend/src/backend/program.h | 1 + backend/src/ir/function.cpp| 1 + backend/src/i

[Beignet] [PATCH OCL2.0 v4 1/2] Backend: Add built-in ctz function

2016-02-24 Thread Xiuli Pan
From: Pan Xiuli Gen doesn't have a trailing zero detection function. Use bit field reverse to reverse the integer first, then leading zero detection to get the number of trailing zeros. Also add some workarounds for the unsupported short and char types to get the expected result. V2: Add missing file ocl_ctz.ll
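The approach described in this patch can be illustrated on the host side. The following is a minimal sketch, not the Beignet implementation: reverse_bits32, clz32, ctz32, ctz16 and ctz8 are hypothetical helper names, and the guard-bit trick used for the short and char widths is only an assumption about what such a workaround could look like.

```cpp
#include <cstdint>

// Reverse the bit order of a 32-bit value (stand-in for a hardware
// bit-field-reverse instruction).
uint32_t reverse_bits32(uint32_t x) {
  x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
  x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
  x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
  x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
  return (x >> 16) | (x << 16);
}

// Count leading zeros (stand-in for a hardware leading-zero-detect
// instruction). Returns 32 for an input of 0.
uint32_t clz32(uint32_t x) {
  if (x == 0) return 32;
  uint32_t n = 0;
  while ((x & 0x80000000u) == 0) {
    x <<= 1;
    ++n;
  }
  return n;
}

// Trailing zeros of x are the leading zeros of the bit-reversed x.
uint32_t ctz32(uint32_t x) { return clz32(reverse_bits32(x)); }

// Assumed workaround for narrower types: widen to 32 bits and set a guard
// bit just above the type's width, so a zero input yields the type width
// (16 or 8) instead of 32.
uint32_t ctz16(uint16_t x) { return ctz32(uint32_t(x) | 0x10000u); }
uint32_t ctz8(uint8_t x) { return ctz32(uint32_t(x) | 0x100u); }
```

The guard bit never changes the result for a non-zero input, because it sits above every bit the narrow type can hold.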

[Beignet] [PATCH OCL2.0 v4 2/2] Utest: add a test case for built-in ctz function

2016-02-24 Thread Xiuli Pan
From: Pan Xiuli Check all types of the ctz function and the zero-input boundary case. V2: Fix type warning Signed-off-by: Pan Xiuli --- kernels/compiler_ctz.cl | 16 + utests/CMakeLists.txt | 1 + utests/compiler_ctz.cpp | 62 + 3 files changed, 79

[Beignet] [PATCH OCL2.0 v4 2/2] Utest: add a test case for built-in ctz function

2016-02-29 Thread Xiuli Pan
From: Pan Xiuli Check all types of the ctz function and the zero-input boundary case. V2: Fix type warning Signed-off-by: Pan Xiuli --- kernels/compiler_ctz.cl | 16 + utests/CMakeLists.txt | 1 + utests/compiler_ctz.cpp | 62 + 3 files changed, 79

[Beignet] [PATCH OCL2.0 v4 1/2] Backend: Add built-in ctz function

2016-02-29 Thread Xiuli Pan
From: Pan Xiuli Gen doesn't have a trailing zero detection function. Use bit field reverse to reverse the integer first, then leading zero detection to get the number of trailing zeros. Also add some workarounds for the unsupported short and char types to get the expected result. V2: Add missing file ocl_ctz.ll

[Beignet] [PATCH OCL2.0 2/6] Utest: Add sampler test

2016-02-29 Thread Xiuli Pan
From: Pan Xiuli Reuse sampler test and add test for new api clCreateSamplerWithProperties. Signed-off-by: Pan Xiuli --- utests/CMakeLists.txt | 1 + utests/compiler_sampler.cpp | 14 +- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/utests/CMakeLists.txt b/ut

[Beignet] [PATCH OCL2.0 1/6] Runtime: Add clCreateSamplerWithProperties

2016-02-29 Thread Xiuli Pan
From: Pan Xiuli Add api clCreateSamplerWithProperties Signed-off-by: Pan Xiuli --- src/cl_api.c | 62 src/cl_khr_icd.c | 2 +- 2 files changed, 63 insertions(+), 1 deletion(-) diff --git a/src/cl_api.c b/src/cl_api.c index 840d57f.

[Beignet] [PATCH OCL2.0 6/6] Backend: Refine type intptr_t to 64-bit

2016-02-29 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/libocl/include/ocl_types.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/backend/src/libocl/include/ocl_types.h b/backend/src/libocl/include/ocl_types.h index eb4c3b4..1ca6bca 100644 --- a/backend/src/libocl/inc

[Beignet] [PATCH OCL2.0 4/6] OCL20: Implement clSetKernelExecInfo api

2016-02-29 Thread Xiuli Pan
From: Yang Rong The extra exec info needs a reloc, otherwise the GPU can't read/write it. It doesn't need to be set in the curbe, so reloc it to an unused binding table. Signed-off-by: Yang Rong --- src/cl_api.c | 31 +-- src/cl_command_queue.c | 38 +

[Beignet] [PATCH OCL2.0 5/6] Runtime: Add support of OCL2.0 device queries

2016-02-29 Thread Xiuli Pan
From: Pan Xiuli Add device queries for OpenCL 2.0 Signed-off-by: Pan Xiuli --- src/cl_device_id.c | 14 +- src/cl_device_id.h | 16 ++-- src/cl_gt_device.h | 14 +- 3 files changed, 40 insertions(+), 4 deletions(-) diff --git a/src/cl_device_id.c b/src/cl_d

[Beignet] [PATCH OCL2.0 3/6] OCL20: Fix svm bugs

2016-02-29 Thread Xiuli Pan
From: Yang Rong 1. Correct the context's SVM list on delete. 2. Set the SVM sub-buffer's offset when binding the buffer. Signed-off-by: Yang Rong --- src/cl_command_queue.c | 6 +- src/cl_mem.c | 18 +++--- 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/src/c

[Beignet] [PATCH] Update to newest OpenCL 2.0 header

2016-02-29 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- include/CL/cl.h | 19 --- include/CL/cl_d3d10.h | 7 ++- include/CL/cl_d3d11.h | 7 ++- include/CL/cl_dx9_media_sharing.h | 9 +++-- include/CL/cl_egl.h | 9 +

[Beignet] [PATCH OCL20 1/2] runtime: extension size not enough.

2016-03-01 Thread Xiuli Pan
From: Luo Xionghu define a MACRO to hold the value. v2: use same MACRO in cl_extensions.h; add header file protection for cl_extension.h. Signed-off-by: Luo Xionghu Reviewed-by: "Yang, Rong R" --- src/cl_device_id.h | 5 - src/cl_extensions.c | 2 +- src/cl_extensions.h | 6 +- 3 fi

[Beignet] [PATCH OCL20 2/2] Runtime: Add extensions for OCL20

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Add extensions requested by spec to base extensions. Signed-off-by: Pan Xiuli --- src/cl_extensions.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/cl_extensions.h b/src/cl_extensions.h index c42d364..5b2f8a8 100644 --- a/src/cl_extensions.h +++

[Beignet] [PATCH OCL20 v2 2/4] Backend: Add Pipe Builtin support

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Add pipe builtin functions v2: Refine type size to be system determined Signed-off-by: Pan Xiuli --- backend/src/backend/context.cpp| 5 + backend/src/backend/gen_reg_allocation.cpp | 3 +- backend/src/backend/program.h | 1 + backend/src/ir/func

[Beignet] [PATCH OCL20 v2 1/4] Runtime: Add pipe related APIs

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Add clCreatePipe and clGetPipeInfo Signed-off-by: Pan Xiuli --- src/cl_api.c | 70 + src/cl_device_id.c | 3 ++ src/cl_device_id.h | 3 ++ src/cl_gt_device.h | 3 ++ src/cl_khr_icd.c | 4 +-- src/cl_mem.c | 100

[Beignet] [PATCH OCL20 v2 3/4] Add pipe packet size check

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Get the pipe packet type from metadata and pass the type size to the kernel; check that the type size fits in clSetKernelArg. Signed-off-by: Pan Xiuli --- backend/src/backend/program.cpp | 2 ++ backend/src/backend/program.h | 1 + backend/src/ir/function.hpp | 1 + backe

[Beignet] [PATCH OCL20 v2 4/4] Utest: Add pipe related test

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Add test cases for builtins with a user struct type and int type, and a runtime test for clCreatePipe and pipe query. Signed-off-by: Pan Xiuli --- kernels/compiler_pipe_builtin.cl | 117 +++ utests/CMakeLists.txt | 4 +- utests/compiler_pi

[Beignet] [PATCH] Backend: Add extensions for compiler

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/libocl/tmpl/ocl_defines.tmpl.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/backend/src/libocl/tmpl/ocl_defines.tmpl.h b/backend/src/libocl/tmpl/ocl_defines.tmpl.h index 8d41449..ae30e08 100644 --- a/backend/src/libocl/tmpl/ocl

[Beignet] [PATCH OCL20] Backend: Add uncompatiblePCHOptions for OCL20

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/backend/program.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/src/backend/program.cpp b/backend/src/backend/program.cpp index afae77b..0119670 100644 --- a/backend/src/backend/program.cpp +++ b/backend/sr

[Beignet] [PATCH OCL20] Runtime: Add API clCreateCommandQueueWithProperties

2016-03-01 Thread Xiuli Pan
From: Luo Xionghu Contributor: Luo Xionghu Signed-off-by: Pan Xiuli --- src/cl_api.c | 69 +++ src/cl_khr_icd.c | 22 --- utests/profiling_exec.cpp | 3 ++- utests/utest_helper.cpp | 4 +-- 4 files changed, 86 in

[Beignet] [PATCH OCL20 v3 4/4] Utest: Add pipe related test

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Add test cases for builtins with a user struct type and int type, and a runtime test for clCreatePipe and pipe query. Signed-off-by: Pan Xiuli --- kernels/compiler_pipe_builtin.cl | 117 +++ utests/CMakeLists.txt | 4 +- utests/compiler_pi

[Beignet] [PATCH OCL20 v3 1/4] Runtime: Add pipe related APIs

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Add clCreatePipe and clGetPipeInfo Signed-off-by: Pan Xiuli --- src/cl_api.c | 70 + src/cl_device_id.c | 3 ++ src/cl_device_id.h | 3 ++ src/cl_gt_device.h | 3 ++ src/cl_khr_icd.c | 4 +-- src/cl_mem.c | 100

[Beignet] [PATCH OCL20 v3 3/4] Backend: Add pipe packet size check

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Get the pipe packet type from metadata and pass the type size to the kernel; check that the type size fits in clSetKernelArg. Signed-off-by: Pan Xiuli --- backend/src/backend/program.cpp | 2 ++ backend/src/backend/program.h | 1 + backend/src/ir/function.hpp | 1 + backe

[Beignet] [PATCH OCL20 v3 2/4] Backend: Add Pipe Builtin support

2016-03-01 Thread Xiuli Pan
From: Pan Xiuli Add pipe builtin functions v2: Refine type size to be system determined Signed-off-by: Pan Xiuli --- backend/src/backend/context.cpp| 5 + backend/src/backend/gen_reg_allocation.cpp | 3 +- backend/src/backend/program.h | 1 + backend/src/ir/func

[Beignet] [PATCH OCL20 01/11] Runtime: Add API clCreateCommandQueueWithProperties

2016-03-02 Thread Xiuli Pan
From: Luo Xionghu Contributor: Luo Xionghu Signed-off-by: Pan Xiuli --- src/cl_api.c | 70 +++ src/cl_khr_icd.c | 6 +++- utests/profiling_exec.cpp | 3 +- utests/utest_helper.cpp | 4 +-- 4 files changed, 79 insertions(+),

[Beignet] [PATCH OCL20 02/11] OCL20: Fix svm bugs

2016-03-02 Thread Xiuli Pan
From: Yang Rong 1. Correct the context's SVM list on delete. 2. Set the SVM sub-buffer's offset when binding the buffer. Signed-off-by: Yang Rong Signed-off-by: Pan Xiuli --- src/cl_command_queue.c | 6 +- src/cl_mem.c | 18 +++--- 2 files changed, 20 insertions(+), 4 deleti

[Beignet] [PATCH OCL20 06/11] Runtime: Add support of OCL2.0 device queries

2016-03-02 Thread Xiuli Pan
From: Pan Xiuli Add device queries for OpenCL 2.0 Signed-off-by: Pan Xiuli --- src/cl_device_id.c | 14 +- src/cl_device_id.h | 16 ++-- src/cl_gt_device.h | 14 +- 3 files changed, 40 insertions(+), 4 deletions(-) diff --git a/src/cl_device_id.c b/src/cl_d

[Beignet] [PATCH OCL20 05/11] Utest: Add sampler test

2016-03-02 Thread Xiuli Pan
From: Pan Xiuli Reuse sampler test and add test for new api clCreateSamplerWithProperties. Signed-off-by: Pan Xiuli --- utests/CMakeLists.txt | 1 + utests/compiler_sampler.cpp | 14 +- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/utests/CMakeLists.txt b/ut

[Beignet] [PATCH OCL20 09/11] Runtime: Add extensions for OCL20

2016-03-02 Thread Xiuli Pan
From: Pan Xiuli Add extensions requested by spec to base extensions. Signed-off-by: Pan Xiuli --- src/cl_extensions.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/cl_extensions.h b/src/cl_extensions.h index c42d364..5b2f8a8 100644 --- a/src/cl_extensions.h +++

[Beignet] [PATCH OCL20 04/11] Runtime: Add clCreateSamplerWithProperties

2016-03-02 Thread Xiuli Pan
From: Pan Xiuli Add api clCreateSamplerWithProperties Signed-off-by: Pan Xiuli --- src/cl_api.c | 62 ++ src/cl_device_id.c | 2 +- src/cl_khr_icd.c | 2 +- 3 files changed, 64 insertions(+), 2 deletions(-) diff --git a/src/cl_api.

[Beignet] [PATCH OCL20 08/11] runtime: extension size not enough.

2016-03-02 Thread Xiuli Pan
From: Luo Xionghu define a MACRO to hold the value. v2: use same MACRO in cl_extensions.h; add header file protection for cl_extension.h. Signed-off-by: Luo Xionghu Reviewed-by: "Yang, Rong R" Signed-off-by: Pan Xiuli --- src/cl_device_id.h | 5 - src/cl_extensions.c | 2 +- src/cl_ext

[Beignet] [PATCH OCL20 03/11] OCL20: Implement clSetKernelExecInfo api

2016-03-02 Thread Xiuli Pan
From: Yang Rong The extra exec info needs a reloc, otherwise the GPU can't read/write it. It doesn't need to be set in the curbe, so reloc it to an unused binding table. Signed-off-by: Yang Rong Signed-off-by: Pan Xiuli --- src/cl_api.c | 31 +-- src/cl_command_queue.c

[Beignet] [PATCH OCL20 10/11] Backend: Add extensions for compiler

2016-03-02 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/libocl/tmpl/ocl_defines.tmpl.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/backend/src/libocl/tmpl/ocl_defines.tmpl.h b/backend/src/libocl/tmpl/ocl_defines.tmpl.h index 8d41449..ae30e08 100644 --- a/backend/src/libocl/tmpl/ocl

[Beignet] [PATCH OCL20 11/11] Backend: Add uncompatiblePCHOptions for OCL20

2016-03-02 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/backend/program.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/backend/src/backend/program.cpp b/backend/src/backend/program.cpp index 8f78e1f..bdcca38 100644 --- a/backend/src/backend/program.cpp +++ b/backend/sr

[Beignet] [PATCH OCL20 07/11] Backend: Refine type intptr_t to 64-bit

2016-03-02 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/libocl/include/ocl_types.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/backend/src/libocl/include/ocl_types.h b/backend/src/libocl/include/ocl_types.h index eb4c3b4..1ca6bca 100644 --- a/backend/src/libocl/inc

[Beignet] [PATCH OCL20 3/5] Runtime: Add suport for sRGB to clEnqueueCopyImage

2016-03-03 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- src/cl_mem.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/cl_mem.c b/src/cl_mem.c index 0922e0a..6e5488e 100644 --- a/src/cl_mem.c +++ b/src/cl_mem.c @@ -1922,7 +1922,9 @@ cl_mem_kernel_copy_image(cl_command_queue queue,

[Beignet] [PATCH OCL20 5/5] Runtime: Add support for clGetMemObjectInfo

2016-03-03 Thread Xiuli Pan
From: Pan Xiuli clGetMemObjectInfo with CL_MEM_ASSOCIATED_MEMOBJECT should return the mem in cl_image_desc. As in CL_MEM_OBJECT_IMAGE1D_BUFFER we copy the buffer, add a workaround for it. Signed-off-by: Pan Xiuli --- src/cl_mem.c | 14 ++ src/cl_mem.h | 1 + 2 files changed, 11 in

[Beignet] [PATCH OCL20 2/5] Runtime: Refine clGetSupportedImageFormats to support CL_MEM_FLAGS

2016-03-03 Thread Xiuli Pan
From: Pan Xiuli sRGB writes are not supported now, and we should not return them if any write was set as CL_MEM_FLAGS. Signed-off-by: Pan Xiuli --- src/cl_api.c | 1 + src/cl_image.c | 5 + src/cl_image.h | 1 + 3 files changed, 7 insertions(+) diff --git a/src/cl_api.c b/src/cl_api.c i

[Beignet] [PATCH OCL20 1/5] Runtime: Add support for sRGB

2016-03-03 Thread Xiuli Pan
From: Pan Xiuli CL_sRGBA with CL_UNORM_INT8 is the minimum requirement for OpenCL 2.0, and CL_sBGRA is also supported by hardware. None of the sRGB surface types support hardware writes. Signed-off-by: Pan Xiuli --- backend/src/ocl_common_defines.h | 7 ++- src/cl_api.c | 2 +-

[Beignet] [PATCH OCL20 4/5] Runtime: Add suport for sRGB to clEnqueueFillImage

2016-03-03 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- src/cl_mem.c | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/src/cl_mem.c b/src/cl_mem.c index 6e5488e..c41417c 100644 --- a/src/cl_mem.c +++ b/src/cl_mem.c @@ -34,6 +34,7 @@ #include #include #include

[Beignet] [PATCH OCL20 4/5 v2] Runtime: Add suport for sRGB to clEnqueueFillImage

2016-03-08 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- src/cl_mem.c | 23 ++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/src/cl_mem.c b/src/cl_mem.c index 6e5488e..6d45828 100644 --- a/src/cl_mem.c +++ b/src/cl_mem.c @@ -34,6 +34,7 @@ #include #include #include

[Beignet] [PATCH OCL20 4/11 v2] Runtime: Add clCreateSamplerWithProperties

2016-03-08 Thread Xiuli Pan
From: Pan Xiuli Add api clCreateSamplerWithProperties v2: fix bug Signed-off-by: Pan Xiuli --- src/cl_api.c | 63 ++ src/cl_device_id.c | 2 +- src/cl_khr_icd.c | 12 +-- 3 files changed, 70 insertions(+), 7 deletions(-) di

[Beignet] [PATCH 2/2] Runtime: Add support for non uniform group size

2016-03-15 Thread Xiuli Pan
From: Pan Xiuli Enqueue multiple times if the size is not uniform: at most 2 times for 1D, 4 times for 2D and 8 times for 3D. Use the workdim offset of the walker in the batch buffer to keep work groups in series. TODO: handle events for the flush between multiple enqueues Signed-off-by: Pan Xiuli
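The 2/4/8 bound follows from splitting each dimension into a uniform part and a remainder. The sketch below only illustrates that arithmetic; SubRange, Piece and split_ndrange are hypothetical names, not Beignet code, and the sketch assumes every local size is non-zero.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// One uniform sub-enqueue: where it starts, how large it is, and the
// work-group size it uses (all per dimension).
struct SubRange {
  std::array<std::size_t, 3> offset;
  std::array<std::size_t, 3> global;
  std::array<std::size_t, 3> local;
};

// A slice of a single dimension.
struct Piece {
  std::size_t offset;
  std::size_t size;
  std::size_t local;
};

// Split a possibly non-uniform ND-range into uniform sub-ranges.
// Each used dimension contributes at most two pieces (the part divisible
// by the local size, and the remainder), so the result holds at most
// 2, 4 or 8 sub-ranges for 1D, 2D or 3D.
std::vector<SubRange> split_ndrange(unsigned dim,
                                    const std::array<std::size_t, 3>& global,
                                    const std::array<std::size_t, 3>& local) {
  std::array<std::vector<Piece>, 3> pieces;
  for (unsigned d = 0; d < 3; ++d) {
    if (d >= dim) {  // unused dimension: a single unit-sized piece
      pieces[d].push_back(Piece{0, 1, 1});
      continue;
    }
    const std::size_t rem = global[d] % local[d];
    const std::size_t uniform = global[d] - rem;
    if (uniform != 0) pieces[d].push_back(Piece{0, uniform, local[d]});
    // The remainder forms one smaller work-group along this dimension.
    if (rem != 0) pieces[d].push_back(Piece{uniform, rem, rem});
  }

  std::vector<SubRange> out;
  for (const Piece& z : pieces[2])
    for (const Piece& y : pieces[1])
      for (const Piece& x : pieces[0]) {
        SubRange r;
        r.offset = {x.offset, y.offset, z.offset};
        r.global = {x.size, y.size, z.size};
        r.local = {x.local, y.local, z.local};
        out.push_back(r);
      }
  return out;
}
```

For a 1D range of 10 work items with a local size of 4, this yields one sub-range of 8 items at offset 0 and one of 2 items at offset 8.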

[Beignet] [PATCH 1/2] Backend: Refine get_enqueued_local_size and get_local_size

2016-03-15 Thread Xiuli Pan
From: Pan Xiuli Use curbe registers for these two sizes. Signed-off-by: Pan Xiuli --- backend/src/backend/gen_insn_selection.cpp | 7 +++-- backend/src/backend/program.h | 3 ++ backend/src/ir/profile.cpp | 4 +++ backend/src/ir/profile.hpp | 47 ++

[Beignet] [PATCH] Libocl: change prototype of math built-in for OCL2.0 spec

2016-03-21 Thread Xiuli Pan
From: Pan Xiuli Math built-ins no longer need address spaces, so remove them. Signed-off-by: Pan Xiuli --- backend/src/libocl/tmpl/ocl_math.tmpl.cl | 122 --- backend/src/libocl/tmpl/ocl_math.tmpl.h | 48 +++- 2 files changed, 25 insertions(+), 145 deletions(

[Beignet] [PATCH] Backend: Clang now support static, fix now

2016-03-21 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/libocl/include/ocl_types.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/backend/src/libocl/include/ocl_types.h b/backend/src/libocl/include/ocl_types.h index 88e5642..2ae1562 100644 --- a/backend/src/libocl/include/ocl_types.h ++

[Beignet] [PATCH OCL20 v2] Backend: Refine typedef of ptrint_t

2016-03-25 Thread Xiuli Pan
From: Pan Xiuli V2: refined with clang macro __INTPTR_WIDTH__ and clang built-in types __INTXX_TYPE__ Signed-off-by: Pan Xiuli --- backend/src/backend/program.cpp| 4 backend/src/libocl/CMakeLists.txt | 16 +++- backend/src/libocl/include/ocl_types.h | 9 +++

[Beignet] [PATCH] libocl: Refine return type of workitem built-in functions

2016-03-29 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/libocl/include/ocl_workitem.h | 20 ++-- backend/src/libocl/src/ocl_workitem.cl| 8 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/backend/src/libocl/include/ocl_workitem.h b/backend/src/li

[Beignet] [PATCH] Runtime: Add SKL device id for new SKL device

2016-03-31 Thread Xiuli Pan
From: Pan Xiuli Add Skylake workstation device and desktop GT4 Signed-off-by: Pan Xiuli --- src/cl_device_data.h | 16 src/cl_device_id.c | 8 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/src/cl_device_data.h b/src/cl_device_data.h index 63e078f.

[Beignet] [PATCH] Utest: Add utest load spir for spir64

2016-04-04 Thread Xiuli Pan
From: Pan Xiuli Create a new spir file for OpenCL2.0 and spir64 Signed-off-by: Pan Xiuli --- kernels/compiler_ceil64.spir | Bin 0 -> 2152 bytes utests/load_program_from_spir.cpp | 5 - 2 files changed, 4 insertions(+), 1 deletion(-) create mode 100644 kernels/compiler_ceil64.spir

Re: [Beignet] [PATCH 3/7] GBE: add ocl 2.0 work_group_barrier support.

2016-04-04 Thread Xiuli Pan
Hi Ruiling, The patch set LGTM, only a small problem about syncFieldNum. Thanks Xiuli On Fri, Apr 01, 2016 at 02:53:24PM +0800, Ruiling Song wrote: > to do an image barrier, we need to: > 1. flush L3 RW cache. > 2. do a barrier gateway. > 3. flush sampler cache. > > Note the fence argument may

[Beignet] [PATCH OCL20] Runtime: Add new param_name to clGetProgramBuildInfo

2016-04-04 Thread Xiuli Pan
From: Pan Xiuli Add CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE for api clGetProgramBuildInfo, return the constantset size from backend. Signed-off-by: Pan Xiuli --- src/cl_api.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/cl_api.c b/src/cl_api.c index 5697e34..74e2b72 100644 --

Re: [Beignet] [PATCH 09/10] Backend: Full support workgroup reduce, scan inc/exc on DWORD and bellow datatypes

2016-04-06 Thread Xiuli Pan
On Thu, Mar 31, 2016 at 06:28:33PM +0300, grigore.lupe...@intel.com wrote: > From: Grigore Lupescu > > Signed-off-by: Grigore Lupescu > --- > backend/src/backend/gen_context.cpp | 329 > > 1 file changed, 180 insertions(+), 149 deletions(-) > > diff --git

Re: [Beignet] [PATCH V2 15/17] Utest: Add workgroup broadcast tests

2016-04-12 Thread Xiuli Pan
Hi Grigore, I found what is wrong with broadcast 2D and 3D, see inline comments. You are using int for the input data! Thanks Xiuli On Mon, Apr 11, 2016 at 05:40:56PM +0300, Grigore Lupescu wrote: > From: Grigore Lupescu > > Added the following unit tests: > compiler_workgroup_broadcast_1D_in

Re: [Beignet] [PATCH V2 12/17] Utest: Add workgroup scan exclusive tests

2016-04-13 Thread Xiuli Pan
The numeric_limits min and max for float cannot give inf and -inf; they only return the minimum and maximum positive finite values. But the device will return inf and -inf. So the EXP value for the index 0 float max and min should be refined with has_infinity and infinity. On Mon, Apr 11, 2016 at 05:40:00
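The point about has_infinity can be sketched as follows. This is an illustration under assumptions, not the utest code: scan_max_identity and scan_min_identity are hypothetical helpers that pick the expected element at index 0 of an exclusive max or min scan.

```cpp
#include <limits>

// Expected first element of an exclusive max scan: the identity of max.
// For types with an infinity (float, double) that is -inf; for integer
// types it is the lowest representable value.
template <typename T>
T scan_max_identity() {
  return std::numeric_limits<T>::has_infinity
             ? -std::numeric_limits<T>::infinity()
             : std::numeric_limits<T>::lowest();
}

// Expected first element of an exclusive min scan: the identity of min.
// For types with an infinity that is +inf; otherwise the largest value.
template <typename T>
T scan_min_identity() {
  return std::numeric_limits<T>::has_infinity
             ? std::numeric_limits<T>::infinity()
             : std::numeric_limits<T>::max();
}
```

Note that std::numeric_limits<float>::min() is the smallest positive normalized value and max() is the largest finite value, so neither can stand in for an infinity.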

Re: [Beignet] [PATCH V2 12/17] Utest: Add workgroup scan exclusive tests

2016-04-13 Thread Xiuli Pan
char and short are not required by the spec, and they have very special max and min values as the EXP value. We can just remove them. On Mon, Apr 11, 2016 at 05:40:00PM +0300, Grigore Lupescu wrote: > From: Grigore Lupescu > > Added the following unit tests: > compiler_workgroup_scan_exclusive_add_cha

[Beignet] [PATCH OCL20] Backend: Refine workgroup all with SIMD_ALL algorithm

2016-04-14 Thread Xiuli Pan
From: Pan Xiuli Fix the problem with the AND implementation; use predicate simd width to get the in-thread all and any result. Signed-off-by: Pan Xiuli --- backend/src/backend/gen_context.cpp | 193 +-- utests/compiler_workgroup_reduce.cpp | 5 +- 2 files changed, 117

[Beignet] [PATCH OCL20 V2] Backend: Refine workgroup all with SIMD_ALL algorithm

2016-04-24 Thread Xiuli Pan
From: Pan Xiuli Fix the problem with the AND implementation; use predicate simd width to get the in-thread all and any result. V2: Fix a typo in utest. Signed-off-by: Pan Xiuli --- backend/src/backend/gen_context.cpp | 193 +-- kernels/compiler_workgroup_reduce.cl | 2

[Beignet] [PATCH OCL20] Backend: Refine script for math.h

2016-04-25 Thread Xiuli Pan
From: Pan Xiuli We need generic memory space pointer for OpenCL2.0 header. Signed-off-by: Pan Xiuli --- backend/src/libocl/script/gen_vector.py | 5 +- backend/src/libocl/script/ocl_math.def | 84 + 2 files changed, 24 insertions(+), 65 deletions(-) diff --gi

[Beignet] [PATCH OCL20 V2] GBE: imm64 should not be in src1 per hardware spec. V2: Refine assert to check if the value of imm can not be a imm32

2016-04-25 Thread Xiuli Pan
From: Ruiling Song Signed-off-by: Ruiling Song Contributor: Pan Xiuli --- backend/src/backend/gen8_encoder.cpp | 5 +++-- backend/src/backend/gen_insn_selection.cpp | 12 2 files changed, 11 insertions(+), 6 deletions(-) diff --git a/backend/src/backend/gen8_encoder.cpp b

[Beignet] [PATCH OCL20 V3] Backend: Refine workgroup all with SIMD_ALL algorithm

2016-04-25 Thread Xiuli Pan
From: Pan Xiuli Fix the problem with the AND implementation; use predicate simd width to get the in-thread all and any result. V2: Fix a typo in utest. V3: Remove unnecessary mask. Signed-off-by: Pan Xiuli --- backend/src/backend/gen_context.cpp | 192 +-- kernels/compi

[Beignet] [PATCH] Backend: Chang scan limit for GVN pass

2016-04-26 Thread Xiuli Pan
From: Pan Xiuli Set memdep-block-scan-limit into llvm context to avoid unfinished GVN pass. V2: Revert remove parser llvm first Signed-off-by: Pan Xiuli --- backend/src/backend/program.cpp | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/backend/src/ba

[Beignet] [PATCH OCL20 V2 3/4] Backend: Add workaround for instcombine will optimize fabs

2016-04-28 Thread Xiuli Pan
From: Pan Xiuli LLVM will combine "%bc = bitcast float %x to i32" and "%and = and i32 %bc, 2147483647" into "%and = bitcast float %fabs to i32", and fabs will ignore denorm numbers, so we need a workaround for denorm numbers. Signed-off-by: Pan Xiuli --- backend/src/libocl/tmpl/ocl_math.tmpl.cl |
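The bit pattern that the bitcast-and-mask sequence is meant to produce can be shown on the host. This is only a sketch of the masking idea with hypothetical helper names (float_bits, bits_float, fabs_by_mask); it does not reproduce the patch's actual workaround in ocl_math.tmpl.cl.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float as its raw 32-bit pattern, and back, without
// performing any floating-point arithmetic.
uint32_t float_bits(float x) {
  uint32_t b;
  std::memcpy(&b, &x, sizeof b);
  return b;
}

float bits_float(uint32_t b) {
  float x;
  std::memcpy(&x, &b, sizeof x);
  return x;
}

// Absolute value by clearing the sign bit. 0x7FFFFFFF is the 2147483647
// mask from the IR above. Because only integer operations touch the
// value, a denormal input keeps its mantissa bits.
float fabs_by_mask(float x) {
  return bits_float(float_bits(x) & 0x7FFFFFFFu);
}
```

For the negative denormal with bit pattern 0x80000001, the masked result has bit pattern 0x00000001, which is the behavior the integer form preserves.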

[Beignet] [PATCH OCL20 V2 1/4] Backend: Chang scan limit for GVN pass

2016-04-28 Thread Xiuli Pan
From: Pan Xiuli Set memdep-block-scan-limit into llvm context to avoid unfinished GVN pass. V2: Revert remove parser llvm first Signed-off-by: Pan Xiuli --- backend/src/backend/program.cpp | 22 +- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/backend/src/ba

[Beignet] [PATCH OCL20 V2 4/4] Runtime: Add support for queue size and fix error handling

2016-04-28 Thread Xiuli Pan
From: Pan Xiuli V2: Remove check for device queue and add device queue flag. Signed-off-by: Pan Xiuli --- src/cl_api.c | 24 +--- src/cl_command_queue.h | 1 + src/cl_gt_device.h | 2 +- 3 files changed, 19 insertions(+), 8 deletions(-) diff --git a/src/cl_a

[Beignet] [PATCH OCL20 V2 2/4] Backend: Refine script for math.h

2016-04-28 Thread Xiuli Pan
From: Pan Xiuli We need generic memory space pointer for OpenCL2.0 header. Signed-off-by: Pan Xiuli --- backend/src/libocl/script/gen_vector.py | 5 +- backend/src/libocl/script/ocl_math.def | 84 + 2 files changed, 24 insertions(+), 65 deletions(-) diff --gi

[Beignet] [PATCH] Backend: Fix bug build with clang

2016-04-28 Thread Xiuli Pan
From: Pan Xiuli When building with clang, a template name cannot be the same as a class variable. This bug causes the Gen IR load/store to be switched and causes self-test errors. Signed-off-by: Pan Xiuli --- backend/src/llvm/llvm_gen_backend.cpp | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-

[Beignet] [PATCH] Add support for gcc 6

2016-05-03 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/ir/immediate.hpp | 2 +- utests/builtin_exp.cpp | 2 +- utests/utest_generator.py| 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/backend/src/ir/immediate.hpp b/backend/src/ir/immediate.hpp index 3141643.

[Beignet] [PATCH 4/5] Runtime: Fix memleak of barrier evnets

2016-05-04 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- src/cl_command_queue.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/cl_command_queue.c b/src/cl_command_queue.c index 7a432a0..b66928f 100644 --- a/src/cl_command_queue.c +++ b/src/cl_command_queue.c @@ -102,6 +102,7 @@ cl_command_queue_de

[Beignet] [PATCH 2/5] Backend: Fix memleak form abi::__cxa_demangle

2016-05-04 Thread Xiuli Pan
From: Pan Xiuli We need to free what we get from abi::__cxa_demangle Signed-off-by: Pan Xiuli --- backend/src/llvm/llvm_gen_backend.hpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/backend/src/llvm/llvm_gen_backend.hpp b/backend/src/llvm/llvm_gen_backend.hpp index 236
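The ownership rule behind this fix is that abi::__cxa_demangle returns a malloc-allocated buffer that the caller must release with free. One way to make that hard to forget is sketched below; the demangle wrapper is a hypothetical helper, not the code in llvm_gen_backend.hpp, and it assumes a GCC or Clang toolchain that provides <cxxabi.h>.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <memory>
#include <string>

// Demangle an Itanium-ABI symbol name. The buffer returned by
// abi::__cxa_demangle is owned by a unique_ptr whose deleter is
// std::free, so it is released on every return path.
std::string demangle(const char* mangled) {
  int status = 0;
  std::unique_ptr<char, decltype(&std::free)> buf(
      abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
  // On failure the returned pointer is null; fall back to the input name.
  if (status != 0 || !buf) return std::string(mangled);
  return std::string(buf.get());
}
```

Copying the result into a std::string before the unique_ptr goes out of scope keeps the returned value valid after the buffer has been freed.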

[Beignet] [PATCH 3/5] Utest: Fix utest memleaks

2016-05-04 Thread Xiuli Pan
From: Pan Xiuli Free all memory allocated and release all cl objects. Signed-off-by: Pan Xiuli --- utests/compare_image_2d_and_1d_array.cpp | 2 ++ utests/compiler_fill_image_1d_array.cpp | 1 + utests/compiler_function_qualifiers.cpp | 1 + utests/image_1D_buffer.cpp | 1 + ut

[Beignet] [PATCH 1/5] Backend: Fix printfs mem leak

2016-05-04 Thread Xiuli Pan
From: Pan Xiuli Should pass pointer of new printf_fmt into map for later delete. Signed-off-by: Pan Xiuli --- backend/src/ir/unit.cpp | 1 + backend/src/ir/unit.hpp | 2 +- backend/src/llvm/llvm_gen_backend.cpp | 2 +- backend/src/llvm/llvm_printf_parser.cpp |

[Beignet] [PATCH 5/5] Runtime: Fix memleak in build program for bin

2016-05-04 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- src/cl_program.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/cl_program.c b/src/cl_program.c index b4656ce..85dc184 100644 --- a/src/cl_program.c +++ b/src/cl_program.c @@ -293,6 +293,7 @@ cl_program_create_from_binary(c

[Beignet] [PATCH] Backend: Fix memleak in serialize_program

2016-05-04 Thread Xiuli Pan
From: Frank Dittrich Patch from: https://bugs.freedesktop.org/show_bug.cgi?id=93625 Signed-off-by: Frank Dittrich --- backend/src/gbe_bin_generater.cpp | 2 ++ 1 file changed, 2 insertions(+) diff --git a/backend/src/gbe_bin_generater.cpp b/backend/src/gbe_bin_generater.cpp index 8225d4a..7e

[Beignet] [PATCH 1/2] Benchmark: Fix benchmark bugs with image map

2016-05-05 Thread Xiuli Pan
From: Pan Xiuli The map in utest has changed, so the benchmark needs to change as well. Signed-off-by: Pan Xiuli --- benchmark/benchmark_copy_image.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/benchmark/benchmark_copy_image.cpp b/benchmark/benchmark_copy_image.cpp index 1d35

[Beignet] [PATCH 2/2] Benchmark: Fix Benchmark heap use after free problem

2016-05-05 Thread Xiuli Pan
From: Pan Xiuli The mem object will be released by the benchmark runner. Fixed bug: https://bugs.freedesktop.org/show_bug.cgi?id=93627 Signed-off-by: Pan Xiuli --- benchmark/benchmark_copy_buffer.cpp | 1 - benchmark/benchmark_copy_image.cpp | 1 - benchmark/benchmark_read_buffer.cpp

[Beignet] [PATCH 1/2] Backend: Copy workgroup emit function to gen8

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli Since long type is not supported before gen8, need to make a copy for future change. Signed-off-by: Pan Xiuli --- backend/src/backend/gen8_context.cpp | 528 +++ backend/src/backend/gen8_context.hpp | 2 + backend/src/backend/gen_context.hpp |

[Beignet] [PATCH 2/2] Utest: Remove some unsuport work group tests

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli HSW and IVB do not support the long type, so hide these tests for now. Signed-off-by: Pan Xiuli --- utests/compiler_workgroup_broadcast.cpp | 6 +++--- utests/compiler_workgroup_reduce.cpp | 12 ++-- utests/compiler_workgroup_scan_exclusive.cpp | 12 ++-- u

[Beignet] [PATCH 2/4] Runtime: Add API clGetKernelSubGroupInfoKHR for subgroup extension

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- include/CL/cl_intel.h | 27 + src/cl_api.c | 20 + src/cl_device_id.c| 83 +++ src/cl_device_id.h| 9 ++ 4 files changed, 139 insertions(+) diff --git

[Beignet] [PATCH 4/4] Utest: Add subgroup work item test cases

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- kernels/builtin_max_sub_group_size.cl | 7 kernels/builtin_num_sub_groups.cl | 7 kernels/builtin_sub_group_id.cl | 7 kernels/builtin_sub_group_size.cl | 7 utests/CMakeLists.txt | 4 +++ utests

[Beignet] [PATCH 3/4] Backend: Add subgroup work item builtin functions

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli Refine some old simd functions. Signed-off-by: Pan Xiuli --- backend/src/libocl/tmpl/ocl_simd.tmpl.cl | 18 ++ backend/src/libocl/tmpl/ocl_simd.tmpl.h| 5 + backend/src/llvm/llvm_gen_backend.cpp | 4 backend/src/llvm/llvm_gen_ocl_function.h

[Beignet] [PATCH 1/4] Runtime: Fix thread id calculation.

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli Should use curr with simd_sz to get thread simd_sz Signed-off-by: Pan Xiuli --- src/cl_command_queue_gen7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cl_command_queue_gen7.c b/src/cl_command_queue_gen7.c index 1921744..f6ee6b0 100644 --- a/src/cl_co

[Beignet] [PATCH V2 2/2] Utest: Remove some unsuport work group tests

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli HSW and IVB do not support the long type, so hide these tests for now. V2: Remove some unsupported kernels. Signed-off-by: Pan Xiuli --- kernels/compiler_workgroup_broadcast.cl | 11 ++- utests/compiler_workgroup_broadcast.cpp | 6 +++--- utests/compiler_workgroup_reduce

[Beignet] [PATCH 1/4] Backend: Refine returen value of sub_group_all/any to 1

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- backend/src/backend/gen_insn_selection.cpp | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/backend/src/backend/gen_insn_selection.cpp b/backend/src/backend/gen_insn_selection.cpp index 07901a6..83b35cf 100644 --- a/backen

[Beignet] [PATCH 2/4] Utest: Remove old sub_group_all/any utest

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli These utests do not follow the spec, so just remove them. Signed-off-by: Pan Xiuli --- kernels/compiler_sub_group_all.cl | 12 --- kernels/compiler_sub_group_any.cl | 15 -- utests/CMakeLists.txt | 2 -- utests/compiler_sub_group_all.cpp | 43 --

[Beignet] [PATCH 4/4] Utest: Add test case for sub_group functions

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli The long type still needs to be fixed before gen8, so hide those cases for now. Signed-off-by: Pan Xiuli --- kernels/compiler_subgroup_broadcast.cl | 34 +++ kernels/compiler_subgroup_reduce.cl | 136 ++ kernels/compiler_subgroup_scan_exclusive.cl | 98 +++ kernels/compile

[Beignet] [PATCH 3/4] Backend: Add sub_group built-in functions for intel extension

2016-05-16 Thread Xiuli Pan
From: Pan Xiuli Add sub_group_reduce/exclusive/inclusive_max/min/add builtin functions. They share the in-thread algorithm of the work group functions. Signed-off-by: Pan Xiuli --- backend/src/backend/gen8_context.cpp | 23 backend/src/backend/gen8_context.hpp |
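For readers unfamiliar with the reduce and scan builtins named above, here is a rough host-side sketch of their semantics for the add variant. It assumes each lane's input sits in a plain array; the helper names are hypothetical, and this is not the in-thread algorithm the patch refers to:

```c
#include <stddef.h>

/* Host-side emulation only; in[i] is the value contributed by lane i (assumed layout). */

/* Reduce: every lane receives the sum over all lanes. */
int emu_reduce_add(const int *in, size_t lanes)
{
    int sum = 0;
    for (size_t i = 0; i < lanes; ++i)
        sum += in[i];
    return sum;
}

/* Inclusive scan: lane i receives in[0] + ... + in[i]. */
void emu_scan_inclusive_add(const int *in, int *out, size_t lanes)
{
    int running = 0;
    for (size_t i = 0; i < lanes; ++i) {
        running += in[i];
        out[i] = running;
    }
}

/* Exclusive scan: lane i receives in[0] + ... + in[i-1]; lane 0 gets the identity 0. */
void emu_scan_exclusive_add(const int *in, int *out, size_t lanes)
{
    int running = 0;
    for (size_t i = 0; i < lanes; ++i) {
        out[i] = running;
        running += in[i];
    }
}
```

The max and min variants follow the same shape, with the operator swapped and the identity changed to the smallest or largest representable value respectively.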

[Beignet] [PATCH V2 5/5] Runtime: Fix memleak in build program for bin

2016-05-17 Thread Xiuli Pan
From: Pan Xiuli V2: Remove repeated setting. Signed-off-by: Pan Xiuli --- src/cl_program.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cl_program.c b/src/cl_program.c index b4656ce..e079a77 100644 --- a/src/cl_program.c +++ b/src/cl_program.c @@ -601,7 +601,7 @@ cl_pro

[Beignet] [PATCH 1/2] Backend: Add intel_sub_group_block_read/write from buffer

2016-05-19 Thread Xiuli Pan
From: Pan Xiuli Use OWORD_BLOCK_RW to read/write a block of data for a thread. Signed-off-by: Pan Xiuli --- backend/src/backend/gen/gen_mesa_disasm.c | 15 + backend/src/backend/gen_context.cpp| 63 ++ backend/src/backend/gen_context.hpp
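A sketch of the addressing pattern a sub-group block read/write from a buffer is generally understood to follow, assuming the strided layout described for the cl_intel_subgroups extension: a lane's k-th element lives at p[lane + k * sg_size]. The helper names and parameters are hypothetical, and the code runs on the host; it does not model the OWORD message itself:

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side emulation only. Assumption: element k of a given lane is located
 * at p[lane + k * sg_size], where sg_size is the sub-group width. */

/* Copies the `vec` elements belonging to `lane` out of the block at p. */
void emu_block_read(const uint32_t *p, size_t sg_size, size_t vec,
                    size_t lane, uint32_t *out)
{
    for (size_t k = 0; k < vec; ++k)
        out[k] = p[lane + k * sg_size];
}

/* Stores the `vec` elements belonging to `lane` into the block at p. */
void emu_block_write(uint32_t *p, size_t sg_size, size_t vec,
                     size_t lane, const uint32_t *data)
{
    for (size_t k = 0; k < vec; ++k)
        p[lane + k * sg_size] = data[k];
}
```

With this layout, consecutive lanes touch consecutive addresses for each k, which is what lets one thread move the whole block in a single wide access.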

[Beignet] [PATCH 2/2] Utest: Add test case for block read/write buffer

2016-05-19 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- kernels/compiler_subgroup_block_read.cl | 31 + kernels/compiler_subgroup_block_write.cl | 27 + utests/CMakeLists.txt| 2 + utests/compiler_subgroup_block_read.cpp | 197 +++ utests/compi

[Beignet] [PATCH 01/12] Runtime: Fix thread id calculation.

2016-05-26 Thread Xiuli Pan
From: Pan Xiuli Should use curr with simd_sz to get thread simd_sz Signed-off-by: Pan Xiuli --- src/cl_command_queue_gen7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cl_command_queue_gen7.c b/src/cl_command_queue_gen7.c index 4caa7e7..6a9cf1f 100644 --- a/src/cl_co

[Beignet] [PATCH 07/12] Backend: Add sub_group built-in functions for intel extension

2016-05-26 Thread Xiuli Pan
From: Pan Xiuli Add sub_group_reduce/exclusive/inclusive_max/min/add builtin functions. They share the in-thread algorithm of the work group functions. Signed-off-by: Pan Xiuli --- backend/src/backend/gen8_context.cpp | 23 backend/src/backend/gen8_context.hpp |

[Beignet] [PATCH 09/12] Backend: Add intel_sub_group_block_read/write from buffer

2016-05-26 Thread Xiuli Pan
From: Pan Xiuli Use OWORD block read/write to access a block of data for a thread. V2: Refine some register types. Signed-off-by: Pan Xiuli --- backend/src/backend/gen/gen_mesa_disasm.c | 15 + backend/src/backend/gen_context.cpp| 52 +++ backend/src/backend

[Beignet] [PATCH 04/12] Utest: Add subgroup work item test cases

2016-05-26 Thread Xiuli Pan
From: Pan Xiuli Signed-off-by: Pan Xiuli --- kernels/builtin_max_sub_group_size.cl | 7 kernels/builtin_num_sub_groups.cl | 7 kernels/builtin_sub_group_id.cl | 7 kernels/builtin_sub_group_size.cl | 7 utests/CMakeLists.txt | 4 +++ utests

[Beignet] [PATCH 10/12] Utest: Add test case for block read/write buffer

2016-05-26 Thread Xiuli Pan
From: Pan Xiuli V2: Rename test case to buffer block read/write test Signed-off-by: Pan Xiuli --- kernels/compiler_subgroup_buffer_block_read.cl | 31 kernels/compiler_subgroup_buffer_block_write.cl | 27 utests/CMakeLists.txt | 2 + utests/compiler_sub

[Beignet] [PATCH 11/12] Backend: Add intel_sub_group_block_read/write from image

2016-05-26 Thread Xiuli Pan
From: Pan Xiuli Use media block read/write to read data in blocks. In simd16 mode they need some register relocation for later use. GEN7 has a somewhat different data port. Signed-off-by: Pan Xiuli --- backend/src/backend/gen/gen_mesa_disasm.c | 27 ++- backend/src/backend/gen7_encoder.cpp
