From: Pan Xiuli
Add test cases for the pipe builtins with a user struct type and int type, and
a runtime test for clCreatePipe and pipe query.
Signed-off-by: Pan Xiuli
---
kernels/compiler_pipe_builtin.cl | 117 +++
utests/CMakeLists.txt| 4 +-
utests/compiler_pi
From: Pan Xiuli
Get the pipe packet type from metadata and pass the type size to the kernel;
check in clSetKernelArg that the type size fits.
Signed-off-by: Pan Xiuli
---
backend/src/backend/program.cpp | 2 ++
backend/src/backend/program.h | 1 +
backend/src/ir/function.hpp | 1 +
backe
From: Pan Xiuli
Add clCreatePipe and clGetPipeInfo
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 70 +
src/cl_device_id.c | 3 ++
src/cl_device_id.h | 3 ++
src/cl_gt_device.h | 3 ++
src/cl_khr_icd.c | 4 +--
src/cl_mem.c | 100
From: Pan Xiuli
Add pipe builtin functions
Signed-off-by: Pan Xiuli
---
backend/src/backend/context.cpp| 5 +
backend/src/backend/gen_reg_allocation.cpp | 3 +-
backend/src/backend/program.h | 1 +
backend/src/ir/function.cpp| 1 +
backend/src/i
From: Pan Xiuli
Gen doesn't have a trailing zero detection function. Use bit field
reverse to reverse the integer first, then leading zero detection
to get the number of trailing zeros. Also add some workarounds for the
unsupported short and char types to get the expected result.
V2: Add missing file ocl_ctz.ll
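The trick described in this patch (reverse the bits, then count leading zeros) can be sketched on the host as follows. This is only an illustration in plain C++, not the code from ocl_ctz.ll: the helper names are hypothetical, and the guard-bit handling for the narrow types is an assumed workaround, not necessarily the one the patch uses.

```cpp
#include <cstdint>

// Reverse the bit order of a 32-bit value (software stand-in for a
// hardware bit-field-reverse instruction).
static uint32_t reverse_bits32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

// Count leading zeros; yields 32 for a zero input.
static uint32_t clz32(uint32_t x) {
    uint32_t n = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
        ++n;
    return n;
}

// Trailing zeros of x are the leading zeros of the bit-reversed x.
static uint32_t ctz32(uint32_t x) { return clz32(reverse_bits32(x)); }

// Assumed workaround for narrow types: set a guard bit just above the
// type width so that a zero input yields the type's width, not 32.
static uint32_t ctz16(uint16_t x) { return ctz32(uint32_t(x) | 0x10000u); }
static uint32_t ctz8(uint8_t x) { return ctz32(uint32_t(x) | 0x100u); }
```

With this sketch a zero input gives 32, 16 and 8 for the 32-, 16- and 8-bit variants respectively, which is the boundary case the ctz unit test in this series refers to.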
From: Pan Xiuli
Check all types of the ctz function and the boundary case of a zero input.
V2: Fix type warning
Signed-off-by: Pan Xiuli
---
kernels/compiler_ctz.cl | 16 +
utests/CMakeLists.txt | 1 +
utests/compiler_ctz.cpp | 62 +
3 files changed, 79
From: Pan Xiuli
Check all types of the ctz function and the boundary case of a zero input.
V2: Fix type warning
Signed-off-by: Pan Xiuli
---
kernels/compiler_ctz.cl | 16 +
utests/CMakeLists.txt | 1 +
utests/compiler_ctz.cpp | 62 +
3 files changed, 79
From: Pan Xiuli
Gen doesn't have a trailing zero detection function. Use bit field
reverse to reverse the integer first, then leading zero detection
to get the number of trailing zeros. Also add some workarounds for the
unsupported short and char types to get the expected result.
V2: Add missing file ocl_ctz.ll
From: Pan Xiuli
Reuse sampler test and add test for new api
clCreateSamplerWithProperties.
Signed-off-by: Pan Xiuli
---
utests/CMakeLists.txt | 1 +
utests/compiler_sampler.cpp | 14 +-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/utests/CMakeLists.txt b/ut
From: Pan Xiuli
Add api clCreateSamplerWithProperties
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 62
src/cl_khr_icd.c | 2 +-
2 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/src/cl_api.c b/src/cl_api.c
index 840d57f.
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/libocl/include/ocl_types.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/backend/src/libocl/include/ocl_types.h
b/backend/src/libocl/include/ocl_types.h
index eb4c3b4..1ca6bca 100644
--- a/backend/src/libocl/inc
From: Yang Rong
The extra exec info needs a reloc, otherwise the GPU can't read/write it.
It doesn't need to be set in the curbe,
so reloc it to an unused binding table.
Signed-off-by: Yang Rong
---
src/cl_api.c| 31 +--
src/cl_command_queue.c | 38 +
From: Pan Xiuli
Add device queries for OpenCL 2.0
Signed-off-by: Pan Xiuli
---
src/cl_device_id.c | 14 +-
src/cl_device_id.h | 16 ++--
src/cl_gt_device.h | 14 +-
3 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/src/cl_device_id.c b/src/cl_d
From: Yang Rong
1. Correct the context's svm list on delete.
2. Set the svm sub buffer's offset when binding the buffer.
Signed-off-by: Yang Rong
---
src/cl_command_queue.c | 6 +-
src/cl_mem.c | 18 +++---
2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/src/c
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
include/CL/cl.h | 19 ---
include/CL/cl_d3d10.h | 7 ++-
include/CL/cl_d3d11.h | 7 ++-
include/CL/cl_dx9_media_sharing.h | 9 +++--
include/CL/cl_egl.h | 9 +
From: Luo Xionghu
define a MACRO to hold the value.
v2: use same MACRO in cl_extensions.h; add header file protection for
cl_extension.h.
Signed-off-by: Luo Xionghu
Reviewed-by: "Yang, Rong R"
---
src/cl_device_id.h | 5 -
src/cl_extensions.c | 2 +-
src/cl_extensions.h | 6 +-
3 fi
From: Pan Xiuli
Add extensions requested by spec to base extensions.
Signed-off-by: Pan Xiuli
---
src/cl_extensions.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/cl_extensions.h b/src/cl_extensions.h
index c42d364..5b2f8a8 100644
--- a/src/cl_extensions.h
+++
From: Pan Xiuli
Add pipe builtin functions
v2: Refine type size to be system determined
Signed-off-by: Pan Xiuli
---
backend/src/backend/context.cpp| 5 +
backend/src/backend/gen_reg_allocation.cpp | 3 +-
backend/src/backend/program.h | 1 +
backend/src/ir/func
From: Pan Xiuli
Add clCreatePipe and clGetPipeInfo
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 70 +
src/cl_device_id.c | 3 ++
src/cl_device_id.h | 3 ++
src/cl_gt_device.h | 3 ++
src/cl_khr_icd.c | 4 +--
src/cl_mem.c | 100
From: Pan Xiuli
Get the pipe packet type from metadata and pass the type size to the kernel;
check in clSetKernelArg that the type size fits.
Signed-off-by: Pan Xiuli
---
backend/src/backend/program.cpp | 2 ++
backend/src/backend/program.h | 1 +
backend/src/ir/function.hpp | 1 +
backe
From: Pan Xiuli
Add test cases for the pipe builtins with a user struct type and int type, and
a runtime test for clCreatePipe and pipe query.
Signed-off-by: Pan Xiuli
---
kernels/compiler_pipe_builtin.cl | 117 +++
utests/CMakeLists.txt| 4 +-
utests/compiler_pi
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/libocl/tmpl/ocl_defines.tmpl.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/backend/src/libocl/tmpl/ocl_defines.tmpl.h
b/backend/src/libocl/tmpl/ocl_defines.tmpl.h
index 8d41449..ae30e08 100644
--- a/backend/src/libocl/tmpl/ocl
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/backend/program.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/backend/src/backend/program.cpp b/backend/src/backend/program.cpp
index afae77b..0119670 100644
--- a/backend/src/backend/program.cpp
+++ b/backend/sr
From: Luo Xionghu
Contributor: Luo Xionghu
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 69 +++
src/cl_khr_icd.c | 22 ---
utests/profiling_exec.cpp | 3 ++-
utests/utest_helper.cpp | 4 +--
4 files changed, 86 in
From: Pan Xiuli
Add test cases for the pipe builtins with a user struct type and int type, and
a runtime test for clCreatePipe and pipe query.
Signed-off-by: Pan Xiuli
---
kernels/compiler_pipe_builtin.cl | 117 +++
utests/CMakeLists.txt| 4 +-
utests/compiler_pi
From: Pan Xiuli
Add clCreatePipe and clGetPipeInfo
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 70 +
src/cl_device_id.c | 3 ++
src/cl_device_id.h | 3 ++
src/cl_gt_device.h | 3 ++
src/cl_khr_icd.c | 4 +--
src/cl_mem.c | 100
From: Pan Xiuli
Get the pipe packet type from metadata and pass the type size to the kernel;
check in clSetKernelArg that the type size fits.
Signed-off-by: Pan Xiuli
---
backend/src/backend/program.cpp | 2 ++
backend/src/backend/program.h | 1 +
backend/src/ir/function.hpp | 1 +
backe
From: Pan Xiuli
Add pipe builtin functions
v2: Refine type size to be system determined
Signed-off-by: Pan Xiuli
---
backend/src/backend/context.cpp| 5 +
backend/src/backend/gen_reg_allocation.cpp | 3 +-
backend/src/backend/program.h | 1 +
backend/src/ir/func
From: Luo Xionghu
Contributor: Luo Xionghu
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 70 +++
src/cl_khr_icd.c | 6 +++-
utests/profiling_exec.cpp | 3 +-
utests/utest_helper.cpp | 4 +--
4 files changed, 79 insertions(+),
From: Yang Rong
1. Correct the context's svm list on delete.
2. Set the svm sub buffer's offset when binding the buffer.
Signed-off-by: Yang Rong
Signed-off-by: Pan Xiuli
---
src/cl_command_queue.c | 6 +-
src/cl_mem.c | 18 +++---
2 files changed, 20 insertions(+), 4 deleti
From: Pan Xiuli
Add device queries for OpenCL 2.0
Signed-off-by: Pan Xiuli
---
src/cl_device_id.c | 14 +-
src/cl_device_id.h | 16 ++--
src/cl_gt_device.h | 14 +-
3 files changed, 40 insertions(+), 4 deletions(-)
diff --git a/src/cl_device_id.c b/src/cl_d
From: Pan Xiuli
Reuse sampler test and add test for new api
clCreateSamplerWithProperties.
Signed-off-by: Pan Xiuli
---
utests/CMakeLists.txt | 1 +
utests/compiler_sampler.cpp | 14 +-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/utests/CMakeLists.txt b/ut
From: Pan Xiuli
Add extensions requested by spec to base extensions.
Signed-off-by: Pan Xiuli
---
src/cl_extensions.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/cl_extensions.h b/src/cl_extensions.h
index c42d364..5b2f8a8 100644
--- a/src/cl_extensions.h
+++
From: Pan Xiuli
Add api clCreateSamplerWithProperties
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 62 ++
src/cl_device_id.c | 2 +-
src/cl_khr_icd.c | 2 +-
3 files changed, 64 insertions(+), 2 deletions(-)
diff --git a/src/cl_api.
From: Luo Xionghu
define a MACRO to hold the value.
v2: use same MACRO in cl_extensions.h; add header file protection for
cl_extension.h.
Signed-off-by: Luo Xionghu
Reviewed-by: "Yang, Rong R"
Signed-off-by: Pan Xiuli
---
src/cl_device_id.h | 5 -
src/cl_extensions.c | 2 +-
src/cl_ext
From: Yang Rong
The extra exec info needs a reloc, otherwise the GPU can't read/write it.
It doesn't need to be set in the curbe,
so reloc it to an unused binding table.
Signed-off-by: Yang Rong
Signed-off-by: Pan Xiuli
---
src/cl_api.c| 31 +--
src/cl_command_queue.c
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/libocl/tmpl/ocl_defines.tmpl.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/backend/src/libocl/tmpl/ocl_defines.tmpl.h
b/backend/src/libocl/tmpl/ocl_defines.tmpl.h
index 8d41449..ae30e08 100644
--- a/backend/src/libocl/tmpl/ocl
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/backend/program.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/backend/src/backend/program.cpp b/backend/src/backend/program.cpp
index 8f78e1f..bdcca38 100644
--- a/backend/src/backend/program.cpp
+++ b/backend/sr
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/libocl/include/ocl_types.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/backend/src/libocl/include/ocl_types.h
b/backend/src/libocl/include/ocl_types.h
index eb4c3b4..1ca6bca 100644
--- a/backend/src/libocl/inc
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
src/cl_mem.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/src/cl_mem.c b/src/cl_mem.c
index 0922e0a..6e5488e 100644
--- a/src/cl_mem.c
+++ b/src/cl_mem.c
@@ -1922,7 +1922,9 @@ cl_mem_kernel_copy_image(cl_command_queue queue,
From: Pan Xiuli
clGetMemObjectInfo with CL_MEM_ASSOCIATED_MEMOBJECT should return
the mem in cl_image_desc. As in CL_MEM_OBJECT_IMAGE1D_BUFFER we
copy the buffer, add a workaround for it.
Signed-off-by: Pan Xiuli
---
src/cl_mem.c | 14 ++
src/cl_mem.h | 1 +
2 files changed, 11 in
From: Pan Xiuli
sRGB writes are not supported now, so we should not return these formats
if any write flag is set in cl_mem_flags.
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 1 +
src/cl_image.c | 5 +
src/cl_image.h | 1 +
3 files changed, 7 insertions(+)
diff --git a/src/cl_api.c b/src/cl_api.c
i
From: Pan Xiuli
CL_sRGBA with CL_UNORM_INT8 is the minimum requirement for OpenCL 2.0,
and CL_sBGRA is also supported by hardware.
None of the sRGB surface types support hardware writes.
Signed-off-by: Pan Xiuli
---
backend/src/ocl_common_defines.h | 7 ++-
src/cl_api.c | 2 +-
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
src/cl_mem.c | 22 +-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/src/cl_mem.c b/src/cl_mem.c
index 6e5488e..c41417c 100644
--- a/src/cl_mem.c
+++ b/src/cl_mem.c
@@ -34,6 +34,7 @@
#include
#include
#include
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
src/cl_mem.c | 23 ++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/src/cl_mem.c b/src/cl_mem.c
index 6e5488e..6d45828 100644
--- a/src/cl_mem.c
+++ b/src/cl_mem.c
@@ -34,6 +34,7 @@
#include
#include
#include
From: Pan Xiuli
Add api clCreateSamplerWithProperties
v2: fix bug
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 63 ++
src/cl_device_id.c | 2 +-
src/cl_khr_icd.c | 12 +--
3 files changed, 70 insertions(+), 7 deletions(-)
di
From: Pan Xiuli
Enqueue multiple times if the size is not uniform: at most 2
times for 1D, 4 times for 2D and 8 times for 3D. Use the workdim
offset of the walker in the batch buffer to keep work groups in series.
TODO: handle events for the flush between multiple enqueues
Signed-off-by: Pan Xiuli
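The 2/4/8 bound in this patch follows from splitting each dimension into a uniform part (a multiple of the local size) and a remainder. A minimal sketch of that decomposition, assuming non-zero local sizes; the Region type and split_ndrange function are hypothetical names for illustration and do not model the walker offset or the batch buffer:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// One uniform sub-range to enqueue: where it starts, how big it is,
// and the work-group size used inside it.
struct Region {
    std::array<std::size_t, 3> offset{{0, 0, 0}};
    std::array<std::size_t, 3> global{{1, 1, 1}};
    std::array<std::size_t, 3> local{{1, 1, 1}};
};

// Split a possibly non-uniform NDRange into uniform regions.
// Each dimension contributes at most two pieces (the part divisible by
// the local size and the remainder), so the result has at most 2^dims
// regions: 2 for 1D, 4 for 2D, 8 for 3D. Local sizes must be non-zero.
std::vector<Region> split_ndrange(unsigned dims,
                                  const std::array<std::size_t, 3>& global,
                                  const std::array<std::size_t, 3>& local) {
    std::vector<Region> regions(1, Region());
    for (unsigned d = 0; d < dims && d < 3; ++d) {
        const std::size_t uniform = (global[d] / local[d]) * local[d];
        const std::size_t rest = global[d] - uniform;
        std::vector<Region> next;
        for (const Region& r : regions) {
            if (uniform > 0) {           // part covered by full work groups
                Region u = r;
                u.offset[d] = 0;
                u.global[d] = uniform;
                u.local[d] = local[d];
                next.push_back(u);
            }
            if (rest > 0) {              // leftover part, one smaller group
                Region t = r;
                t.offset[d] = uniform;
                t.global[d] = rest;
                t.local[d] = rest;
                next.push_back(t);
            }
        }
        regions.swap(next);
    }
    return regions;
}
```

For example, a 1D range of 10 work items with a local size of 4 yields two regions: 8 items at offset 0 with local size 4, and 2 items at offset 8 with local size 2.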
From: Pan Xiuli
Use curbe registers for these two sizes.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen_insn_selection.cpp | 7 +++--
backend/src/backend/program.h | 3 ++
backend/src/ir/profile.cpp | 4 +++
backend/src/ir/profile.hpp | 47 ++
From: Pan Xiuli
Math built-ins no longer need address space, so remove them.
Signed-off-by: Pan Xiuli
---
backend/src/libocl/tmpl/ocl_math.tmpl.cl | 122 ---
backend/src/libocl/tmpl/ocl_math.tmpl.h | 48 +++-
2 files changed, 25 insertions(+), 145 deletions(
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/libocl/include/ocl_types.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/backend/src/libocl/include/ocl_types.h
b/backend/src/libocl/include/ocl_types.h
index 88e5642..2ae1562 100644
--- a/backend/src/libocl/include/ocl_types.h
++
From: Pan Xiuli
V2: refined with clang macro __INTPTR_WIDTH__ and
clang built-in types __INTXX_TYPE__
Signed-off-by: Pan Xiuli
---
backend/src/backend/program.cpp| 4
backend/src/libocl/CMakeLists.txt | 16 +++-
backend/src/libocl/include/ocl_types.h | 9 +++
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/libocl/include/ocl_workitem.h | 20 ++--
backend/src/libocl/src/ocl_workitem.cl| 8
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/backend/src/libocl/include/ocl_workitem.h
b/backend/src/li
From: Pan Xiuli
Add Skylake workstation device and desktop GT4
Signed-off-by: Pan Xiuli
---
src/cl_device_data.h | 16
src/cl_device_id.c | 8
2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/src/cl_device_data.h b/src/cl_device_data.h
index 63e078f.
From: Pan Xiuli
Create a new spir file for OpenCL2.0 and spir64
Signed-off-by: Pan Xiuli
---
kernels/compiler_ceil64.spir | Bin 0 -> 2152 bytes
utests/load_program_from_spir.cpp | 5 -
2 files changed, 4 insertions(+), 1 deletion(-)
create mode 100644 kernels/compiler_ceil64.spir
Hi Ruiling,
The patch set LGTM, only a small problem about syncFieldNum.
Thanks
Xiuli
On Fri, Apr 01, 2016 at 02:53:24PM +0800, Ruiling Song wrote:
> to do an image barrier, we need to:
> 1. flush L3 RW cache.
> 2. do a barrier gateway.
> 3. flush sampler cache.
>
> Note the fence argument may
From: Pan Xiuli
Add CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE for api
clGetProgramBuildInfo, return the constantset size from backend.
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/cl_api.c b/src/cl_api.c
index 5697e34..74e2b72 100644
--
On Thu, Mar 31, 2016 at 06:28:33PM +0300, grigore.lupe...@intel.com wrote:
> From: Grigore Lupescu
>
> Signed-off-by: Grigore Lupescu
> ---
> backend/src/backend/gen_context.cpp | 329
>
> 1 file changed, 180 insertions(+), 149 deletions(-)
>
> diff --git
Hi Grigore,
I found what is wrong with broadcast 2D and 3D, see inline comments.
You are using int for the input data!
Thanks
Xiuli
On Mon, Apr 11, 2016 at 05:40:56PM +0300, Grigore Lupescu wrote:
> From: Grigore Lupescu
>
> Added the following unit tests:
> compiler_workgroup_broadcast_1D_in
The numeric_limits min and max for float cannot give inf and -inf; they
only return the minimum and maximum positive finite values. But the device
will return inf and -inf.
So the EXP value for the index 0 float max and min should be refined with
has_infinity and infinity.
On Mon, Apr 11, 2016 at 05:40:00
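The point about numeric_limits in the reply above can be illustrated with a small sketch. The helper names max_identity and min_identity are hypothetical, not taken from the unit tests under review; the sketch only shows how has_infinity and infinity() could select the expected starting value for a max or min operation.

```cpp
#include <limits>

// Starting value for a "max" operation: -inf for floating-point types,
// the lowest representable value for integer types.
template <typename T>
T max_identity() {
    return std::numeric_limits<T>::has_infinity
               ? -std::numeric_limits<T>::infinity()
               : std::numeric_limits<T>::lowest();
}

// Starting value for a "min" operation: +inf for floating-point types,
// the largest representable value for integer types.
template <typename T>
T min_identity() {
    return std::numeric_limits<T>::has_infinity
               ? std::numeric_limits<T>::infinity()
               : std::numeric_limits<T>::max();
}
```

Note that std::numeric_limits<float>::min() is the smallest positive normal value and max() is the largest finite value, so neither can stand in for -inf or inf.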
char and short are not required by the spec, and they have very special max and
min values as the EXP value. We can just remove them.
On Mon, Apr 11, 2016 at 05:40:00PM +0300, Grigore Lupescu wrote:
> From: Grigore Lupescu
>
> Added the following unit tests:
> compiler_workgroup_scan_exclusive_add_cha
From: Pan Xiuli
Fix the problem with the AND implementation; use the predicate SIMD width to
get the in-thread all and any results.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen_context.cpp | 193 +--
utests/compiler_workgroup_reduce.cpp | 5 +-
2 files changed, 117
From: Pan Xiuli
Fix the problem with the AND implementation; use the predicate SIMD width to
get the in-thread all and any results.
V2: Fix a typo in utest.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen_context.cpp | 193 +--
kernels/compiler_workgroup_reduce.cl | 2
From: Pan Xiuli
We need generic memory space pointer for OpenCL2.0 header.
Signed-off-by: Pan Xiuli
---
backend/src/libocl/script/gen_vector.py | 5 +-
backend/src/libocl/script/ocl_math.def | 84 +
2 files changed, 24 insertions(+), 65 deletions(-)
diff --gi
From: Ruiling Song
Signed-off-by: Ruiling Song
Contributor: Pan Xiuli
---
backend/src/backend/gen8_encoder.cpp | 5 +++--
backend/src/backend/gen_insn_selection.cpp | 12
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/backend/src/backend/gen8_encoder.cpp
b
From: Pan Xiuli
Fix the problem with the AND implementation; use the predicate SIMD width to
get the in-thread all and any results.
V2: Fix a typo in utest.
V3: Remove unnecessary mask.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen_context.cpp | 192 +--
kernels/compi
From: Pan Xiuli
Set memdep-block-scan-limit into llvm context to avoid unfinished GVN
pass.
V2: Revert remove parser llvm first
Signed-off-by: Pan Xiuli
---
backend/src/backend/program.cpp | 22 +-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/backend/src/ba
From: Pan Xiuli
LLVM will combine
%bc = bitcast float %x to i32
%and = and i32 %bc, 2147483647
into
%and = bitcast float %fabs to i32
and fabs ignores denormal numbers, so a workaround is needed for
denormal numbers.
Signed-off-by: Pan Xiuli
---
backend/src/libocl/tmpl/ocl_math.tmpl.cl |
From: Pan Xiuli
Set memdep-block-scan-limit into llvm context to avoid unfinished GVN
pass.
V2: Revert remove parser llvm first
Signed-off-by: Pan Xiuli
---
backend/src/backend/program.cpp | 22 +-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/backend/src/ba
From: Pan Xiuli
V2: Remove check for device queue and add device queue flag.
Signed-off-by: Pan Xiuli
---
src/cl_api.c | 24 +---
src/cl_command_queue.h | 1 +
src/cl_gt_device.h | 2 +-
3 files changed, 19 insertions(+), 8 deletions(-)
diff --git a/src/cl_a
From: Pan Xiuli
We need generic memory space pointer for OpenCL2.0 header.
Signed-off-by: Pan Xiuli
---
backend/src/libocl/script/gen_vector.py | 5 +-
backend/src/libocl/script/ocl_math.def | 84 +
2 files changed, 24 insertions(+), 65 deletions(-)
diff --gi
From: Pan Xiuli
When using clang, a template name cannot be the same as a class variable.
This bug will cause the gen ir load/store switch and cause a self test
error.
Signed-off-by: Pan Xiuli
---
backend/src/llvm/llvm_gen_backend.cpp | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/ir/immediate.hpp | 2 +-
utests/builtin_exp.cpp | 2 +-
utests/utest_generator.py| 4 ++--
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/backend/src/ir/immediate.hpp b/backend/src/ir/immediate.hpp
index 3141643.
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
src/cl_command_queue.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/cl_command_queue.c b/src/cl_command_queue.c
index 7a432a0..b66928f 100644
--- a/src/cl_command_queue.c
+++ b/src/cl_command_queue.c
@@ -102,6 +102,7 @@ cl_command_queue_de
From: Pan Xiuli
We need to free what we get from abi::__cxa_demangle
Signed-off-by: Pan Xiuli
---
backend/src/llvm/llvm_gen_backend.hpp | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/backend/src/llvm/llvm_gen_backend.hpp
b/backend/src/llvm/llvm_gen_backend.hpp
index 236
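The leak fixed by this patch comes from the fact that abi::__cxa_demangle returns a malloc-allocated buffer that the caller must free. A minimal sketch of the pattern, assuming a GCC- or Clang-style toolchain that provides <cxxabi.h>; the demangle wrapper is a hypothetical helper, not the function in llvm_gen_backend.hpp:

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Demangle an Itanium-ABI symbol name. Falls back to the input string
// when demangling fails. The buffer returned by abi::__cxa_demangle is
// allocated with malloc and must be released with free.
std::string demangle(const char* mangled) {
    int status = 0;
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && raw != nullptr) ? raw : mangled;
    std::free(raw);  // free(nullptr) is a no-op, so this is safe on failure
    return result;
}
```

Copying the result into a std::string before freeing keeps ownership simple for callers.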
From: Pan Xiuli
Free all memory allocated and release all cl objects.
Signed-off-by: Pan Xiuli
---
utests/compare_image_2d_and_1d_array.cpp | 2 ++
utests/compiler_fill_image_1d_array.cpp | 1 +
utests/compiler_function_qualifiers.cpp | 1 +
utests/image_1D_buffer.cpp | 1 +
ut
From: Pan Xiuli
Should pass pointer of new printf_fmt into map for later delete.
Signed-off-by: Pan Xiuli
---
backend/src/ir/unit.cpp | 1 +
backend/src/ir/unit.hpp | 2 +-
backend/src/llvm/llvm_gen_backend.cpp | 2 +-
backend/src/llvm/llvm_printf_parser.cpp |
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
src/cl_program.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/cl_program.c b/src/cl_program.c
index b4656ce..85dc184 100644
--- a/src/cl_program.c
+++ b/src/cl_program.c
@@ -293,6 +293,7 @@ cl_program_create_from_binary(c
From: Frank Dittrich
Patch from: https://bugs.freedesktop.org/show_bug.cgi?id=93625
Signed-off-by: Frank Dittrich
---
backend/src/gbe_bin_generater.cpp | 2 ++
1 file changed, 2 insertions(+)
diff --git a/backend/src/gbe_bin_generater.cpp
b/backend/src/gbe_bin_generater.cpp
index 8225d4a..7e
From: Pan Xiuli
The map in utest has changed; the benchmark needs to change as well.
Signed-off-by: Pan Xiuli
---
benchmark/benchmark_copy_image.cpp | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/benchmark/benchmark_copy_image.cpp
b/benchmark/benchmark_copy_image.cpp
index 1d35
From: Pan Xiuli
The mem object will be released by the benchmark runner.
Fixed bug:
https://bugs.freedesktop.org/show_bug.cgi?id=93627
Signed-off-by: Pan Xiuli
---
benchmark/benchmark_copy_buffer.cpp | 1 -
benchmark/benchmark_copy_image.cpp | 1 -
benchmark/benchmark_read_buffer.cpp
From: Pan Xiuli
Since long type is not supported before gen8, need to make a copy for
future change.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen8_context.cpp | 528 +++
backend/src/backend/gen8_context.hpp | 2 +
backend/src/backend/gen_context.hpp |
From: Pan Xiuli
HSW and IVB do not support the long type, so hide these tests for now.
Signed-off-by: Pan Xiuli
---
utests/compiler_workgroup_broadcast.cpp | 6 +++---
utests/compiler_workgroup_reduce.cpp | 12 ++--
utests/compiler_workgroup_scan_exclusive.cpp | 12 ++--
u
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
include/CL/cl_intel.h | 27 +
src/cl_api.c | 20 +
src/cl_device_id.c| 83 +++
src/cl_device_id.h| 9 ++
4 files changed, 139 insertions(+)
diff --git
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
kernels/builtin_max_sub_group_size.cl | 7
kernels/builtin_num_sub_groups.cl | 7
kernels/builtin_sub_group_id.cl | 7
kernels/builtin_sub_group_size.cl | 7
utests/CMakeLists.txt | 4 +++
utests
From: Pan Xiuli
Refine some old simd functions.
Signed-off-by: Pan Xiuli
---
backend/src/libocl/tmpl/ocl_simd.tmpl.cl | 18 ++
backend/src/libocl/tmpl/ocl_simd.tmpl.h| 5 +
backend/src/llvm/llvm_gen_backend.cpp | 4
backend/src/llvm/llvm_gen_ocl_function.h
From: Pan Xiuli
Should use curr with simd_sz to get the thread simd_sz
Signed-off-by: Pan Xiuli
---
src/cl_command_queue_gen7.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/cl_command_queue_gen7.c b/src/cl_command_queue_gen7.c
index 1921744..f6ee6b0 100644
--- a/src/cl_co
From: Pan Xiuli
HSW and IVB do not support the long type, so hide these tests for now.
V2: Remove some unsupported kernels.
Signed-off-by: Pan Xiuli
---
kernels/compiler_workgroup_broadcast.cl | 11 ++-
utests/compiler_workgroup_broadcast.cpp | 6 +++---
utests/compiler_workgroup_reduce
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen_insn_selection.cpp | 8 ++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/backend/src/backend/gen_insn_selection.cpp
b/backend/src/backend/gen_insn_selection.cpp
index 07901a6..83b35cf 100644
--- a/backen
From: Pan Xiuli
These utests do not follow the spec, so just remove them.
Signed-off-by: Pan Xiuli
---
kernels/compiler_sub_group_all.cl | 12 ---
kernels/compiler_sub_group_any.cl | 15 --
utests/CMakeLists.txt | 2 --
utests/compiler_sub_group_all.cpp | 43 --
From: Pan Xiuli
The long type needs to be fixed before gen8, so hide these tests for now.
Signed-off-by: Pan Xiuli
---
kernels/compiler_subgroup_broadcast.cl | 34 +++
kernels/compiler_subgroup_reduce.cl | 136 ++
kernels/compiler_subgroup_scan_exclusive.cl | 98 +++
kernels/compile
From: Pan Xiuli
Add sub_group_reduce/exclusive/inclusive_max/min/add builtin functions.
They share the in thread algorithm of work group functions.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen8_context.cpp | 23
backend/src/backend/gen8_context.hpp |
From: Pan Xiuli
V2: Remove repeat setting.
Signed-off-by: Pan Xiuli
---
src/cl_program.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/cl_program.c b/src/cl_program.c
index b4656ce..e079a77 100644
--- a/src/cl_program.c
+++ b/src/cl_program.c
@@ -601,7 +601,7 @@ cl_pro
From: Pan Xiuli
Using OWORD_BLOCK_RW to read/write a block of data for a thread.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen/gen_mesa_disasm.c | 15 +
backend/src/backend/gen_context.cpp| 63 ++
backend/src/backend/gen_context.hpp
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
kernels/compiler_subgroup_block_read.cl | 31 +
kernels/compiler_subgroup_block_write.cl | 27 +
utests/CMakeLists.txt| 2 +
utests/compiler_subgroup_block_read.cpp | 197 +++
utests/compi
From: Pan Xiuli
Should use curr with simd_sz to get the thread simd_sz
Signed-off-by: Pan Xiuli
---
src/cl_command_queue_gen7.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/cl_command_queue_gen7.c b/src/cl_command_queue_gen7.c
index 4caa7e7..6a9cf1f 100644
--- a/src/cl_co
From: Pan Xiuli
Add sub_group_reduce/exclusive/inclusive_max/min/add builtin functions.
They share the in thread algorithm of work group functions.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen8_context.cpp | 23
backend/src/backend/gen8_context.hpp |
From: Pan Xiuli
Use OWORD block read/write for a block of data for a thread.
V2: Refine some register type.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen/gen_mesa_disasm.c | 15 +
backend/src/backend/gen_context.cpp| 52 +++
backend/src/backend
From: Pan Xiuli
Signed-off-by: Pan Xiuli
---
kernels/builtin_max_sub_group_size.cl | 7
kernels/builtin_num_sub_groups.cl | 7
kernels/builtin_sub_group_id.cl | 7
kernels/builtin_sub_group_size.cl | 7
utests/CMakeLists.txt | 4 +++
utests
From: Pan Xiuli
V2: Rename test case to buffer block read/write test
Signed-off-by: Pan Xiuli
---
kernels/compiler_subgroup_buffer_block_read.cl | 31
kernels/compiler_subgroup_buffer_block_write.cl | 27
utests/CMakeLists.txt | 2 +
utests/compiler_sub
From: Pan Xiuli
Use media block read/write to read data in blocks. In simd16 mode this
needs some reg relocation for later use.
GEN7 has a somewhat different data port.
Signed-off-by: Pan Xiuli
---
backend/src/backend/gen/gen_mesa_disasm.c | 27 ++-
backend/src/backend/gen7_encoder.cpp