[Beignet] [PATCH V2] GBE: use native exp instruction when enough precision

2014-01-20 Thread Guo Yejun
for the input data with enough precision, use the native exp instruction, otherwise, use the software path to emulate the exp function. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/gen_insn_selection.cpp |1 + backend/src/ir/instruction.hpp |2

Re: [Beignet] [PATCH] BGE: add param to switch the behavior of math func

2014-02-13 Thread Guo, Yejun
, :( Thanks Yejun -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Thursday, February 13, 2014 3:11 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH] BGE: add param to switch the behavior of math func One typo in the subject

[Beignet] [PATCH V2] GBE: add param to switch the behavior of math func

2014-02-13 Thread Guo Yejun
will be added later. Signed-off-by: Guo Yejun yejun@intel.com --- backend/CMakeLists.txt |3 ++- backend/src/CMakeLists.txt | 13 ++--- backend/src/GBEConfig.h.in |1 + backend/src/backend/program.cpp | 12 +++- backend/src/ocl_stdlib.tmpl.h

Re: [Beignet] [PATCH V2] GBE: add param to switch the behavior of math func

2014-02-13 Thread Guo, Yejun
error is very big. Thanks Yejun -Original Message- From: Sun, Yi Sent: Friday, February 14, 2014 9:28 AM To: Guo, Yejun; beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: RE: [Beignet] [PATCH V2] GBE: add param to switch the behavior of math func BTW, what kind of precision can

Re: [Beignet] [PATCH V2] GBE: add param to switch the behavior of math func

2014-02-14 Thread Guo, Yejun
use a function name like __gen_ocl_internal_fastpath_sin, do you think it is ok? Thanks. Thanks Yejun -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Friday, February 14, 2014 3:04 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re

Re: [Beignet] [PATCH V2] GBE: add param to switch the behavior of math func

2014-02-14 Thread Guo, Yejun
[mailto:zhigang.g...@linux.intel.com] Sent: Friday, February 14, 2014 4:09 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH V2] GBE: add param to switch the behavior of math func On Fri, Feb 14, 2014 at 08:20:47AM +, Guo, Yejun wrote: With this design, we can

[Beignet] [PATCH V3] GBE: add param to switch the behavior of math func

2014-02-17 Thread Guo Yejun
support will be added later. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/program.cpp | 12 backend/src/builtin_vector_proto.def |4 backend/src/ocl_stdlib.tmpl.h|8 3 files changed, 24 insertions(+) diff --git a/backend/src

[Beignet] [PATCH] GBE: support getelementptr with ConstantExpr operand

2014-02-26 Thread Guo Yejun
Add support during LLVM IR -> Gen IR period when the first operand of getelementptr is ConstantExpr. utest is also added. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/llvm/llvm_gen_backend.cpp | 10 +- kernels/compiler_getelementptr_bitcast.cl | 18 +++ utests

[Beignet] [PATCH] GBE: show correct line number in build log

2014-02-27 Thread Guo Yejun
Sometimes we insert some code into the kernel, which makes the line number reported in the build log mismatch the line number in the kernel from the programmer's view; use #line to correct it. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/program.cpp |5 + 1 file changed
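
The idea in the commit message above can be sketched in C as follows. This is a minimal illustration, not the code from the patch: the helper name prepend_with_line_reset and the exact "#line 1" placement are assumptions, since the snippet does not show what program.cpp actually does. It glues driver-injected text in front of the user's kernel and emits a #line directive so that the compiler starts counting again at the first line of the user's source.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the Beignet sources): prepend injected code
 * to a kernel source and reset line numbering with #line, so diagnostics
 * refer to line numbers in the user's own text. The caller frees the result. */
static char *prepend_with_line_reset(const char *injected, const char *user_src)
{
    /* "#line 1" makes the line that follows it count as line 1. */
    static const char reset[] = "\n#line 1\n";
    size_t len = strlen(injected) + (sizeof(reset) - 1) + strlen(user_src) + 1;
    char *out = malloc(len);

    if (out == NULL)
        return NULL;
    snprintf(out, len, "%s%s%s", injected, reset, user_src);
    return out;
}
```

Calling prepend_with_line_reset("#define X 1", src) yields the injected define, then "#line 1", then the unmodified kernel text, so an error on the user's third line is still reported as line 3.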

Re: [Beignet] [PATCH] merge some state buffers into one buffer

2014-03-06 Thread Guo, Yejun
, March 07, 2014 11:14 AM To: Guo, Yejun; beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: RE: [Beignet] [PATCH] merge some state buffers into one buffer 2 comments. -Original Message- From: beignet-boun...@lists.freedesktop.org [mailto:beignet-boun...@lists.freedesktop.org] On Behalf

[Beignet] [PATCH 2/2] add test for __gen_ocl_simd_any and __gen_ocl_simd_all

2014-04-18 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- kernels/compiler_simd_all.cl | 12 kernels/compiler_simd_any.cl | 15 +++ utests/CMakeLists.txt| 2 ++ utests/compiler_simd_all.cpp | 43 +++ utests/compiler_simd_any.cpp

[Beignet] [PATCH 1/2] support __gen_ocl_simd_any and __gen_ocl_simd_all

2014-04-18 Thread Guo Yejun
: for(; ; ) { ... if (__gen_ocl_simd_any(...)) break; //the whole SIMD stop the searching } Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/gen_insn_selection.cpp | 63 ++ backend/src/ir/instruction.hpp | 4 ++ backend/src/ir/instruction.hxx | 2

[Beignet] [PATCH] add support for cross compiler

2014-04-23 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- CMake/FindLLVM.cmake | 2 +- backend/src/CMakeLists.txt | 8 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/CMake/FindLLVM.cmake b/CMake/FindLLVM.cmake index 97ee7db..556b3a9 100644 --- a/CMake/FindLLVM.cmake +++ b

[Beignet] [PATCH] fix typo when check local size with work dim

2014-04-24 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- src/cl_command_queue_gen7.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cl_command_queue_gen7.c b/src/cl_command_queue_gen7.c index 891d6f1..90e275d 100644 --- a/src/cl_command_queue_gen7.c +++ b/src

Re: [Beignet] [PATCH] fix typo when check local size with work dim

2014-04-24 Thread Guo, Yejun
oh, yes, you are right, it is not a typo. Thanks Yejun -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Friday, April 25, 2014 11:35 AM To: Guo, Yejun; beignet@lists.freedesktop.org Subject: RE: [Beignet] [PATCH] fix typo when check local size with work

[Beignet] [PATCH] do not serialize zero image/sampler info into binary

2014-05-06 Thread Guo Yejun
if there is no image/sampler used in kernel source, it is not necessary to serialize the zero image/sampler info into kernel binary. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/program.cpp | 8 ++-- backend/src/backend/program.hpp | 4 ++-- backend/src/ir/image.hpp

[Beignet] [PATCH] separate runtime(libcl.so) and compiler(libgbe.so)

2014-05-09 Thread Guo Yejun
other .cpp files; to make libinterp.a small (the purpose to make libcl.so small), the macro GBE_COMPILER_AVAILABLE is used to make only the needed code active when build for libinterp.a. Signed-off-by: Guo Yejun yejun@intel.com --- backend/CMakeLists.txt | 3 ++ backend/src

[Beignet] [PATCH] correct L3 cache settings for baytrail

2014-05-22 Thread Guo Yejun
baytrail and ivb have different register bits layout for L3 cache, so, add a special path for baytrail. Signed-off-by: Guo Yejun yejun@intel.com --- src/intel/intel_gpgpu.c | 37 ++--- 1 file changed, 34 insertions(+), 3 deletions(-) diff --git a/src/intel

Re: [Beignet] [PATCH] correct L3 cache settings for baytrail

2014-05-22 Thread Guo, Yejun
, Ruiling Sent: Thursday, May 22, 2014 4:27 PM To: Guo, Yejun; beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: RE: [Beignet] [PATCH] correct L3 cache settings for baytrail --- a/src/intel/intel_gpgpu.c +++ b/src/intel/intel_gpgpu.c @@ -309,14 +309,14 @@ static const uint32_t gpgpu_l3_config_reg1

Re: [Beignet] [PATCH] correct L3 cache settings for baytrail

2014-05-22 Thread Guo, Yejun
, May 22, 2014 5:06 PM To: Guo, Yejun; Song, Ruiling; beignet@lists.freedesktop.org Subject: RE: [Beignet] [PATCH] correct L3 cache settings for baytrail I agree with both of you. The gpgpu_l3_config_reg magic number array is really too confusing to understand. If those registers' definition

[Beignet] [PATCH V2] correct L3 cache settings for baytrail

2014-05-22 Thread Guo Yejun
baytrail and ivb have different register bits layout for L3 cache, so, add a special path for baytrail. Signed-off-by: Guo Yejun yejun@intel.com --- src/intel/intel_gpgpu.c | 30 +- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/src/intel

Re: [Beignet] [PATCH] separate runtime(libcl.so) and compiler(libgbe.so)

2014-05-23 Thread Guo, Yejun
Ping for review, thanks. Thanks Yejun -Original Message- From: Guo, Yejun Sent: Friday, May 09, 2014 7:17 AM To: beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: [PATCH] separate runtime(libcl.so) and compiler(libgbe.so) On embedded/handheld devices, storage and memory

Re: [Beignet] [PATCH V2] gbe_bin_generater: fix two bugs.

2014-05-26 Thread Guo, Yejun
one comment: iostream is already included at line 29 in gbe_bin_generater.cpp. Others are ok. Thanks Yejun -Original Message- From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhigang Gong Sent: Friday, May 23, 2014 7:05 PM To: beignet@lists.freedesktop.org

Re: [Beignet] [PATCH] GBE: fix baytrail L3 cache configuration.

2014-05-27 Thread Guo, Yejun
LGTM, thanks. Thanks Yejun -Original Message- From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhigang Gong Sent: Tuesday, May 27, 2014 4:23 PM To: beignet@lists.freedesktop.org Cc: Zhigang Gong Subject: [Beignet] [PATCH] GBE: fix baytrail L3 cache

[Beignet] [PATCH] extract libgbeinterp.so from runtime (libcl.so)

2014-05-30 Thread Guo Yejun
RTLD_DEEPBIND, then dlopen libgbeinterp.so with RTLD_DEEPBIND, to fix the std::cerr crash issue. Signed-off-by: Guo Yejun yejun@intel.com --- backend/CMakeLists.txt | 2 + backend/src/CMakeLists.txt | 5 +- backend/src/GBEConfig.h.in | 1 + src/CMakeLists.txt | 1 - src

Re: [Beignet] [PATCH V2 1/3] add [opencl-1.2] API clCompileProgram.

2014-06-05 Thread Guo, Yejun
I just reviewed the API (compile and link) in src/cl_program.c, we need to check if compiler is supported on the platform, please reference code in function cl_program_build if (!CompilerSupported()) { err = CL_COMPILER/LINKER_NOT_AVAILABLE; goto error; } Btw, we'd better

[Beignet] [PATCH] remove RTLD_DEEPBIND to avoid stdc++ issues

2014-06-11 Thread Guo Yejun
-by: Guo Yejun yejun@intel.com --- src/cl_command_queue.c | 16 ++-- src/cl_command_queue_gen7.c | 60 ++--- src/cl_device_id.c | 2 +- src/cl_gbe_loader.cpp | 206 ++-- src/cl_gbe_loader.h | 37 +++- src/cl_kernel.c

Re: [Beignet] [PATCH] relax the build dependency on Gen GPU

2014-06-12 Thread Guo, Yejun
Ping for review, thanks. Thanks Yejun -Original Message- From: Guo, Yejun Sent: Friday, June 06, 2014 3:00 PM To: beignet@lists.freedesktop.org Subject: RE: [PATCH] relax the build dependency on Gen GPU Ping for review, thanks. Thanks Yejun -Original Message- From: Guo

[Beignet] [PATCH] clean code to remove gbe_kernel_set_const_buffer_size

2014-06-12 Thread Guo Yejun
this function is no longer needed. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/context.cpp | 21 - backend/src/backend/context.hpp | 2 -- backend/src/backend/program.cpp | 8 backend/src/backend/program.h | 4 backend/src/backend

[Beignet] [PATCH] use LLVM_INSTALL_DIR as the path to clang/llvm-as/llvm-link

2014-06-12 Thread Guo Yejun
with the help of LLVM_INSTALL_DIR. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/CMakeLists.txt | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/backend/src/CMakeLists.txt b/backend/src/CMakeLists.txt index 6090174..1f2822d 100644 --- a/backend/src/CMakeLists.txt

[Beignet] [PATCH] add how to for cross compiler

2014-06-18 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- docs/howto/cross-compiler-howto.mdwn | 60 1 file changed, 60 insertions(+) create mode 100644 docs/howto/cross-compiler-howto.mdwn diff --git a/docs/howto/cross-compiler-howto.mdwn b/docs/howto/cross

[Beignet] [PATCH] fix crash when OCL_STRICT_CONFORMANCE is unset

2014-06-23 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- utests/utest_generator.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/utests/utest_generator.py b/utests/utest_generator.py index 8d2f2a8..dbce45b 100644 --- a/utests/utest_generator.py +++ b/utests/utest_generator.py

[Beignet] [PATCH] add BEIGNET_INSTALL_DIR to clean code

2014-06-24 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- CMakeLists.txt | 3 +++ backend/src/CMakeLists.txt | 29 + src/CMakeLists.txt | 2 +- 3 files changed, 17 insertions(+), 17 deletions(-) diff --git a/CMakeLists.txt b/CMakeLists.txt index 6c2d2a6

[Beignet] [PATCH] free build_log when the cl program is released

2014-07-17 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- src/cl_program.c | 5 + 1 file changed, 5 insertions(+) diff --git a/src/cl_program.c b/src/cl_program.c index 3fc2212..7d33f2a 100644 --- a/src/cl_program.c +++ b/src/cl_program.c @@ -76,6 +76,11 @@ cl_program_delete(cl_program p) p

[Beignet] [PATCH] clean llvm resource in compiler (libgbe.so)

2014-07-18 Thread Guo Yejun
since we have separated the compiler (libgbe.so) and the interpreter (libgbeinterp.so), the LLVM resource cleanup task should be done in the compiler instead of the GenProgram::~GenProgram which has no way to clean llvm resources in libgbeinterp.so Signed-off-by: Guo Yejun yejun@intel.com

[Beignet] [PATCH] correct the package dependency from REQUIRED to QUIET

2014-07-21 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- CMakeLists.txt | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/CMakeLists.txt b/CMakeLists.txt index 969c9de..95948a5 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -102,7 +102,7 @@ ELSE(DRM_INTEL_FOUND) ENDIF

[Beignet] [PATCH] enable CL_RG and CL_RA format for cl image

2014-07-21 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- src/cl_image.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/cl_image.c b/src/cl_image.c index ced9789..49b682d 100644 --- a/src/cl_image.c +++ b/src/cl_image.c @@ -134,7 +134,6 @@ cl_image_get_intel_format(const

[Beignet] [PATCH] fix three memory leaks

2014-07-24 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/llvm/llvm_printf_parser.cpp | 3 ++- backend/src/llvm/llvm_to_gen.cpp| 1 + src/cl_command_queue.c | 2 ++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/backend/src/llvm/llvm_printf_parser.cpp b

Re: [Beignet] [PATCH] correct the package dependency from REQUIRED to QUIET

2014-08-28 Thread Guo, Yejun
Thanks, will refine the patch, verify it and send out again -Original Message- From: Yang, Rong R Sent: Friday, August 29, 2014 11:23 AM To: Guo, Yejun; beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: RE: [Beignet] [PATCH] correct the package dependency from REQUIRED to QUIET Do

[Beignet] [PATCH] remove dependency for non-X runtime environment

2014-08-29 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- CMakeLists.txt | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/CMakeLists.txt b/CMakeLists.txt index 82dbc8d..3703a22 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -94,6 +94,10 @@ ELSE(DRM_INTEL_FOUND) MESSAGE

[Beignet] [PATCH] fix issue to create cl image from libva with non-zero offset

2014-09-25 Thread Guo Yejun
until we fill the surface state. Signed-off-by: Guo Yejun yejun@intel.com --- src/cl_driver.h | 2 +- src/cl_mem.c | 2 +- src/intel/intel_driver.c | 4 +--- src/intel/intel_gpgpu.c | 8 4 files changed, 7 insertions(+), 9 deletions(-) diff --git a/src

Re: [Beignet] [PATCH] fix issue to create cl image from libva with non-zero offset

2014-09-25 Thread Guo, Yejun
No problem, I'll add into utest in another patch. -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Thursday, September 25, 2014 2:55 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH] fix issue to create cl image from libva

Re: [Beignet] [PATCH] Make use of write enable flag for mem bo map

2014-10-23 Thread Guo, Yejun
Yes, simpler. -Original Message- From: Zhenyu Wang [mailto:zhen...@linux.intel.com] Sent: Thursday, October 23, 2014 2:29 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH] Make use of write enable flag for mem bo map On 2014.10.23 06:23:58 +, Guo

Re: [Beignet] [PATCH v2 1/5] Make use of write enable flag for mem bo map

2014-10-23 Thread Guo, Yejun
LGTM, thanks. -Original Message- From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhenyu Wang Sent: Thursday, October 23, 2014 3:19 PM To: beignet@lists.freedesktop.org Subject: [Beignet] [PATCH v2 1/5] Make use of write enable flag for mem bo map Use drm/intel

Re: [Beignet] [PATCH v2 5/5] Fix AUX buffer for page alignment

2014-10-23 Thread Guo, Yejun
LGTM, thanks. -Original Message- From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhenyu Wang Sent: Thursday, October 23, 2014 3:19 PM To: beignet@lists.freedesktop.org Subject: [Beignet] [PATCH v2 5/5] Fix AUX buffer for page alignment Apply ALIGN() for aux

Re: [Beignet] [PATCH] Fix AUX buffer for really page aligned

2014-10-24 Thread Guo, Yejun
three comments in line, thanks. -Original Message- From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhenyu Wang Sent: Wednesday, October 22, 2014 4:11 PM To: beignet@lists.freedesktop.org Subject: [Beignet] [PATCH] Fix AUX buffer for really page aligned Apply

[Beignet] [PATCH 1/2] support CL_MEM_USE_HOST_PTR with userptr for cl buffer

2014-11-02 Thread Guo Yejun
some code clean. Signed-off-by: Guo Yejun yejun@intel.com --- CMakeLists.txt | 11 +-- src/CMakeLists.txt | 5 + src/cl_api.c | 10 +++--- src/cl_driver.h | 3 +++ src/cl_driver_defs.c | 1 + src/cl_enqueue.c | 19

[Beignet] [PATCH] clean code, the logic is already at the beginning of function

2014-11-02 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- src/cl_mem.c | 16 1 file changed, 16 deletions(-) diff --git a/src/cl_mem.c b/src/cl_mem.c index 6162f7d..99784a8 100644 --- a/src/cl_mem.c +++ b/src/cl_mem.c @@ -378,22 +378,6 @@ cl_mem_new_buffer(cl_context ctx, goto error

Re: [Beignet] [PATCH 1/2] support CL_MEM_USE_HOST_PTR with userptr for cl buffer

2014-11-02 Thread Guo, Yejun
Yes, you are right, will modify accordingly. And also do more tests. -Original Message- From: Zhenyu Wang [mailto:zhen...@linux.intel.com] Sent: Monday, November 03, 2014 11:33 AM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH 1/2] support

[Beignet] [PATCH 3/3] add test for cl buffer created with CL_MEM_USE_HOST_PTR

2014-11-05 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- kernels/runtime_use_host_ptr_buffer.cl | 6 ++ utests/CMakeLists.txt | 6 ++ utests/runtime_use_host_ptr_buffer.cpp | 36 ++ 3 files changed, 48 insertions(+) create mode 100644 kernels

Re: [Beignet] [PATCH V2 1/3] support CL_MEM_USE_HOST_PTR with userptr for cl buffer

2014-11-06 Thread Guo, Yejun
drm_intel_bo_alloc_userptr to allocate bo, report CL_MEM_OBJECT_ALLOCATION_FAILURE if the allocated bo is NULL. - otherwise use the old method. -Original Message- From: Zhenyu Wang [mailto:zhen...@linux.intel.com] Sent: Friday, November 07, 2014 1:43 PM To: Zhigang Gong Cc: Guo, Yejun

[Beignet] [PATCH V2 2/3] enable CL_DEVICE_HOST_UNIFIED_MEMORY when userptr is supported

2014-11-07 Thread Guo Yejun
userptr is first checked at compile time against the libdrm version, but that does not ensure the system has such capability (for example, with an old Linux kernel), so also check at run time for the device info. V2: add runtime check to see if userptr is really supported Signed-off-by: Guo Yejun

[Beignet] [PATCH 3/3] add test for cl buffer created with CL_MEM_USE_HOST_PTR

2014-11-07 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- kernels/runtime_use_host_ptr_buffer.cl | 6 ++ utests/CMakeLists.txt | 6 ++ utests/runtime_use_host_ptr_buffer.cpp | 36 ++ 3 files changed, 48 insertions(+) create mode 100644 kernels

Re: [Beignet] [PATCH] fix issue to create cl image from libva with non-zero offset

2014-11-09 Thread Guo, Yejun
Hi Zhigang, The patch to add a utest for this fix is just sent, thanks. -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Thursday, September 25, 2014 2:55 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH] fix issue

[Beignet] [PATCH 1/2] use posix_memalign instead of aligned_alloc to be more compatible

2014-11-09 Thread Guo Yejun
On some systems, function aligned_alloc is not supported. From Linux Programmer's Manual: The function aligned_alloc() was added to glibc in version 2.16. The function posix_memalign() is available since glibc 2.1.91. Signed-off-by: Guo Yejun yejun@intel.com --- src/cl_device_id.c | 3 ++- 1

[Beignet] [PATCH 2/2] use posix_memalign instead of aligned_alloc to be more compatible

2014-11-09 Thread Guo Yejun
On some systems, function aligned_alloc is not supported. From Linux Programmer's Manual: The function aligned_alloc() was added to glibc in version 2.16. The function posix_memalign() is available since glibc 2.1.91. Signed-off-by: Guo Yejun yejun@intel.com --- utests

[Beignet] [PATCH V2] use posix_memalign instead of aligned_alloc to be more compatible

2014-11-10 Thread Guo Yejun
On some systems, function aligned_alloc is not supported. From Linux Programmer's Manual: The function aligned_alloc() was added to glibc in version 2.16. The function posix_memalign() is available since glibc 2.1.91. V2: add check for return value of posix_memalign Signed-off-by: Guo Yejun yejun
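
A minimal sketch of the substitution described in this patch series, including the return-value check mentioned for V2. The wrapper name alloc_aligned is hypothetical and not taken from the Beignet sources; only posix_memalign() itself is the real POSIX call. Unlike aligned_alloc(), posix_memalign() reports failure through its return value (0 on success, an error number otherwise) rather than through a NULL result.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign under strict -std= modes */
#endif
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns memory aligned to `alignment` bytes, or NULL
 * on failure. `alignment` must be a power of two and a multiple of
 * sizeof(void *). Release the memory with free(). */
static void *alloc_aligned(size_t alignment, size_t size)
{
    void *ptr = NULL;

    /* posix_memalign returns an error number instead of setting errno. */
    if (posix_memalign(&ptr, alignment, size) != 0)
        return NULL;
    return ptr;
}
```

For example, alloc_aligned(4096, 100) returns a 4096-byte-aligned block on success, while an invalid alignment such as 3 is rejected and yields NULL.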

[Beignet] [PATCH] re-enable userptr with fix: CPU access after GPU finishes the rendering

2014-11-18 Thread Guo Yejun
1. the wait logic is integrated into function cl_mem_map/unmap_auto 2. use cl_mem_map/unmap_auto for userptr inside clEnqueueRead/WriteBuffer 3. do not use cl_buffer_subdata for userptr, use cl_mem_map/memcpy instead Signed-off-by: Guo Yejun yejun@intel.com --- CMakeLists.txt | 13

[Beignet] [PATCH 2/2] add test for clCreateImageFromLibvaIntel

2014-11-20 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- kernels/runtime_climage_from_boname.cl | 8 ++ utests/CMakeLists.txt | 11 +- utests/runtime_climage_from_boname.cpp | 208 + 3 files changed, 226 insertions(+), 1 deletion(-) create mode 100644

Re: [Beignet] [PATCH] clean code, the logic is already at the beginning of function

2014-11-27 Thread Guo, Yejun
Ping for review, thanks. -Original Message- From: Guo, Yejun Sent: Monday, November 03, 2014 10:42 AM To: beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: [PATCH] clean code, the logic is already at the beginning of function Signed-off-by: Guo Yejun yejun@intel.com --- src

[Beignet] [PATCH] refine utest of cl_mem_use_host_ptr

2014-11-27 Thread Guo Yejun
From application perspective, userptr is transparent. App does not need to know if userptr is enabled or not, just invokes standard OpenCL APIs. Signed-off-by: Guo Yejun yejun@intel.com --- utests/CMakeLists.txt | 5 - utests/runtime_use_host_ptr_buffer.cpp | 8

[Beignet] [PATCH] fix issue to pass utest of runtime_climage_from_boname for BDW

2014-11-27 Thread Guo Yejun
To create cl image from bo name with offset, the offset needs to be added into surface_base_addr_lo/hi. Signed-off-by: Guo Yejun yejun@intel.com --- src/intel/intel_gpgpu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/intel/intel_gpgpu.c b/src/intel

[Beignet] [PATCH 1/2] enable CL_MEM_ALLOC_HOST_PTR with user_ptr to avoid copy between GPU/CPU

2014-12-01 Thread Guo Yejun
, mem_host_flags Signed-off-by: Guo Yejun yejun@intel.com --- src/cl_device_id.c | 8 src/cl_mem.c | 37 +++-- src/cl_mem.h | 4 ++-- 3 files changed, 33 insertions(+), 16 deletions(-) diff --git a/src/cl_device_id.c b/src/cl_device_id.c index

Re: [Beignet] CL_MEM_USE_HOST_PTR involve extra copy?

2014-12-02 Thread Guo, Yejun
Hi, please check the latest code of beignet, there is no copy needed between CPU and GPU if the host_ptr provided by the application is page aligned, and the page-alignment limitation is expected to be removed in a few days. You can also try CL_MEM_ALLOC_HOST_PTR to avoid the extra copy without
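
To make the page-alignment condition in this reply concrete, here is a small sketch of how an application could test a host pointer before passing it with CL_MEM_USE_HOST_PTR. The helper is_page_aligned is hypothetical and says nothing about how Beignet itself performs the check; it only uses the standard sysconf(_SC_PAGESIZE) query.

```c
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: true if `ptr` starts on a page boundary of the
 * running system. */
static bool is_page_aligned(const void *ptr)
{
    long page = sysconf(_SC_PAGESIZE);   /* page size in bytes */

    if (page <= 0)
        return false;                    /* page size unavailable */
    return ((uintptr_t)ptr % (uintptr_t)page) == 0;
}
```

A buffer obtained from posix_memalign() with the page size as alignment passes this test, whereas a pointer one byte into that buffer does not.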

Re: [Beignet] CL_MEM_USE_HOST_PTR involve extra copy?

2014-12-02 Thread Guo, Yejun
/utests. From: spring_wind [mailto:spring_w...@yeah.net] Sent: Wednesday, December 03, 2014 10:02 AM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re:RE: [Beignet] CL_MEM_USE_HOST_PTR involve extra copy? You mean if I use CL_MEM_USE_HOST_PTR and host ptr is page aligned, I change the data

Re: [Beignet] [PATCH] remove the page align limitation for host_ptr of CL_MEM_USE_HOST_PTR

2014-12-16 Thread Guo, Yejun
Ok, will modify accordingly. -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Tuesday, December 16, 2014 3:04 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH] remove the page align limitation for host_ptr

[Beignet] [PATCH 2/2] test the case that user ptr is not page aligned

2014-12-22 Thread Guo Yejun
the previous code only tests a user ptr that is page aligned; remove this limitation together with the driver. Signed-off-by: Guo Yejun yejun@intel.com --- benchmark/benchmark_use_host_ptr_buffer.cpp | 12 +--- utests/runtime_use_host_ptr_buffer.cpp | 14 ++ 2 files

Re: [Beignet] [PATCH V2 1/2] remove the page align limitation for host_ptr of CL_MEM_USE_HOST_PTR

2014-12-23 Thread Guo, Yejun
Thanks Zhigang, yes, I reproduced the failure in OpenCV, I'll take a look at it. -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Tuesday, December 23, 2014 5:11 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH V2 1/2

[Beignet] [PATCH] replace hash_map with map

2014-12-24 Thread Guo Yejun
there is no strong evidence that hash_map gives better performance for beignet, and hash_map requires std::hash, which is not supported in some old g++ versions, so replace hash_map with map. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/CMakeLists.txt| 1

Re: [Beignet] [PATCH] replace hash_map with map

2014-12-25 Thread Guo, Yejun
Of Andi Kleen Sent: Friday, December 26, 2014 1:17 PM To: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH] replace hash_map with map Guo Yejun yejun@intel.com writes: there is no strong evidence to show hash_map makes better performance for beignet, since hash_map requires std

Re: [Beignet] [PATCH] replace hash_map with map

2014-12-25 Thread Guo, Yejun
: Friday, December 26, 2014 2:13 PM To: Andi Kleen Cc: Guo, Yejun; beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH] replace hash_map with map On Fri, Dec 26, 2014 at 06:55:31AM +0100, Andi Kleen wrote: On Fri, Dec 26, 2014 at 05:26:19AM +, Guo, Yejun wrote: Hash_map is defined

Re: [Beignet] [PATCH] replace hash_map with map

2014-12-26 Thread Guo, Yejun
Not yet, there is still something that needs to be done, stay tuned, ☺ From: spring_wind [mailto:18969076...@yeah.net] Sent: Friday, December 26, 2014 8:55 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re:[Beignet] [PATCH] replace hash_map with map Great, is it enough to let

[Beignet] [PATCH 2/3] change Immediate::operator= from private to public

2014-12-28 Thread Guo Yejun
ImmediateIndex index(this->immediateNum()); this->immediates.push_back(imm); return index; } Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/ir/immediate.hpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/backend/src/ir/immediate.hpp b/backend/src/ir

[Beignet] [PATCH 3/3] do not use C++11 features inside libgbeinterp

2014-12-28 Thread Guo Yejun
(the OpenCL compiler) which depends on LLVM/clang Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/program.cpp | 17 --- backend/src/backend/program.hpp | 6 +-- backend/src/ir/constant.cpp | 3 +- backend/src/ir/constant.hpp | 5 ++- backend/src/ir/function.hpp

[Beignet] [PATCH] fix utest build for some old gcc version

2014-12-30 Thread Guo Yejun
change the keyword from constexpr to const, update the code for explicit type conversion and std::map's iterator. Signed-off-by: Guo Yejun yejun@intel.com --- utests/compiler_displacement_map_element.cpp | 4 +-- utests/compiler_saturate.cpp | 2 +- utests

[Beignet] [PATCH 1/3] add option BUILD_STATIC_GBE_COMPILER to build static compiler

2014-12-30 Thread Guo Yejun
The offline compiler (gbe_bin_generater), depending on LLVM/clang, could only be built with C++11 features. To make it workable within old c/c++ version environment, add one CMAKE option to link against all static libraries. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src

[Beignet] [PATCH 2/3] support CL_EMBEDDED_PROFILE with offline compiler

2014-12-30 Thread Guo Yejun
/your_path_to_compiler looks like: beignet.bc beignet.pch gbe_bin_generater include 3. Within old environment, build with STATIC_GBE_COMPILER_PATH=/your_path_to_compiler libcl.so and libgbeinterp.so will be built here, libgbe.so and gbe_bin_generater will not be built here. Signed-off-by: Guo Yejun yejun

[Beignet] [PATCH 3/3] add CMAKE option ENABLE_RTTI for some stl version

2014-12-30 Thread Guo Yejun
with some STL versions the build fails under -fno-rtti, so RTTI has to be enabled. Signed-off-by: Guo Yejun yejun@intel.com --- CMakeLists.txt | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/CMakeLists.txt b/CMakeLists.txt index 5cb31c2..9b9c9ea 100644 --- a/CMakeLists.txt

Re: [Beignet] version 4 of libva buffer sharing patchset

2015-01-25 Thread Guo, Yejun
if the condition is not satisfied, maybe we need assert here. From: Weng, Chuanbo Sent: Friday, January 23, 2015 6:23 PM To: beignet@lists.freedesktop.org; Guo, Yejun Subject: version 4 of libva buffer sharing patchset Hi all, Please fetch the 4th version of patchset from github using the following

Re: [Beignet] version 4 of libva buffer sharing patchset

2015-02-01 Thread Guo, Yejun
The patch LGTM, thanks. From: Weng, Chuanbo Sent: Monday, February 02, 2015 3:06 PM To: Guo, Yejun; beignet@lists.freedesktop.org Subject: RE: version 4 of libva buffer sharing patchset Removed and updated the patchset. From: Guo, Yejun Sent: Monday, February 02, 2015 14:11 To: Weng, Chuanbo

Re: [Beignet] version 4 of libva buffer sharing patchset

2015-02-01 Thread Guo, Yejun
Looks fine except one comment: OUTPUT_NV12_DEFAULT is not really used, we can just remove it. From: Weng, Chuanbo Sent: Thursday, January 29, 2015 11:33 AM To: Guo, Yejun; beignet@lists.freedesktop.org Subject: RE: version 4 of libva buffer sharing patchset Hi Yejun, Thanks

[Beignet] [PATCH V2 2/4] add CMake option USE_STANDALONE_GBE_COMPILER and STANDALONE_GBE_COMPILER_DIR

2015-01-08 Thread Guo Yejun
, and build driver with it. Signed-off-by: Guo Yejun yejun@intel.com --- CMakeLists.txt | 43 +++ GetGenID.sh| 26 ++ backend/CMakeLists.txt | 20 +--- backend/src/CMakeLists.txt | 20

[Beignet] [PATCH 3/4] only build tests that do not need compiler when standalone compiler is provided

2015-01-08 Thread Guo Yejun
in the running system. btw, please make sure compiler_ceil.bin is really updated if there is already one there, the safe way is to delete it first. Signed-off-by: Guo Yejun yejun@intel.com --- CMakeLists.txt| 4 utests/CMakeLists.txt | 27 +++ 2 files

[Beignet] [PATCH V2 1/4] add option BUILD_STANDALONE_GBE_COMPILER to build static compiler

2015-01-08 Thread Guo Yejun
: change the option name to BUILD_STANDALONE_GBE_COMPILER. zip necessary files into a tar ball. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/CMakeLists.txt | 39 +-- 1 file changed, 29 insertions(+), 10 deletions(-) diff --git a/backend/src

Re: [Beignet] [PATCH 2/3] support CL_EMBEDDED_PROFILE with offline compiler

2015-01-06 Thread Guo, Yejun
, and the document, do not object your comments, and will modify accordingly. -Original Message- From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of Zhigang Gong Sent: Wednesday, January 07, 2015 8:31 AM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re

Re: [Beignet] [PATCH 2/3] support CL_EMBEDDED_PROFILE with offline compiler

2015-01-06 Thread Guo, Yejun
: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Wednesday, January 07, 2015 11:11 AM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: RE: [Beignet] [PATCH 2/3] support CL_EMBEDDED_PROFILE with offline compiler -Original Message- From: Beignet [mailto:beignet-boun

Re: [Beignet] [PATCH 2/3] support CL_EMBEDDED_PROFILE with offline compiler

2015-01-06 Thread Guo, Yejun
Yes, I expected to compile with -t option. Will also refine the gbe_bin_generator usage information. -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Wednesday, January 07, 2015 10:40 AM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re

Re: [Beignet] [PATCH 3/3] add CMAKE option ENABLE_RTTI for some stl version

2015-01-06 Thread Guo, Yejun
07, 2015 10:26 AM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH 3/3] add CMAKE option ENABLE_RTTI for some stl version Could you find out the specified STL version? And do some checking at configuration time and remove or add fno-rtti according to the checking

Re: [Beignet] [PATCH 3/3] add CMAKE option ENABLE_RTTI for some stl version

2015-01-06 Thread Guo, Yejun
Ok, it is also a practical method. -Original Message- From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com] Sent: Wednesday, January 07, 2015 12:56 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH 3/3] add CMAKE option ENABLE_RTTI for some stl

[Beignet] [PATCH] refine gbe_bin_generater usage to add -t option

2015-01-06 Thread Guo Yejun
-t option specifies the gen target pci id, it tells gbe_bin_generater the target platform that it compiles for. The compile result is llvm level binary if this option is not given. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/gbe_bin_generater.cpp | 2 +- 1 file changed, 1

Re: [Beignet] [PATCH 3/3] add CMAKE option ENABLE_RTTI for some stl version

2015-01-06 Thread Guo, Yejun
[mailto:zhigang.g...@linux.intel.com] Sent: Wednesday, January 07, 2015 12:13 PM To: Guo, Yejun Cc: beignet@lists.freedesktop.org Subject: Re: [Beignet] [PATCH 3/3] add CMAKE option ENABLE_RTTI for some stl version Can we just remove -fno-rtti? Did you try it? On Wed, Jan 07, 2015 at 04:33:16AM

[Beignet] [PATCH] remove useless dependency libocl

2015-01-07 Thread Guo Yejun
libocl is the name of sub directory, the project name in the sub directory, it is not something that others can depend on. Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/CMakeLists.txt | 2 -- 1 file changed, 2 deletions(-) diff --git a/backend/src/CMakeLists.txt b/backend/src

[Beignet] [PATCH V3] add CMake option USE_STANDALONE_GBE_COMPILER and STANDALONE_GBE_COMPILER_DIR

2015-01-09 Thread Guo Yejun
, and build driver with it. v3: add file FindStandaloneGbeCompiler.cmake to make the main cmakefile clean. Signed-off-by: Guo Yejun yejun@intel.com --- CMake/FindStandaloneGbeCompiler.cmake | 35 +++ CMakeLists.txt| 15

[Beignet] [PATCH] correct env var to output llvm IR

2015-03-16 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- docs/Beignet/Backend.mdwn | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/docs/Beignet/Backend.mdwn b/docs/Beignet/Backend.mdwn index e4259fb..cf80318 100644 --- a/docs/Beignet/Backend.mdwn +++ b/docs/Beignet

Re: [Beignet] [patch v2] strip PointerCast for call instructions before use.

2015-03-16 Thread Guo, Yejun
LGTM, thanks. -Original Message- From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of xionghu@intel.com Sent: Tuesday, March 17, 2015 1:26 PM To: beignet@lists.freedesktop.org Cc: Luo, Xionghu Subject: [Beignet] [patch v2] strip PointerCast for call instructions

[Beignet] [PATCH 1/2] add 3 simd level built-in functions: shuffle, simdsize and simdid

2015-03-19 Thread Guo Yejun
); the value of x of the c-th channel of the SIMD is returned, for all SIMD channels, the behavior is undefined if c is larger than simdsize - 1 Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/gen8_context.cpp | 29 - backend/src/backend/gen_context.cpp
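
The shuffle semantics described in this excerpt can be modelled on the host as a rough sketch. This is not Beignet code: the names `simd_shuffle` and `simd_ids` are hypothetical, each list element stands for one SIMD channel, and the model raises an error where the patch says the behavior is undefined.

```python
from typing import List, Sequence


def simd_ids(simd_size: int) -> List[int]:
    """Per-channel ids, ranging from 0 to simd_size - 1 (hypothetical helper)."""
    return list(range(simd_size))


def simd_shuffle(x: Sequence[int], c: Sequence[int]) -> List[int]:
    """Host-side model of a SIMD shuffle.

    x[i] is the value held by channel i, c[i] is the source channel that
    channel i asks for. Channel i receives x[c[i]].
    """
    simd_size = len(x)
    if len(c) != simd_size:
        raise ValueError("x and c must have one entry per SIMD channel")

    result = []
    for lane in range(simd_size):
        src = c[lane]
        # The patch leaves c > simdsize - 1 undefined; this model rejects it
        # instead of guessing what the hardware would return.
        if not 0 <= src < simd_size:
            raise ValueError(f"channel {lane} requested invalid source {src}")
        result.append(x[src])
    return result


if __name__ == "__main__":
    values = [10, 20, 30, 40, 50, 60, 70, 80]      # SIMD8 example data
    reverse = [7 - i for i in simd_ids(8)]          # each channel reads its mirror
    print(simd_shuffle(values, reverse))
    print(simd_shuffle(values, [0] * 8))            # broadcast channel 0
```

Passing the same index in every channel gives a broadcast, and passing the channel ids themselves returns `x` unchanged.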

Re: [Beignet] [PATCH 1/2] add 3 simd level built-in functions: shuffle, simdsize and simdid

2015-03-27 Thread Guo, Yejun
Ask for review, thanks. yejun -Original Message- From: Guo, Yejun Sent: Friday, March 20, 2015 1:58 PM To: beignet@lists.freedesktop.org Cc: Guo, Yejun Subject: [PATCH 1/2] add 3 simd level built-in functions: shuffle, simdsize and simdid uint __gen_ocl_get_simd_size(); returns 8

Re: [Beignet] [PATCH V2 1/2] add simd level function __gen_ocl_get_simd_id

2015-04-20 Thread Guo, Yejun
Please ignore this patch set because just found a performance issue caused by the change in liveness.cpp -Original Message- From: Guo, Yejun Sent: Monday, April 20, 2015 9:28 AM To: Yang, Rong R; beignet@lists.freedesktop.org Subject: RE: [Beignet] [PATCH V2 1/2] add simd level

[Beignet] [PATCH V3 1/2] add simd level function __gen_ocl_get_simd_id

2015-04-20 Thread Guo Yejun
uint __gen_ocl_get_simd_id(); return value ranges from 0 to simdsize - 1 V2: use function sel.selReg to refine code V3: correct the uniform condition in liveness.cpp Signed-off-by: Guo Yejun yejun@intel.com --- backend/src/backend/gen_context.cpp| 9 - backend/src/backend

[Beignet] [PATCH 2/2] add utest for __gen_ocl_get_simd_id

2015-04-20 Thread Guo Yejun
Signed-off-by: Guo Yejun yejun@intel.com --- kernels/compiler_get_simd_id.cl | 8 utests/CMakeLists.txt | 3 ++- utests/compiler_get_simd_id.cpp | 33 + 3 files changed, 43 insertions(+), 1 deletion(-) create mode 100644 kernels
