Re: [Beignet] [PATCH] create GIT_SHA1 without any dependency

2014-10-24 Thread Zhigang Gong
LGTM, pushed, thanks.

On Sat, Oct 25, 2014 at 03:10:02AM +0800, Meng Mengmeng wrote:
 ---
  src/CMakeLists.txt | 5 ++---
  src/git_sha1.sh| 4 ++--
  2 files changed, 4 insertions(+), 5 deletions(-)
 
 diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
 index 0d22589..9e65856 100644
 --- a/src/CMakeLists.txt
 +++ b/src/CMakeLists.txt
 @@ -110,11 +110,10 @@ SET(CMAKE_C_FLAGS -DHAS_OCLIcd ${CMAKE_C_FLAGS})
  endif (OCLIcd_FOUND)
  
  set(GIT_SHA1 git_sha1.h)
 -add_custom_command(OUTPUT  ${GIT_SHA1}
 +add_custom_target(${GIT_SHA1} ALL
COMMAND chmod +x ${CMAKE_CURRENT_SOURCE_DIR}/git_sha1.sh
COMMAND ${CMAKE_CURRENT_SOURCE_DIR}/git_sha1.sh 
 ${CMAKE_CURRENT_SOURCE_DIR} ${GIT_SHA1}
 - )
 -add_custom_target(GIT_SHA1 ALL DEPENDS ${GIT_SHA1})
 +)
  
  SET(CMAKE_SHARED_LINKER_FLAGS ${CMAKE_SHARED_LINKER_FLAGS} 
 -Wl,-Bsymbolic,--allow-shlib-undefined)
  
 diff --git a/src/git_sha1.sh b/src/git_sha1.sh
 index 4f6f972..f44f078 100755
 --- a/src/git_sha1.sh
 +++ b/src/git_sha1.sh
 @@ -4,9 +4,9 @@ SOURCE_DIR=$1
  FILE=$2
  
  touch ${SOURCE_DIR}/${FILE}_tmp
 -if test -d $1/../.git; then
 +if test -d ${SOURCE_DIR}/../.git; then
  if which git  /dev/null; then
 -git --git-dir=$1/../.git log -n 1 --oneline | \
 +git --git-dir=${SOURCE_DIR}/../.git log -n 1 --oneline | \
  sed 's/^\([^ ]*\) .*/#define BEIGNET_GIT_SHA1 git-\1/' \
   ${SOURCE_DIR}/${FILE}_tmp
  fi
 -- 
 1.9.3
 
 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [Intel-gfx] Beignet crashes on vanilla 3.17.1 with IVB hardware

2014-10-24 Thread Zhigang Gong
I just checked again, and found both 3.17 and 3.17.1 should work
fine on IVB with beignet. I just tested beignet on IVB with kernel
3.17.1, all unit tests passed successfully. For IVB user, no need
to wait for 3.18.

Don't know which application you were testing on your IVB machine.
If it could be reproduced easily, please open a bug on fd.o.

Thanks,
Zhigang Gong.

On Fri, Oct 24, 2014 at 10:27:23AM +0800, Zhigang Gong wrote:
 Hi,
 
 For IVB, I just checked the 3.18-rc1, it has the following patch:
 commit c9224faa59c3071ecfa2d4b24592f4eb61e57069
 Author: Brad Volkin bradley.d.vol...@intel.com
 Date:   Tue Jun 17 14:10:34 2014 -0700
 
 drm/i915: Add some L3 registers to the parser whitelist
 
 Beignet needs these in order to program the L3 cache config for
 OpenCL workloads, particularly when using SLM.
 
 Signed-off-by: Brad Volkin bradley.d.vol...@intel.com
 Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch
 
 So, beignet should work fine with 3.18 on IVB/BYT.
 
 But for the HSW,I'm not quite sure when we could get a workable vanilla
 kernel. Something I found at the intel-gfx mail list as below and it doesn't
 sound good.
 
 http://lists.freedesktop.org/archives/intel-gfx/2014-May/044694.html
 http://lists.freedesktop.org/archives/intel-gfx/2014-May/045088.html
 
 CC to intel-gfx mail list. Hope we can get an official anwser here.
 
 Thanks,
 Zhigang Gong.
 
 On Thu, Oct 23, 2014 at 03:39:35PM +0300, Vasily Khoruzhick wrote:
  Hi,
  
  As you maybe know, any application which uses beignet OpenCL
  implementation crashes on Ivy Bridge hardware when using vanilla
  3.17.1 kernel.
  I guess it's due to batchbuffer security and patch to disable
  batchbuffer security is required, but guys, it fails since 3.16, and
  3.17 was released quite a while ago.
  
  Could you cooperate with i915 driver devs to make Beignet working on
  vanilla kernel without extra patches?
  
  Thanks!
  
  Regards,
  Vasily
  ___
  Beignet mailing list
  Beignet@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/beignet
 ___
 Intel-gfx mailing list
 intel-...@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/intel-gfx
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [Intel-gfx] Beignet crashes on vanilla 3.17.1 with IVB hardware

2014-10-24 Thread Vasily Khoruzhick
Hi Zhigang,

Luxmark crashes with following backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x004cd8b0 in slg::PathOCLRenderEngine::StopLockLess() ()
(gdb) bt
#0  0x004cd8b0 in slg::PathOCLRenderEngine::StopLockLess() ()
#1  0x00482236 in slg::RenderEngine::Stop() ()
#2  0x0047be9b in slg::RenderSession::~RenderSession() ()
#3  0x00468a77 in LuxMarkApp::Stop() ()
#4  0x00468b46 in LuxMarkApp::InitRendering(LuxMarkAppMode,
char const*) ()
#5  0x0046edbe in MainWindow::event(QEvent*) ()
#6  0x77099b9c in QApplicationPrivate::notify_helper(QObject*,
QEvent*) () from /usr/lib/libQtGui.so.4
#7  0x770a05e8 in QApplication::notify(QObject*, QEvent*) ()
from /usr/lib/libQtGui.so.4
#8  0x76b6af7d in QCoreApplication::notifyInternal(QObject*,
QEvent*) () from /usr/lib/libQtCore.so.4
#9  0x76b6e341 in
QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*)
() from /usr/lib/libQtCore.so.4
#10 0x76b99e63 in ?? () from /usr/lib/libQtCore.so.4
#11 0x74d05a1d in g_main_context_dispatch () from
/usr/lib/libglib-2.0.so.0
#12 0x74d05d08 in ?? () from /usr/lib/libglib-2.0.so.0
#13 0x74d05dbc in g_main_context_iteration () from
/usr/lib/libglib-2.0.so.0
#14 0x76b99fad in
QEventDispatcherGlib::processEvents(QFlagsQEventLoop::ProcessEventsFlag)
() from /usr/lib/libQtCore.so.4
#15 0x7713d9c6 in ?? () from /usr/lib/libQtGui.so.4
#16 0x76b69ad1 in
QEventLoop::processEvents(QFlagsQEventLoop::ProcessEventsFlag) ()
from /usr/lib/libQtCore.so.4
#17 0x76b69e35 in
QEventLoop::exec(QFlagsQEventLoop::ProcessEventsFlag) () from
/usr/lib/libQtCore.so.4
#18 0x76b6f3c7 in QCoreApplication::exec() () from
/usr/lib/libQtCore.so.4
#19 0x0045afc8 in main ()

And convert from imagemagick crashes in libgbe.so:

Program received signal SIGSEGV, Segmentation fault.
0x7fffeb0df18c in gbe::buildRegInfo(gbe::ir::BasicBlock,
gbe::vectorgbe::RegInfoForMov) () from /usr/lib/beignet//libgbe.so
(gdb) bt
#0  0x7fffeb0df18c in gbe::buildRegInfo(gbe::ir::BasicBlock,
gbe::vectorgbe::RegInfoForMov) () from /usr/lib/beignet//libgbe.so
#1  0x7fffeb0e048d in
gbe::GenWriter::removeLOADIs(gbe::ir::Liveness const,
gbe::ir::Function) () from /usr/lib/beignet//libgbe.so
#2  0x7fffeb0fb823 in
gbe::GenWriter::emitFunction(llvm::Function) () from
/usr/lib/beignet//libgbe.so
#3  0x7fffeb100b63 in
gbe::GenWriter::runOnFunction(llvm::Function) () from
/usr/lib/beignet//libgbe.so
#4  0x7fffec0129e1 in
llvm::FPPassManager::runOnFunction(llvm::Function) () from
/usr/lib/beignet//libgbe.so
#5  0x7fffebe11083 in (anonymous
namespace)::CGPassManager::runOnModule(llvm::Module) () from
/usr/lib/beignet//libgbe.so
#6  0x7fffec015538 in
llvm::legacy::PassManagerImpl::run(llvm::Module) () from
/usr/lib/beignet//libgbe.so
#7  0x7fffeb11fe0e in gbe::llvmToGen(gbe::ir::Unit, char const*,
void const*, int, bool) () from /usr/lib/beignet//libgbe.so
#8  0x7fffeb0c4fc1 in gbe::Program::buildFromLLVMFile(char const*,
void const*, std::string, int) () from /usr/lib/beignet//libgbe.so
#9  0x7fffeb17d0ba in gbe::genProgramNewFromLLVM(unsigned int,
char const*, void const*, void const*, unsigned long, char*, unsigned
long*, int) () from /usr/lib/beignet//libgbe.so
#10 0x7fffeb0c8e09 in gbe::programNewFromSource(unsigned int, char
const*, unsigned long, char const*, char*, unsigned long*) () from
/usr/lib/beignet//libgbe.so
#11 0x7fffef0eac96 in cl_program_build () from /usr/lib/beignet/libcl.so
#12 0x7fffef0e39de in clBuildProgram () from /usr/lib/beignet/libcl.so
#13 0x7fffef32c4cb in clBuildProgram () from /usr/lib/libOpenCL.so
#14 0x77a461ec in ?? () from /usr/lib/libMagickCore-6.Q16HDRI.so.2
#15 0x77a4689f in InitOpenCLEnvInternal () from
/usr/lib/libMagickCore-6.Q16HDRI.so.2
#16 0x77a46b83 in ?? () from /usr/lib/libMagickCore-6.Q16HDRI.so.2
#17 0x77a4797a in ?? () from /usr/lib/libMagickCore-6.Q16HDRI.so.2
#18 0x77a4808c in InitOpenCLEnv () from
/usr/lib/libMagickCore-6.Q16HDRI.so.2
#19 0x77941558 in ?? () from /usr/lib/libMagickCore-6.Q16HDRI.so.2
#20 0x77948d3e in AccelerateResizeImage () from
/usr/lib/libMagickCore-6.Q16HDRI.so.2
#21 0x77a95da0 in ResizeImage () from
/usr/lib/libMagickCore-6.Q16HDRI.so.2
#22 0x7768d310 in MogrifyImage () from
/usr/lib/libMagickWand-6.Q16HDRI.so.2
#23 0x77691bc9 in MogrifyImages () from
/usr/lib/libMagickWand-6.Q16HDRI.so.2
#24 0x7761d03b in ConvertImageCommand () from
/usr/lib/libMagickWand-6.Q16HDRI.so.2
#25 0x77686ff7 in MagickCommandGenesis () from
/usr/lib/libMagickWand-6.Q16HDRI.so.2
#26 0x004008a7 in ?? ()
#27 0x7703a040 in __libc_start_main () from /usr/lib/libc.so.6
#28 0x004008fb in ?? ()

Reproducibility is 100%

Regards
Vasily

On Fri, Oct 24, 2014 at 10:05 AM, Zhigang Gong

Re: [Beignet] possilbe bug when run opencv_test_imgproc

2014-10-24 Thread yan . wang
Hi, Zhigang,
  May your platform also get failed cases when run
OCL_ImageProc/Filter2D.Mat* although no crash, because I have only
Baytail T platfrom. I am not sure the other platform has the same issue.
  If yes, I could try to investigate the reason. It may have issue of
filter2D OpenCL implementation in OpenCV.
  Thanks.

Yan Wang


 Hi, All,
I found one possible bug for review.
if run the following:
 ./opencv_test_imgproc --gtest_filter=OCL_ImageProc/Filter2D.Mat*.
OCL_ImageProc/Filter2D.Mat/256 failed and continue. But the whole test
 flow will crash in OCL_ImageProc/Filter2D.Mat/257:
 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/240, where GetParam() = (CV_8U,
 Channels(2), 7, 1, BORDER_CONSTANT, false, false) (6433 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/241
 [   OK ] OCL_ImageProc/Filter2D.Mat/241 (358 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/242
 [   OK ] OCL_ImageProc/Filter2D.Mat/242 (311 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/243
 [   OK ] OCL_ImageProc/Filter2D.Mat/243 (6 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/244
 [   OK ] OCL_ImageProc/Filter2D.Mat/244 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/245
 [   OK ] OCL_ImageProc/Filter2D.Mat/245 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/246
 [   OK ] OCL_ImageProc/Filter2D.Mat/246 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/247
 [   OK ] OCL_ImageProc/Filter2D.Mat/247 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/248
 [   OK ] OCL_ImageProc/Filter2D.Mat/248 (210 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/249
 [   OK ] OCL_ImageProc/Filter2D.Mat/249 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/250
 [   OK ] OCL_ImageProc/Filter2D.Mat/250 (208 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/251
 [   OK ] OCL_ImageProc/Filter2D.Mat/251 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/252
 [   OK ] OCL_ImageProc/Filter2D.Mat/252 (214 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/253
 [   OK ] OCL_ImageProc/Filter2D.Mat/253 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/254
 [   OK ] OCL_ImageProc/Filter2D.Mat/254 (212 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/255
 [   OK ] OCL_ImageProc/Filter2D.Mat/255 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/256
 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:106:
 Failure
 Expected: (TestUtils::checkNorm2(dst, udst)) = (threshold), actual: 255
 vs 1
 Size: [92 x 61]

 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:107:
 Failure
 Expected: (TestUtils::checkNorm2(dst_roi, udst_roi)) = (threshold),
 actual: 255 vs 1
 Size: [92 x 61]

 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/256, where GetParam() = (CV_8U,
 Channels(2), 7, 4, BORDER_CONSTANT, false, false) (7413 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/257
 opencv_test_imgproc: /home/yanwang/beignet/src/intel/intel_gpgpu.c:703:
 intel_gpgpu_check_binded_buf_address: Assertion
 `gpgpu-binded_buf[i]-offset != 0' failed.
 Segmentation fault (core dumped)

   But if I run OCL_ImageProc/Filter2D.Mat/257 only, it passed. I think 256
 case may influence it.
   Thanks.

 Yan Wang


___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] possilbe bug when run opencv_test_imgproc

2014-10-24 Thread Zhigang Gong
I have BYT box, an IVB machine and a HSW notebook. All of them haven't
this issue.
Yang rong is working on a race condition patch. This issue may be
related. You may
try again once he send out the patch.

On Fri, Oct 24, 2014 at 5:46 PM,  yan.w...@linux.intel.com wrote:
 Hi, Zhigang,
   May your platform also get failed cases when run
 OCL_ImageProc/Filter2D.Mat* although no crash, because I have only
 Baytail T platfrom. I am not sure the other platform has the same issue.
   If yes, I could try to investigate the reason. It may have issue of
 filter2D OpenCL implementation in OpenCV.
   Thanks.

 Yan Wang


 Hi, All,
I found one possible bug for review.
if run the following:
 ./opencv_test_imgproc --gtest_filter=OCL_ImageProc/Filter2D.Mat*.
OCL_ImageProc/Filter2D.Mat/256 failed and continue. But the whole test
 flow will crash in OCL_ImageProc/Filter2D.Mat/257:
 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/240, where GetParam() = (CV_8U,
 Channels(2), 7, 1, BORDER_CONSTANT, false, false) (6433 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/241
 [   OK ] OCL_ImageProc/Filter2D.Mat/241 (358 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/242
 [   OK ] OCL_ImageProc/Filter2D.Mat/242 (311 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/243
 [   OK ] OCL_ImageProc/Filter2D.Mat/243 (6 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/244
 [   OK ] OCL_ImageProc/Filter2D.Mat/244 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/245
 [   OK ] OCL_ImageProc/Filter2D.Mat/245 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/246
 [   OK ] OCL_ImageProc/Filter2D.Mat/246 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/247
 [   OK ] OCL_ImageProc/Filter2D.Mat/247 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/248
 [   OK ] OCL_ImageProc/Filter2D.Mat/248 (210 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/249
 [   OK ] OCL_ImageProc/Filter2D.Mat/249 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/250
 [   OK ] OCL_ImageProc/Filter2D.Mat/250 (208 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/251
 [   OK ] OCL_ImageProc/Filter2D.Mat/251 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/252
 [   OK ] OCL_ImageProc/Filter2D.Mat/252 (214 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/253
 [   OK ] OCL_ImageProc/Filter2D.Mat/253 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/254
 [   OK ] OCL_ImageProc/Filter2D.Mat/254 (212 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/255
 [   OK ] OCL_ImageProc/Filter2D.Mat/255 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/256
 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:106:
 Failure
 Expected: (TestUtils::checkNorm2(dst, udst)) = (threshold), actual: 255
 vs 1
 Size: [92 x 61]

 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:107:
 Failure
 Expected: (TestUtils::checkNorm2(dst_roi, udst_roi)) = (threshold),
 actual: 255 vs 1
 Size: [92 x 61]

 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/256, where GetParam() = (CV_8U,
 Channels(2), 7, 4, BORDER_CONSTANT, false, false) (7413 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/257
 opencv_test_imgproc: /home/yanwang/beignet/src/intel/intel_gpgpu.c:703:
 intel_gpgpu_check_binded_buf_address: Assertion
 `gpgpu-binded_buf[i]-offset != 0' failed.
 Segmentation fault (core dumped)

   But if I run OCL_ImageProc/Filter2D.Mat/257 only, it passed. I think 256
 case may influence it.
   Thanks.

 Yan Wang


 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [Intel-gfx] Beignet crashes on vanilla 3.17.1 with IVB hardware

2014-10-24 Thread Zhigang Gong
Hi,

Luxmark (both 2.0/2.1) works fine on my IVB machine. The back trace
you provided below doesn't indicate it's a beignet related problem.
It hadn't enter beignet domain and just crashed in luxmark internal.

On Fri, Oct 24, 2014 at 12:04:29PM +0300, Vasily Khoruzhick wrote:
 Hi Zhigang,
 
 Luxmark crashes with following backtrace:
 
 Program received signal SIGSEGV, Segmentation fault.
 0x004cd8b0 in slg::PathOCLRenderEngine::StopLockLess() ()
 (gdb) bt
 #0  0x004cd8b0 in slg::PathOCLRenderEngine::StopLockLess() ()
 #1  0x00482236 in slg::RenderEngine::Stop() ()
 #2  0x0047be9b in slg::RenderSession::~RenderSession() ()
 #3  0x00468a77 in LuxMarkApp::Stop() ()
 #4  0x00468b46 in LuxMarkApp::InitRendering(LuxMarkAppMode,
 char const*) ()

After a quick analysis, I confirm that the second case is indeed a beignet
bug. Beignet lacks of some llvm intrinsics support such as
llvm.uadd.with.overflow.i32().

Will fix it next week.

Thanks,
Zhigang Gong.

 
 And convert from imagemagick crashes in libgbe.so:
 
 Program received signal SIGSEGV, Segmentation fault.
 0x7fffeb0df18c in gbe::buildRegInfo(gbe::ir::BasicBlock,
 gbe::vectorgbe::RegInfoForMov) () from /usr/lib/beignet//libgbe.so
 (gdb) bt
 #0  0x7fffeb0df18c in gbe::buildRegInfo(gbe::ir::BasicBlock,
 gbe::vectorgbe::RegInfoForMov) () from /usr/lib/beignet//libgbe.so
 #1  0x7fffeb0e048d in
 gbe::GenWriter::removeLOADIs(gbe::ir::Liveness const,
 gbe::ir::Function) () from /usr/lib/beignet//libgbe.so
 #2  0x7fffeb0fb823 in
 gbe::GenWriter::emitFunction(llvm::Function) () from
 /usr/lib/beignet//libgbe.so
 #3  0x7fffeb100b63 in
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


[Beignet] [PATCH] GBE: fix a wrong type of cl_device_info.

2014-10-24 Thread Zhigang Gong
Per OpenCL spec 1.2:
CL_DEVICE_IMAGE_MAX_BUFFER_SIZE should be size_t type rather
than cl_ulong.

This bug will cause problems on i386 platform.

Signed-off-by: Zhigang Gong zhigang.g...@intel.com
---
 src/cl_device_id.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/cl_device_id.h b/src/cl_device_id.h
index afc32e2..b2c0e5b 100644
--- a/src/cl_device_id.h
+++ b/src/cl_device_id.h
@@ -60,7 +60,7 @@ struct _cl_device_id {
   size_t   image3d_max_width;
   size_t   image3d_max_height;
   size_t   image3d_max_depth;
-  cl_ulong image_mem_size;
+  size_t   image_mem_size;
   cl_uint  max_samplers;
   size_t   max_parameter_size;
   cl_uint  mem_base_addr_align;
-- 
1.8.3.2

___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [Intel-gfx] Beignet crashes on vanilla 3.17.1 with IVB hardware

2014-10-24 Thread Vasily Khoruzhick
Hi Zhigang,

On Fri, Oct 24, 2014 at 12:13 PM, Zhigang Gong
zhigang.g...@linux.intel.com wrote:
 Hi,

 Luxmark (both 2.0/2.1) works fine on my IVB machine. The back trace
 you provided below doesn't indicate it's a beignet related problem.
 It hadn't enter beignet domain and just crashed in luxmark internal.

I'm testing with Luxmark-1.3.1. Luxmark 2.0 works fine.

 On Fri, Oct 24, 2014 at 12:04:29PM +0300, Vasily Khoruzhick wrote:
 Hi Zhigang,

 Luxmark crashes with following backtrace:

 Program received signal SIGSEGV, Segmentation fault.
 0x004cd8b0 in slg::PathOCLRenderEngine::StopLockLess() ()
 (gdb) bt
 #0  0x004cd8b0 in slg::PathOCLRenderEngine::StopLockLess() ()
 #1  0x00482236 in slg::RenderEngine::Stop() ()
 #2  0x0047be9b in slg::RenderSession::~RenderSession() ()
 #3  0x00468a77 in LuxMarkApp::Stop() ()
 #4  0x00468b46 in LuxMarkApp::InitRendering(LuxMarkAppMode,
 char const*) ()

 After a quick analysis, I confirm that the second case is indeed a beignet
 bug. Beignet lacks of some llvm intrinsics support such as
 llvm.uadd.with.overflow.i32().

 Will fix it next week.

Ok, thank you!

 Thanks,
 Zhigang Gong.

Regards
Vasily
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] possilbe bug when run opencv_test_imgproc

2014-10-24 Thread yan . wang
Sure. I could try Yang Rong's patch.
BTW, I also meet the following failed cases. Could you please confirm
them? Thanks.

[  FAILED  ] 3 tests, listed below:
[  FAILED  ] OCL_ImgProc/Canny.Accuracy/8, where GetParam() =
(Channels(3), AppertureSize(3), L2gradient(false), UseRoi(false))
[  FAILED  ] OCL_ImgProc/Canny.Accuracy/10, where GetParam() =
(Channels(3), AppertureSize(3), L2gradient(true), UseRoi(false))
[  FAILED  ] OCL_Imgproc/HoughLines.RealImage/2, where GetParam() = (1,
0.00872665, 80)

Yan Wang

 I have BYT box, an IVB machine and a HSW notebook. All of them haven't
 this issue.
 Yang rong is working on a race condition patch. This issue may be
 related. You may
 try again once he send out the patch.

 On Fri, Oct 24, 2014 at 5:46 PM,  yan.w...@linux.intel.com wrote:
 Hi, Zhigang,
   May your platform also get failed cases when run
 OCL_ImageProc/Filter2D.Mat* although no crash, because I have only
 Baytail T platfrom. I am not sure the other platform has the same issue.
   If yes, I could try to investigate the reason. It may have issue of
 filter2D OpenCL implementation in OpenCV.
   Thanks.

 Yan Wang


 Hi, All,
I found one possible bug for review.
if run the following:
 ./opencv_test_imgproc --gtest_filter=OCL_ImageProc/Filter2D.Mat*.
OCL_ImageProc/Filter2D.Mat/256 failed and continue. But the whole
 test
 flow will crash in OCL_ImageProc/Filter2D.Mat/257:
 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/240, where GetParam() = (CV_8U,
 Channels(2), 7, 1, BORDER_CONSTANT, false, false) (6433 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/241
 [   OK ] OCL_ImageProc/Filter2D.Mat/241 (358 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/242
 [   OK ] OCL_ImageProc/Filter2D.Mat/242 (311 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/243
 [   OK ] OCL_ImageProc/Filter2D.Mat/243 (6 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/244
 [   OK ] OCL_ImageProc/Filter2D.Mat/244 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/245
 [   OK ] OCL_ImageProc/Filter2D.Mat/245 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/246
 [   OK ] OCL_ImageProc/Filter2D.Mat/246 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/247
 [   OK ] OCL_ImageProc/Filter2D.Mat/247 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/248
 [   OK ] OCL_ImageProc/Filter2D.Mat/248 (210 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/249
 [   OK ] OCL_ImageProc/Filter2D.Mat/249 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/250
 [   OK ] OCL_ImageProc/Filter2D.Mat/250 (208 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/251
 [   OK ] OCL_ImageProc/Filter2D.Mat/251 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/252
 [   OK ] OCL_ImageProc/Filter2D.Mat/252 (214 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/253
 [   OK ] OCL_ImageProc/Filter2D.Mat/253 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/254
 [   OK ] OCL_ImageProc/Filter2D.Mat/254 (212 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/255
 [   OK ] OCL_ImageProc/Filter2D.Mat/255 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/256
 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:106:
 Failure
 Expected: (TestUtils::checkNorm2(dst, udst)) = (threshold), actual:
 255
 vs 1
 Size: [92 x 61]

 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:107:
 Failure
 Expected: (TestUtils::checkNorm2(dst_roi, udst_roi)) = (threshold),
 actual: 255 vs 1
 Size: [92 x 61]

 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/256, where GetParam() = (CV_8U,
 Channels(2), 7, 4, BORDER_CONSTANT, false, false) (7413 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/257
 opencv_test_imgproc: /home/yanwang/beignet/src/intel/intel_gpgpu.c:703:
 intel_gpgpu_check_binded_buf_address: Assertion
 `gpgpu-binded_buf[i]-offset != 0' failed.
 Segmentation fault (core dumped)

   But if I run OCL_ImageProc/Filter2D.Mat/257 only, it passed. I think
 256
 case may influence it.
   Thanks.

 Yan Wang


 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet

___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] possilbe bug when run opencv_test_imgproc

2014-10-24 Thread Zhigang Gong
All of these three failures are already tracked in JIRA.
If you have access to JIRA, you can check them easily.

Thanks,
Zhigang Gong.

On Fri, Oct 24, 2014 at 10:33 PM,  yan.w...@linux.intel.com wrote:
 Sure. I could try Yang Rong's patch.
 BTW, I also meet the following failed cases. Could you please confirm
 them? Thanks.

 [  FAILED  ] 3 tests, listed below:
 [  FAILED  ] OCL_ImgProc/Canny.Accuracy/8, where GetParam() =
 (Channels(3), AppertureSize(3), L2gradient(false), UseRoi(false))
 [  FAILED  ] OCL_ImgProc/Canny.Accuracy/10, where GetParam() =
 (Channels(3), AppertureSize(3), L2gradient(true), UseRoi(false))
 [  FAILED  ] OCL_Imgproc/HoughLines.RealImage/2, where GetParam() = (1,
 0.00872665, 80)

 Yan Wang

 I have BYT box, an IVB machine and a HSW notebook. All of them haven't
 this issue.
 Yang rong is working on a race condition patch. This issue may be
 related. You may
 try again once he send out the patch.

 On Fri, Oct 24, 2014 at 5:46 PM,  yan.w...@linux.intel.com wrote:
 Hi, Zhigang,
   May your platform also get failed cases when run
 OCL_ImageProc/Filter2D.Mat* although no crash, because I have only
 Baytail T platfrom. I am not sure the other platform has the same issue.
   If yes, I could try to investigate the reason. It may have issue of
 filter2D OpenCL implementation in OpenCV.
   Thanks.

 Yan Wang


 Hi, All,
I found one possible bug for review.
if run the following:
 ./opencv_test_imgproc --gtest_filter=OCL_ImageProc/Filter2D.Mat*.
OCL_ImageProc/Filter2D.Mat/256 failed and continue. But the whole
 test
 flow will crash in OCL_ImageProc/Filter2D.Mat/257:
 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/240, where GetParam() = (CV_8U,
 Channels(2), 7, 1, BORDER_CONSTANT, false, false) (6433 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/241
 [   OK ] OCL_ImageProc/Filter2D.Mat/241 (358 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/242
 [   OK ] OCL_ImageProc/Filter2D.Mat/242 (311 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/243
 [   OK ] OCL_ImageProc/Filter2D.Mat/243 (6 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/244
 [   OK ] OCL_ImageProc/Filter2D.Mat/244 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/245
 [   OK ] OCL_ImageProc/Filter2D.Mat/245 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/246
 [   OK ] OCL_ImageProc/Filter2D.Mat/246 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/247
 [   OK ] OCL_ImageProc/Filter2D.Mat/247 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/248
 [   OK ] OCL_ImageProc/Filter2D.Mat/248 (210 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/249
 [   OK ] OCL_ImageProc/Filter2D.Mat/249 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/250
 [   OK ] OCL_ImageProc/Filter2D.Mat/250 (208 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/251
 [   OK ] OCL_ImageProc/Filter2D.Mat/251 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/252
 [   OK ] OCL_ImageProc/Filter2D.Mat/252 (214 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/253
 [   OK ] OCL_ImageProc/Filter2D.Mat/253 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/254
 [   OK ] OCL_ImageProc/Filter2D.Mat/254 (212 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/255
 [   OK ] OCL_ImageProc/Filter2D.Mat/255 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/256
 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:106:
 Failure
 Expected: (TestUtils::checkNorm2(dst, udst)) = (threshold), actual:
 255
 vs 1
 Size: [92 x 61]

 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:107:
 Failure
 Expected: (TestUtils::checkNorm2(dst_roi, udst_roi)) = (threshold),
 actual: 255 vs 1
 Size: [92 x 61]

 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/256, where GetParam() = (CV_8U,
 Channels(2), 7, 4, BORDER_CONSTANT, false, false) (7413 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/257
 opencv_test_imgproc: /home/yanwang/beignet/src/intel/intel_gpgpu.c:703:
 intel_gpgpu_check_binded_buf_address: Assertion
 `gpgpu-binded_buf[i]-offset != 0' failed.
 Segmentation fault (core dumped)

   But if I run OCL_ImageProc/Filter2D.Mat/257 only, it passed. I think
 256
 case may influence it.
   Thanks.

 Yan Wang


 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet

___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] possilbe bug when run opencv_test_imgproc

2014-10-24 Thread yan . wang
Could you give me one URL?
Thanks.

Yan Wang

 All of these three failures are already tracked in JIRA.
 If you have access to JIRA, you can check them easily.

 Thanks,
 Zhigang Gong.

 On Fri, Oct 24, 2014 at 10:33 PM,  yan.w...@linux.intel.com wrote:
 Sure. I could try Yang Rong's patch.
 BTW, I also meet the following failed cases. Could you please confirm
 them? Thanks.

 [  FAILED  ] 3 tests, listed below:
 [  FAILED  ] OCL_ImgProc/Canny.Accuracy/8, where GetParam() =
 (Channels(3), AppertureSize(3), L2gradient(false), UseRoi(false))
 [  FAILED  ] OCL_ImgProc/Canny.Accuracy/10, where GetParam() =
 (Channels(3), AppertureSize(3), L2gradient(true), UseRoi(false))
 [  FAILED  ] OCL_Imgproc/HoughLines.RealImage/2, where GetParam() = (1,
 0.00872665, 80)

 Yan Wang

 I have BYT box, an IVB machine and a HSW notebook. All of them haven't
 this issue.
 Yang rong is working on a race condition patch. This issue may be
 related. You may
 try again once he send out the patch.

 On Fri, Oct 24, 2014 at 5:46 PM,  yan.w...@linux.intel.com wrote:
 Hi, Zhigang,
   May your platform also get failed cases when run
 OCL_ImageProc/Filter2D.Mat* although no crash, because I have only
 Baytail T platfrom. I am not sure the other platform has the same
 issue.
   If yes, I could try to investigate the reason. It may have issue of
 filter2D OpenCL implementation in OpenCV.
   Thanks.

 Yan Wang


 Hi, All,
I found one possible bug for review.
if run the following:
 ./opencv_test_imgproc --gtest_filter=OCL_ImageProc/Filter2D.Mat*.
OCL_ImageProc/Filter2D.Mat/256 failed and continue. But the whole
 test
 flow will crash in OCL_ImageProc/Filter2D.Mat/257:
 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/240, where GetParam() =
 (CV_8U,
 Channels(2), 7, 1, BORDER_CONSTANT, false, false) (6433 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/241
 [   OK ] OCL_ImageProc/Filter2D.Mat/241 (358 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/242
 [   OK ] OCL_ImageProc/Filter2D.Mat/242 (311 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/243
 [   OK ] OCL_ImageProc/Filter2D.Mat/243 (6 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/244
 [   OK ] OCL_ImageProc/Filter2D.Mat/244 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/245
 [   OK ] OCL_ImageProc/Filter2D.Mat/245 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/246
 [   OK ] OCL_ImageProc/Filter2D.Mat/246 (203 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/247
 [   OK ] OCL_ImageProc/Filter2D.Mat/247 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/248
 [   OK ] OCL_ImageProc/Filter2D.Mat/248 (210 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/249
 [   OK ] OCL_ImageProc/Filter2D.Mat/249 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/250
 [   OK ] OCL_ImageProc/Filter2D.Mat/250 (208 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/251
 [   OK ] OCL_ImageProc/Filter2D.Mat/251 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/252
 [   OK ] OCL_ImageProc/Filter2D.Mat/252 (214 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/253
 [   OK ] OCL_ImageProc/Filter2D.Mat/253 (207 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/254
 [   OK ] OCL_ImageProc/Filter2D.Mat/254 (212 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/255
 [   OK ] OCL_ImageProc/Filter2D.Mat/255 (7 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/256
 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:106:
 Failure
 Expected: (TestUtils::checkNorm2(dst, udst)) = (threshold), actual:
 255
 vs 1
 Size: [92 x 61]

 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:107:
 Failure
 Expected: (TestUtils::checkNorm2(dst_roi, udst_roi)) = (threshold),
 actual: 255 vs 1
 Size: [92 x 61]

 [  FAILED  ] OCL_ImageProc/Filter2D.Mat/256, where GetParam() =
 (CV_8U,
 Channels(2), 7, 4, BORDER_CONSTANT, false, false) (7413 ms)
 [ RUN  ] OCL_ImageProc/Filter2D.Mat/257
 opencv_test_imgproc:
 /home/yanwang/beignet/src/intel/intel_gpgpu.c:703:
 intel_gpgpu_check_binded_buf_address: Assertion
 `gpgpu-binded_buf[i]-offset != 0' failed.
 Segmentation fault (core dumped)

   But if I run OCL_ImageProc/Filter2D.Mat/257 only, it passed. I
 think
 256
 case may influence it.
   Thanks.

 Yan Wang


 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet



___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] possilbe bug when run opencv_test_imgproc

2014-10-24 Thread Meng, Mengmeng
Hi,
 The 2 bugs are:
 1. https://jira01.devtools.intel.com/browse/VIZ-4490
 VIZ-4490 [HSW/IVB/BYT-M] After update opencv, 1 opencv case 
(opencv_test_imgproc/OCL_Imgproc/HoughLines)fail
 
 2. https://jira01.devtools.intel.com/browse/VIZ-4489
 VIZ-4489 [IVB/HSW/BYT-M] After update opencv, some opencv 
cases(eg.opencv_test_imgproc/OCL_ImgProc/Canny)fail

Thanks,
Meng

 -Original Message-
 From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
 yan.w...@linux.intel.com
 Sent: Friday, October 24, 2014 10:59 PM
 To: Zhigang Gong
 Cc: yan.w...@linux.intel.com; beignet@lists.freedesktop.org
 Subject: Re: [Beignet] possilbe bug when run opencv_test_imgproc
 
 Could you give me one URL?
 Thanks.
 
 Yan Wang
 
  All of these three failures are already tracked in JIRA.
  If you have access to JIRA, you can check them easily.
 
  Thanks,
  Zhigang Gong.
 
  On Fri, Oct 24, 2014 at 10:33 PM,  yan.w...@linux.intel.com wrote:
  Sure. I could try Yang Rong's patch.
  BTW, I also meet the following failed cases. Could you please confirm
  them? Thanks.
 
  [  FAILED  ] 3 tests, listed below:
  [  FAILED  ] OCL_ImgProc/Canny.Accuracy/8, where GetParam() =
  (Channels(3), AppertureSize(3), L2gradient(false), UseRoi(false)) [
  FAILED  ] OCL_ImgProc/Canny.Accuracy/10, where GetParam() =
  (Channels(3), AppertureSize(3), L2gradient(true), UseRoi(false)) [
  FAILED  ] OCL_Imgproc/HoughLines.RealImage/2, where GetParam() = (1,
  0.00872665, 80)
 
  Yan Wang
 
  I have BYT box, an IVB machine and a HSW notebook. All of them
  haven't this issue.
  Yang rong is working on a race condition patch. This issue may be
  related. You may try again once he send out the patch.
 
  On Fri, Oct 24, 2014 at 5:46 PM,  yan.w...@linux.intel.com wrote:
  Hi, Zhigang,
May your platform also get failed cases when run
  OCL_ImageProc/Filter2D.Mat* although no crash, because I have only
  Baytail T platfrom. I am not sure the other platform has the same
  issue.
If yes, I could try to investigate the reason. It may have issue
  of filter2D OpenCL implementation in OpenCV.
Thanks.
 
  Yan Wang
 
 
  Hi, All,
 I found one possible bug for review.
 if run the following:
  ./opencv_test_imgproc --gtest_filter=OCL_ImageProc/Filter2D.Mat*.
 OCL_ImageProc/Filter2D.Mat/256 failed and continue. But the
  whole test flow will crash in OCL_ImageProc/Filter2D.Mat/257:
  [  FAILED  ] OCL_ImageProc/Filter2D.Mat/240, where GetParam() =
  (CV_8U, Channels(2), 7, 1, BORDER_CONSTANT, false, false) (6433
  ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/241
  [   OK ] OCL_ImageProc/Filter2D.Mat/241 (358 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/242
  [   OK ] OCL_ImageProc/Filter2D.Mat/242 (311 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/243
  [   OK ] OCL_ImageProc/Filter2D.Mat/243 (6 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/244
  [   OK ] OCL_ImageProc/Filter2D.Mat/244 (203 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/245
  [   OK ] OCL_ImageProc/Filter2D.Mat/245 (207 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/246
  [   OK ] OCL_ImageProc/Filter2D.Mat/246 (203 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/247
  [   OK ] OCL_ImageProc/Filter2D.Mat/247 (7 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/248
  [   OK ] OCL_ImageProc/Filter2D.Mat/248 (210 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/249
  [   OK ] OCL_ImageProc/Filter2D.Mat/249 (207 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/250
  [   OK ] OCL_ImageProc/Filter2D.Mat/250 (208 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/251
  [   OK ] OCL_ImageProc/Filter2D.Mat/251 (7 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/252
  [   OK ] OCL_ImageProc/Filter2D.Mat/252 (214 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/253
  [   OK ] OCL_ImageProc/Filter2D.Mat/253 (207 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/254
  [   OK ] OCL_ImageProc/Filter2D.Mat/254 (212 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/255
  [   OK ] OCL_ImageProc/Filter2D.Mat/255 (7 ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/256
 
 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:106:
  Failure
  Expected: (TestUtils::checkNorm2(dst, udst)) = (threshold), actual:
  255
  vs 1
  Size: [92 x 61]
 
 
 /home/yanwang/opencv/modules/imgproc/test/ocl/test_filter2d.cpp:107:
  Failure
  Expected: (TestUtils::checkNorm2(dst_roi, udst_roi)) =
  (threshold),
  actual: 255 vs 1
  Size: [92 x 61]
 
  [  FAILED  ] OCL_ImageProc/Filter2D.Mat/256, where GetParam() =
  (CV_8U, Channels(2), 7, 4, BORDER_CONSTANT, false, false) (7413
  ms)
  [ RUN  ] OCL_ImageProc/Filter2D.Mat/257
  opencv_test_imgproc:
  /home/yanwang/beignet/src/intel/intel_gpgpu.c:703:
  intel_gpgpu_check_binded_buf_address: Assertion
  `gpgpu-binded_buf[i]-offset != 0' failed.
  Segmentation fault (core dumped)
 
But if I run OCL_ImageProc/Filter2D.Mat/257 only, it passed. I
  think
  256
  case may influence it.
Thanks.
 
  Yan Wang
 

[Beignet] beignet not working with Wolfram Mathematica

2014-10-24 Thread Lorenzo Pistone

Hello,
I'm trying to run beignet from Wolfram Mathematica 10.0.1, I have 
beignet 0.9.2, Fedora 20 x86_64, kernel 3.16.3, i5-3230M.
First, there is an issue which is probably to be blamed on Mathematica 
(you need to manually load /usr/lib64/beignet/libcl.so with 
LibraryLoad), I suppose that Mathematica doesn't know about the icd system.


Anyway, after that, this test program

   src = __kernel void myKernel( __global mint * global0Id, __global
   mint * global1Id, mint width, mint height) {
  int xIndex = get_global_id(0);
  int yIndex = get_global_id(1);
  int index = xIndex + yIndex*width;
  if (xIndex  width  yIndex  height) {
 global0Id[index] = get_local_id(0);
 global1Id[index] = get_local_id(1);
  }
};


which is suggested in the examples  of the Mathematica implementation 
(http://reference.wolfram.com/language/OpenCLLink/tutorial/Programming.html) 
produces all zeroes. If instead the pocl driver is loaded, the kernel is 
executed correctly. Also I could run the LuxMark benchmarks (even though 
on some tests I see yellow spots that I believe are glitches).


I am sorry if the question is a bit vague, but I don't know much about 
OpenCL and I indeed wanted to start learning while using Mathematica, 
which is a tool that I already need for other reasons.


Lorenzo
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [PATCH 6/6] BDW: Add function intel_gpgpu_bind_buf for gen8.

2014-10-24 Thread He Junyan
This patchset LGTM

On 一, 2014-09-29 at 13:37 +0800, Yang Rong wrote:
 From: Junyan He junyan...@linux.intel.com
 
 Must call cl_bind_buf instead of intel_gpgpu_bind_buf directly in intel_gpgpu.
 
 Signed-off-by: Junyan He junyan...@linux.intel.com
 ---
  src/intel/intel_gpgpu.c | 36 +++-
  1 file changed, 27 insertions(+), 9 deletions(-)
 
 diff --git a/src/intel/intel_gpgpu.c b/src/intel/intel_gpgpu.c
 index 6b8fa38..eedfe31 100644
 --- a/src/intel/intel_gpgpu.c
 +++ b/src/intel/intel_gpgpu.c
 @@ -818,13 +818,13 @@ intel_gpgpu_setup_bti_gen8(intel_gpgpu_t *gpgpu, 
 drm_intel_bo *buf,
ss0-ss8_9.surface_base_addr_lo = (buf-offset64 + internal_offset)  
 0x;
ss0-ss8_9.surface_base_addr_hi = ((buf-offset64 + internal_offset)  
 32)  0x;
dri_bo_emit_reloc(gpgpu-aux_buf.bo,
 -  I915_GEM_DOMAIN_RENDER,
 -  I915_GEM_DOMAIN_RENDER,
 -  internal_offset,
 -  gpgpu-aux_offset.surface_heap_offset +
 -  heap-binding_table[index] +
 -  offsetof(gen8_surface_state_t, ss1),
 -  buf);
 +I915_GEM_DOMAIN_RENDER,
 +I915_GEM_DOMAIN_RENDER,
 +internal_offset,
 +gpgpu-aux_offset.surface_heap_offset +
 +heap-binding_table[index] +
 +offsetof(gen8_surface_state_t, ss1),
 +buf);
  }
  
  static int
 @@ -981,6 +981,18 @@ intel_gpgpu_bind_buf(intel_gpgpu_t *gpgpu, drm_intel_bo 
 *buf, uint32_t offset,
intel_gpgpu_setup_bti(gpgpu, buf, internal_offset, size, bti);
  }
  
 +static void
 +intel_gpgpu_bind_buf_gen8(intel_gpgpu_t *gpgpu, drm_intel_bo *buf, uint32_t 
 offset,
 +  uint32_t internal_offset, uint32_t size, uint8_t 
 bti)
 +{
 +  assert(gpgpu-binded_n  max_buf_n);
 +  gpgpu-binded_buf[gpgpu-binded_n] = buf;
 +  gpgpu-target_buf_offset[gpgpu-binded_n] = internal_offset;
 +  gpgpu-binded_offset[gpgpu-binded_n] = offset;
 +  gpgpu-binded_n++;
 +  intel_gpgpu_setup_bti_gen8(gpgpu, buf, internal_offset, size, bti);
 +}
 +
  static int
  intel_gpgpu_set_scratch(intel_gpgpu_t * gpgpu, uint32_t per_thread_size)
  {
 @@ -1011,7 +1023,7 @@ intel_gpgpu_set_stack(intel_gpgpu_t *gpgpu, uint32_t 
 offset, uint32_t size, uint
drm_intel_bufmgr *bufmgr = gpgpu-drv-bufmgr;
gpgpu-stack_b.bo = drm_intel_bo_alloc(bufmgr, STACK, size, 64);
  
 -  intel_gpgpu_bind_buf(gpgpu, gpgpu-stack_b.bo, offset, 0, size, bti);
 +  cl_gpgpu_bind_buf((cl_gpgpu)gpgpu, (cl_buffer)gpgpu-stack_b.bo, offset, 
 0, size, bti);
  }
  
  static void
 @@ -1427,7 +1439,7 @@ intel_gpgpu_set_printf_buf(intel_gpgpu_t *gpgpu, 
 uint32_t i, uint32_t size, uint
}
memset(bo-virtual, 0, size);
drm_intel_bo_unmap(bo);
 -  intel_gpgpu_bind_buf(gpgpu, bo, offset, 0, size, bti);
 +  cl_gpgpu_bind_buf((cl_gpgpu)gpgpu, (cl_buffer)bo, offset, 0, size, bti);
return 0;
  }
  
 @@ -1526,6 +1538,12 @@ intel_set_gpgpu_callbacks(int device_id)
cl_gpgpu_set_printf_info = (cl_gpgpu_set_printf_info_cb 
 *)intel_gpgpu_set_printf_info;
cl_gpgpu_get_printf_info = (cl_gpgpu_get_printf_info_cb 
 *)intel_gpgpu_get_printf_info;
  
 +  if (IS_BROADWELL(device_id)) {
 +cl_gpgpu_bind_buf = (cl_gpgpu_bind_buf_cb *)intel_gpgpu_bind_buf_gen8;
 +cl_gpgpu_get_cache_ctrl = (cl_gpgpu_get_cache_ctrl_cb 
 *)intel_gpgpu_get_cache_ctrl_gen8;
 +return;
 +  }
 +
if (IS_HASWELL(device_id)) {
  cl_gpgpu_bind_image = (cl_gpgpu_bind_image_cb *) 
 intel_gpgpu_bind_image_gen75;
  cl_gpgpu_alloc_constant_buffer  = (cl_gpgpu_alloc_constant_buffer_cb *) 
 intel_gpgpu_alloc_constant_buffer_gen75;



___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [PATCH 5/5] BDW: Add class Gen8Context.

2014-10-24 Thread Yang, Rong R
Yes, I am also plan to change GenContext to a pure virtual class, and I think 
It is better to do this when optimize the long operations in BDW.

-Original Message-
From: He Junyan [mailto:junyan...@inbox.com] 
Sent: Thursday, October 9, 2014 13:09
To: Yang, Rong R
Cc: beignet@lists.freedesktop.org
Subject: Re: [Beignet] [PATCH 5/5] BDW: Add class Gen8Context.

This patchset is OK and will not cause regression on previous platform.
In this patch set, the GenEncoder will be a pure virtual class and all platform 
encoders will derive from it.
But the GenContext still represents the Gen7 context. I think it is better to 
follow the same way as the encoder to make the architecture clearer.



On 一, 2014-09-29 at 13:37 +0800, Yang Rong wrote:
 Now Gen8Context is almost same as Gen75Context, but still derive Gen8Context 
 from GenContext for clearly.
 
 Signed-off-by: Yang Rong rong.r.y...@intel.com
 ---
  backend/src/CMakeLists.txt   |   2 +
  backend/src/backend/gen8_context.cpp | 113 
 +++
  backend/src/backend/gen8_context.hpp |  63 +++
  backend/src/backend/gen_program.cpp  |   3 +
  4 files changed, 181 insertions(+)
  create mode 100644 backend/src/backend/gen8_context.cpp
  create mode 100644 backend/src/backend/gen8_context.hpp
 
 diff --git a/backend/src/CMakeLists.txt b/backend/src/CMakeLists.txt 
 index 2daa630..c5d388e 100644
 --- a/backend/src/CMakeLists.txt
 +++ b/backend/src/CMakeLists.txt
 @@ -96,6 +96,8 @@ set (GBE_SRC
  backend/gen_context.cpp
  backend/gen75_context.hpp
  backend/gen75_context.cpp
 +backend/gen8_context.hpp
 +backend/gen8_context.cpp
  backend/gen_program.cpp
  backend/gen_program.hpp
  backend/gen_program.h
 diff --git a/backend/src/backend/gen8_context.cpp 
 b/backend/src/backend/gen8_context.cpp
 new file mode 100644
 index 000..a9914f6
 --- /dev/null
 +++ b/backend/src/backend/gen8_context.cpp
 @@ -0,0 +1,113 @@
 +/*
 + * Copyright © 2012 Intel Corporation
 + *
 + * This library is free software; you can redistribute it and/or
 + * modify it under the terms of the GNU Lesser General Public
 + * License as published by the Free Software Foundation; either
 + * version 2 of the License, or (at your option) any later version.
 + *
 + * This library is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 + * Lesser General Public License for more details.
 + *
 + * You should have received a copy of the GNU Lesser General Public
 + * License along with this library. If not, see 
 http://www.gnu.org/licenses/.
 + *
 + */
 +
 +/**
 + * \file gen8_context.cpp
 + */
 +
 +#include backend/gen8_context.hpp
 +#include backend/gen8_encoder.hpp
 +#include backend/gen_program.hpp
 +#include backend/gen_defs.hpp
 +#include backend/gen_encoder.hpp
 +#include backend/gen_insn_selection.hpp
 +#include backend/gen_insn_scheduling.hpp
 +#include backend/gen_reg_allocation.hpp
 +#include sys/cvar.hpp
 +#include ir/function.hpp
 +#include ir/value.hpp
 +#include cstring
 +
 +namespace gbe
 +{
 +  void Gen8Context::emitSLMOffset(void) {
 +if(kernel-getUseSLM() == false)
 +  return;
 +
 +const GenRegister slm_offset = 
 ra-genReg(GenRegister::ud1grf(ir::ocl::slmoffset));
 +const GenRegister slm_index = GenRegister::ud1grf(0, 0);
 +//the slm index is hold in r0.0 24-27 bit, in 4K unit, shift left 12 to 
 get byte unit
 +p-push();
 +  p-curr.execWidth = 1;
 +  p-curr.predicate = GEN_PREDICATE_NONE;
 +  p-SHR(slm_offset, slm_index, GenRegister::immud(12));
 +p-pop();
 +  }
 +
 +  void Gen8Context::allocSLMOffsetCurbe(void) {
 +if(fn.getUseSLM())
 +  allocCurbeReg(ir::ocl::slmoffset, GBE_CURBE_SLM_OFFSET);  }
 +
 +  uint32_t Gen8Context::alignScratchSize(uint32_t size){
 +if(size == 0)
 +  return 0;
 +uint32_t i = 2048;
 +while(i  size) i *= 2;
 +return i;
 +  }
 +
 +  void Gen8Context::emitStackPointer(void) {
 +using namespace ir;
 +
 +// Only emit stack pointer computation if we use a stack
 +if (kernel-getCurbeOffset(GBE_CURBE_STACK_POINTER, 0) = 0)
 +  return;
 +
 +// Check that everything is consistent in the kernel code
 +const uint32_t perLaneSize = kernel-getStackSize();
 +const uint32_t perThreadSize = perLaneSize * this-simdWidth;
 +GBE_ASSERT(perLaneSize  0);
 +GBE_ASSERT(isPowerOf2(perLaneSize) == true);
 +GBE_ASSERT(isPowerOf2(perThreadSize) == true);
 +
 +// Use shifts rather than muls which are limited to 32x16 bit sources
 +const uint32_t perLaneShift = logi2(perLaneSize);
 +const uint32_t perThreadShift = logi2(perThreadSize);
 +const GenRegister selStatckPtr = this-simdWidth == 8 ?
 +  GenRegister::ud8grf(ir::ocl::stackptr) :
 +  GenRegister::ud16grf(ir::ocl::stackptr);
 +const GenRegister stackptr = 

Re: [Beignet] [PATCH 1/6] BDW: Add gen8 into intel_driver_init

2014-10-24 Thread Zhigang Gong
This patch itself LGTM. But I will move it after other patches.
Thus once this patch get applied, it can accept Gen8 device and
work as expected.

Thanks.

On Mon, Sep 29, 2014 at 01:37:44PM +0800, Yang Rong wrote:
 From: Junyan He junyan...@linux.intel.com
 
 Signed-off-by: Junyan He junyan...@linux.intel.com
 ---
  src/cl_command_queue.c   | 2 +-
  src/intel/intel_driver.c | 4 +++-
  2 files changed, 4 insertions(+), 2 deletions(-)
 
 diff --git a/src/cl_command_queue.c b/src/cl_command_queue.c
 index 4cbb4eb..48deba0 100644
 --- a/src/cl_command_queue.c
 +++ b/src/cl_command_queue.c
 @@ -410,7 +410,7 @@ cl_command_queue_ND_range(cl_command_queue queue,
}
  #endif /* USE_FULSIM */
  
 -  if (ver == 7 || ver == 75)
 +  if (ver == 7 || ver == 75 || ver == 8)
  TRY (cl_command_queue_ND_range_gen7, queue, k, work_dim, global_wk_off, 
 global_wk_sz, local_wk_sz);
else
  FATAL (Unknown Gen Device);
 diff --git a/src/intel/intel_driver.c b/src/intel/intel_driver.c
 index 66f2bcf..2c2ed5f 100644
 --- a/src/intel/intel_driver.c
 +++ b/src/intel/intel_driver.c
 @@ -183,7 +183,9 @@ intel_driver_init(intel_driver_t *driver, int dev_fd)
else
  FATAL (Unsupported Gen for emulation);
  #else
 -  if (IS_GEN75(driver-device_id))
 +  if (IS_GEN8(driver-device_id))
 +driver-gen_ver = 8;
 +  else if (IS_GEN75(driver-device_id))
  driver-gen_ver = 75;
else if (IS_GEN7(driver-device_id))
  driver-gen_ver = 7;
 -- 
 1.8.3.2
 
 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [PATCH 1/3] GBE: Fix a bug when setting flag register

2014-10-24 Thread Zhigang Gong
This patch LGTM, will push latter, thanks.

On Fri, Oct 10, 2014 at 03:01:25PM +0800, Ruiling Song wrote:
 we should use simd1, instead of simd8/simd16.
 
 Signed-off-by: Ruiling Song ruiling.s...@intel.com
 ---
  backend/src/backend/gen_context.cpp |   20 ++--
  backend/src/backend/gen_context.hpp |1 +
  2 files changed, 15 insertions(+), 6 deletions(-)
 
 diff --git a/backend/src/backend/gen_context.cpp 
 b/backend/src/backend/gen_context.cpp
 index 8844233..245c318 100644
 --- a/backend/src/backend/gen_context.cpp
 +++ b/backend/src/backend/gen_context.cpp
 @@ -763,14 +763,14 @@ namespace gbe
  p-SHL(c, e, a);
  p-SHL(d, f, a);
  p-OR(e, d, b);
 -p-MOV(flagReg, GenRegister::immuw(0x));
 +setFlag(flagReg, GenRegister::immuw(0x));
  p-curr.predicate = GEN_PREDICATE_NORMAL;
  p-curr.useFlag(flagReg.flag_nr(), flagReg.flag_subnr());
  p-CMP(GEN_CONDITIONAL_Z, a, zero);
  p-SEL(d, d, e);
  p-curr.predicate = GEN_PREDICATE_NONE;
  p-AND(a, a, GenRegister::immud(32));
 -p-MOV(flagReg, GenRegister::immuw(0x));
 +setFlag(flagReg, GenRegister::immuw(0x));
  p-curr.predicate = GEN_PREDICATE_NORMAL;
  p-curr.useFlag(flagReg.flag_nr(), flagReg.flag_subnr());
  p-CMP(GEN_CONDITIONAL_Z, a, zero);
 @@ -791,14 +791,14 @@ namespace gbe
  p-SHR(c, f, a);
  p-SHR(d, e, a);
  p-OR(e, d, b);
 -p-MOV(flagReg, GenRegister::immuw(0x));
 +setFlag(flagReg, GenRegister::immuw(0x));
  p-curr.predicate = GEN_PREDICATE_NORMAL;
  p-curr.useFlag(flagReg.flag_nr(), flagReg.flag_subnr());
  p-CMP(GEN_CONDITIONAL_Z, a, zero);
  p-SEL(d, d, e);
  p-curr.predicate = GEN_PREDICATE_NONE;
  p-AND(a, a, GenRegister::immud(32));
 -p-MOV(flagReg, GenRegister::immuw(0x));
 +setFlag(flagReg, GenRegister::immuw(0x));
  p-curr.predicate = GEN_PREDICATE_NORMAL;
  p-curr.useFlag(flagReg.flag_nr(), flagReg.flag_subnr());
  p-CMP(GEN_CONDITIONAL_Z, a, zero);
 @@ -820,7 +820,7 @@ namespace gbe
  p-ASR(c, f, a);
  p-SHR(d, e, a);
  p-OR(e, d, b);
 -p-MOV(flagReg, GenRegister::immuw(0x));
 +setFlag(flagReg, GenRegister::immuw(0x));
  p-curr.predicate = GEN_PREDICATE_NORMAL;
  p-curr.useFlag(flagReg.flag_nr(), flagReg.flag_subnr());
  p-CMP(GEN_CONDITIONAL_Z, a, zero);
 @@ -828,7 +828,7 @@ namespace gbe
  p-curr.predicate = GEN_PREDICATE_NONE;
  p-AND(a, a, GenRegister::immud(32));
  p-ASR(f, f, GenRegister::immd(31));
 -p-MOV(flagReg, GenRegister::immuw(0x));
 +setFlag(flagReg, GenRegister::immuw(0x));
  p-curr.predicate = GEN_PREDICATE_NORMAL;
  p-curr.useFlag(flagReg.flag_nr(), flagReg.flag_subnr());
  p-CMP(GEN_CONDITIONAL_Z, a, zero);
 @@ -842,6 +842,14 @@ namespace gbe
  NOT_IMPLEMENTED;
  }
}
 +  void GenContext::setFlag(GenRegister flagReg, GenRegister src) {
 +p-push();
 +p-curr.noMask = 1;
 +p-curr.execWidth = 1;
 +p-curr.predicate = GEN_PREDICATE_NONE;
 +p-MOV(flagReg, src);
 +p-pop();
 +  }
  
void GenContext::saveFlag(GenRegister dest, int flag, int subFlag) {
  p-push();
 diff --git a/backend/src/backend/gen_context.hpp 
 b/backend/src/backend/gen_context.hpp
 index 4a01fd5..7a51f57 100644
 --- a/backend/src/backend/gen_context.hpp
 +++ b/backend/src/backend/gen_context.hpp
 @@ -116,6 +116,7 @@ namespace gbe
  void I64FullAdd(GenRegister high1, GenRegister low1, GenRegister high2, 
 GenRegister low2);
  void I32FullMult(GenRegister high, GenRegister low, GenRegister src0, 
 GenRegister src1);
  void I64FullMult(GenRegister dst1, GenRegister dst2, GenRegister dst3, 
 GenRegister dst4, GenRegister x_high, GenRegister x_low, GenRegister y_high, 
 GenRegister y_low);
 +void setFlag(GenRegister flag, GenRegister src);
  void saveFlag(GenRegister dest, int flag, int subFlag);
  void UnsignedI64ToFloat(GenRegister dst, GenRegister high, GenRegister 
 low, GenRegister exp, GenRegister mantissa, GenRegister tmp, GenRegister 
 flag);
  
 -- 
 1.7.10.4
 
 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


[Beignet] [PATCH] Fix HSW thread_n = 64 assert.

2014-10-24 Thread Yang Rong
In function cl_get_kernel_max_wg_sz, hsw's thread count may large than 64,
add a max limit.

Signed-off-by: Yang Rong rong.r.y...@intel.com
---
 src/cl_device_id.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/cl_device_id.c b/src/cl_device_id.c
index a0d0db6..7944ca4 100644
--- a/src/cl_device_id.c
+++ b/src/cl_device_id.c
@@ -633,7 +633,7 @@ cl_check_builtin_kernel_dimension(cl_kernel kernel, 
cl_device_id device)
 LOCAL size_t
 cl_get_kernel_max_wg_sz(cl_kernel kernel)
 {
-  size_t work_group_size;
+  size_t work_group_size, thread_cnt;
   int simd_width = interp_kernel_get_simd_width(kernel-opaque);
   int vendor_id = kernel-program-ctx-device-vendor_id;
   if (!interp_kernel_use_slm(kernel-opaque)) {
@@ -642,9 +642,13 @@ cl_get_kernel_max_wg_sz(cl_kernel kernel)
 else
   work_group_size = kernel-program-ctx-device-max_compute_unit *
 kernel-program-ctx-device-max_thread_per_unit * 
simd_width;
-  } else
-work_group_size = kernel-program-ctx-device-max_compute_unit * 
simd_width *
+  } else {
+thread_cnt = kernel-program-ctx-device-max_compute_unit *
  kernel-program-ctx-device-max_thread_per_unit / 
kernel-program-ctx-device-sub_slice_count;
+if(thread_cnt  64)
+  thread_cnt = 64;
+work_group_size = thread_cnt * simd_width;
+  }
   return work_group_size;
 }
 
-- 
1.8.3.2

___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [PATCH] Fix AUX buffer for really page aligned

2014-10-24 Thread Guo, Yejun
three comments in line, thanks.

-Original Message-
From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of 
Zhenyu Wang
Sent: Wednesday, October 22, 2014 4:11 PM
To: beignet@lists.freedesktop.org
Subject: [Beignet] [PATCH] Fix AUX buffer for really page aligned

Apply ALIGN() for aux buffer size from beginning has no effect on final target 
size. Move to the end of all state offsets set for alignment.

Signed-off-by: Zhenyu Wang zhen...@linux.intel.com
---
 src/intel/intel_gpgpu.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/intel/intel_gpgpu.c b/src/intel/intel_gpgpu.c index 
105a077..98b32bf 100644
--- a/src/intel/intel_gpgpu.c
+++ b/src/intel/intel_gpgpu.c
@@ -759,8 +759,6 @@ intel_gpgpu_state_init(intel_gpgpu_t *gpgpu,
 dri_bo_unreference(gpgpu-aux_buf.bo);
   gpgpu-aux_buf.bo = NULL;
 
-  //surface heap must be 4096 bytes aligned because state base address use 
20bit for the address
-  size_aux = ALIGN(size_aux, 4096);
[Yejun] Yes, there is no effect at the beginning. Actually, the code here is to 
highlight the alignment requirement, so future maintainer knows it especially 
if he wants to move the layout from the beginning to the middle.

   gpgpu-aux_offset.surface_heap_offset = size_aux;
   size_aux += sizeof(surface_heap_t);
 
@@ -784,7 +782,10 @@ intel_gpgpu_state_init(intel_gpgpu_t *gpgpu,
   gpgpu-aux_offset.sampler_border_color_state_offset = size_aux;
   size_aux += GEN_MAX_SAMPLERS * sizeof(gen7_sampler_border_color_t);

[Yejun] aux buffer contains several parts, each part has its own alignment 
requirement, so we can see several 'ALIGN'  in above code. The alignment is 
handled for each part, it is not feasible to move one/all of them to the last.

 
-  bo = dri_bo_alloc(gpgpu-drv-bufmgr, AUX_BUFFER, size_aux, 0);
+  //surface heap must be 4096 bytes aligned because state base address 
+ use 20bit for the address  size_aux = ALIGN(size_aux, 4096);
+
+  bo = dri_bo_alloc(gpgpu-drv-bufmgr, AUX_BUFFER, size_aux, 4096);

[Yejun] the last parameter '4096' is to control the size of the whole buffer 
(to be page aligned), not to meet the align requirement of the base address, 
and it is not explicitly required and  is ignored in function 
drm_intel_gem_bo_alloc_internal. 

   if (!bo || dri_bo_map(bo, 1) != 0) {
 fprintf(stderr, %s:%d: %s.\n, __FILE__, __LINE__, strerror(errno));
 if (bo)
--
2.1.1

___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] Problems with recent beignet

2014-10-24 Thread Zhigang Gong
On Sat, Oct 18, 2014 at 04:26:02PM +0200, Martin Hauke wrote:
 Hi Zhigang,
 
 On 18.10.2014 04:36, you wrote:
  building git master on OpenSUSE 13.1 actually fails since this OpenSUSE
  version ships only with libdrm2-2.4.46-3.2.2
  As you may know, we are working on enable BDW. And the new version
  libdrm is required for that purpose.
  Considering libdrm 2.4.52 was released more than half year ago, we
  didn't think this requirement is a real issue.
  
  If there is a strong reason that there are systems won't upgrade to
  the libdrm 2.4.52 or newer. Please
  let us know. Thanks.
 
 OpenSUSE 13.2 will be released in the next weeks and is will be based on
 current OpenSUSE Factory packages so it will ship at least libdrm 2.4.58
 
 For OpenSUSE 13.1 users i've included libdrm 2.4.58 in my obs repositories:
 https://build.opensuse.org/package/show/home:mnhauke:opencl:stable/beignet
 https://build.opensuse.org/package/show/home:mnhauke:opencl:testing/beignet
 
 
 OpenSUSE 13.1 ships LLVM 3.3 and building beignet git master still
 fails:
 https://build.opensuse.org/package/live_build_log/home:mnhauke:opencl:testing/beignet/openSUSE_13.1/x86_64
 snip---
 backend/src/llvm/llvm_unroll.cpp:47:32: fatal error:
 llvm/IR/Dominators.h: No such file or directory
 snip---
 
 Is LLVM 3.3 still supported with recent beignet versions?
LLVM 3.3 should be supported. If it was broken, there is a bug. I just pushed a 
patch which could fix
the problem you met above.

Thanks,
Zhigang Gong.
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


[Beignet] [PATCH V2] Add the disasm support for Gen8

2014-10-24 Thread junyan . he
From: Junyan He junyan...@linux.intel.com

Signed-off-by: Junyan He junyan...@linux.intel.com
---
 backend/src/backend/gen/gen_mesa_disasm.c | 1110 +++--
 backend/src/backend/gen_defs.hpp  |2 +
 2 files changed, 582 insertions(+), 530 deletions(-)

diff --git a/backend/src/backend/gen/gen_mesa_disasm.c 
b/backend/src/backend/gen/gen_mesa_disasm.c
index 2412404..876bbeb 100644
--- a/backend/src/backend/gen/gen_mesa_disasm.c
+++ b/backend/src/backend/gen/gen_mesa_disasm.c
@@ -1,4 +1,4 @@
-/* 
+/*
  * Copyright © 2012 Intel Corporation
  *
  * This library is free software; you can redistribute it and/or
@@ -67,7 +67,6 @@ static const struct {
   [GEN_OPCODE_LZD] = { .name = lzd, .nsrc = 1, .ndst = 1 },
   [GEN_OPCODE_FBH] = { .name = fbh, .nsrc = 1, .ndst = 1 },
   [GEN_OPCODE_FBL] = { .name = fbl, .nsrc = 1, .ndst = 1 },
-  [GEN_OPCODE_CBIT] = { .name = cbit, .nsrc = 1, .ndst = 1 },
   [GEN_OPCODE_F16TO32] = { .name = f16to32, .nsrc = 1, .ndst = 1 },
   [GEN_OPCODE_F32TO16] = { .name = f32to16, .nsrc = 1, .ndst = 1 },
 
@@ -143,7 +142,7 @@ static const char *_abs[2] = {
   [1] = (abs),
 };
 
-static const char *vert_stride[16] = {
+static const char *vert_stride_gen7[16] = {
   [0] = 0,
   [1] = 1,
   [2] = 2,
@@ -153,6 +152,15 @@ static const char *vert_stride[16] = {
   [6] = 32,
   [15] = VxH,
 };
+static const char *vert_stride_gen8[16] = {
+  [0] = 0,
+  [1] = 1,
+  [2] = 2,
+  [3] = 4,
+  [4] = 8,
+  [5] = 16,
+  [6] = 32,
+};
 
 static const char *width[8] = {
   [0] = 1,
@@ -234,11 +242,17 @@ static const char *pred_ctrl_align1[16] = {
   [11] = .all16h,
 };
 
-static const char *thread_ctrl[4] = {
+static const char *thread_ctrl_gen7[4] = {
+  [0] = ,
+  [2] = switch
+};
+static const char *thread_ctrl_gen8[4] = {
   [0] = ,
+  [1] = atomic,
   [2] = switch
 };
 
+
 static const char *dep_ctrl[4] = {
   [0] = ,
   [1] = NoDDClr,
@@ -246,11 +260,6 @@ static const char *dep_ctrl[4] = {
   [3] = NoDDClr,NoDDChk,
 };
 
-static const char *mask_ctrl[4] = {
-  [0] = ,
-  [1] = nomask,
-};
-
 static const char *access_mode[2] = {
   [0] = align1,
   [1] = align16,
@@ -351,7 +360,7 @@ static const char *gateway_sub_function[8] = {
   [7] = reserved
 };
 
-static const char *math_function[16] = {
+static const char *math_function_gen7[16] = {
   [GEN_MATH_FUNCTION_INV] = inv,
   [GEN_MATH_FUNCTION_LOG] = log,
   [GEN_MATH_FUNCTION_EXP] = exp,
@@ -365,6 +374,20 @@ static const char *math_function[16] = {
   [GEN_MATH_FUNCTION_INT_DIV_QUOTIENT] = intdiv,
   [GEN_MATH_FUNCTION_INT_DIV_REMAINDER] = intmod,
 };
+static const char *math_function_gen8[16] = {
+  [GEN_MATH_FUNCTION_INV] = inv,
+  [GEN_MATH_FUNCTION_LOG] = log,
+  [GEN_MATH_FUNCTION_EXP] = exp,
+  [GEN_MATH_FUNCTION_SQRT] = sqrt,
+  [GEN_MATH_FUNCTION_RSQ] = rsq,
+  [GEN_MATH_FUNCTION_SIN] = sin,
+  [GEN_MATH_FUNCTION_COS] = cos,
+  [GEN_MATH_FUNCTION_FDIV] = fdiv,
+  [GEN_MATH_FUNCTION_POW] = pow,
+  [GEN_MATH_FUNCTION_INT_DIV_QUOTIENT_AND_REMAINDER] = intdivmod,
+  [GEN8_MATH_FUNCTION_INVM] = invm,
+  [GEN8_MATH_FUNCTION_RSQRTM] = rsqrtm,
+};
 
 static const char *math_saturate[2] = {
   [0] = ,
@@ -452,14 +475,82 @@ static const char *data_port1_data_cache_msg_type[] = {
 
 static int column;
 
-static int string (FILE *file, const char *string)
+static int gen_version;
+
+#define GEN_BITS_FIELD(inst, gen)   \
+  ({\
+int bits;   \
+if (gen_version  80)   \
+  bits = ((const union Gen7NativeInstruction *)inst)-gen;  \
+else\
+  bits = ((const union Gen8NativeInstruction *)inst)-gen;  \
+bits;   \
+  })
+
+#define GEN_BITS_FIELD2(inst, gen7, gen8)   \
+  ({\
+int bits;   \
+if (gen_version  80)   \
+  bits = ((const union Gen7NativeInstruction *)inst)-gen7; \
+else\
+  bits = ((const union Gen8NativeInstruction *)inst)-gen8; \
+bits;   \
+  })
+
+#define PRED_CTRL(inst)GEN_BITS_FIELD(inst, 
header.predicate_control)
+#define PRED_INV(inst) GEN_BITS_FIELD(inst, 
header.predicate_inverse)
+#define FLAG_REG_NR(inst)  GEN_BITS_FIELD2(inst, 
bits2.da1.flag_reg_nr, bits1.da1.flag_reg_nr)
+#define FLAG_SUB_REG_NR(inst)  GEN_BITS_FIELD2(inst, 
bits2.da1.flag_sub_reg_nr, bits1.da1.flag_sub_reg_nr)
+#define ACCESS_MODE(inst)  GEN_BITS_FIELD(inst, header.access_mode)
+#define MASK_CONTROL(inst) GEN_BITS_FIELD2(inst, header.mask_control, 
bits1.da1.mask_control)
+#define 

Re: [Beignet] Which version of mesa that Beignet could compile with?

2014-10-24 Thread Steven Newbury
On Sun, 2014-10-19 at 22:15 +0800, Boxiang Sun wrote:
 Did you set the 'MESA_SOURCE_FOUND' to '1' and the path of mesa 
 source
 directory in Beignet cmake file?
 
 I'm pretty sure that the current master could not compiles with Mesa 
 10.3.x
 if without some modifications.
 
 Regards,
 Sun
 
 2014-10-19 3:13 GMT+08:00 Yichao Yu yyc1...@gmail.com:
 
  On Sat, Oct 18, 2014 at 11:44 AM, Boxiang Sun daetalu...@gmail.com
  wrote:
   Hi,
   
   I know the Beignet could not been build with current version of
  
  Really? Both the current master and the 0.9.3 release compiles with
  mesa 10.3.1 just fine here.
  
   mesa(10.3.x5). But I tired the 9.2.5. It still could not been 
   build.
   
   So which version of mesa that the current Beignet could been 
   build? I
  mean
   could build successfully without any modification.
   
I've not tested recently, but the Beignet Mesa EGL extension has been 
broken for a long time.  I looked into fixing it, but it's a big job, 
and really needs to be worked on by an experienced Mesa developer as 
it requires support from the Mesa side.  The (currently?) broken 
version was a bit hacky and depended on Mesa functionality that has 
since been removed.

___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [PATCH] Fix AUX buffer for really page aligned

2014-10-24 Thread Zhenyu Wang
On 2014.10.22 08:49:53 +, Guo, Yejun wrote:
 three comments in line, thanks.
 

Thanks for review this.

gpgpu-aux_offset.surface_heap_offset = size_aux;
size_aux += sizeof(surface_heap_t);
  
 @@ -784,7 +782,10 @@ intel_gpgpu_state_init(intel_gpgpu_t *gpgpu,
gpgpu-aux_offset.sampler_border_color_state_offset = size_aux;
size_aux += GEN_MAX_SAMPLERS * sizeof(gen7_sampler_border_color_t);
 
 [Yejun] aux buffer contains several parts, each part has its own alignment 
 requirement, so we can see several 'ALIGN'  in above code. The alignment is 
 handled for each part, it is not feasible to move one/all of them to the last.
 

The goal is to make sure final aux buffer size is page aligned after
adding up all required aligned space size. My patch only counts on
final size.

  
 -  bo = dri_bo_alloc(gpgpu-drv-bufmgr, AUX_BUFFER, size_aux, 0);
 +  //surface heap must be 4096 bytes aligned because state base address 
 + use 20bit for the address  size_aux = ALIGN(size_aux, 4096);
 +
 +  bo = dri_bo_alloc(gpgpu-drv-bufmgr, AUX_BUFFER, size_aux, 4096);
 
 [Yejun] the last parameter '4096' is to control the size of the whole buffer 
 (to be page aligned), not to meet the align requirement of the base address, 
 and it is not explicitly required and  is ignored in function 
 drm_intel_gem_bo_alloc_internal. 
 

It's for 'alignment' parameter. Yes current libdrm will optimize it as to be
page aligned, but as your comment above it's a good note to say that we need
aux buffer allocation to be page aligned, like for other objects, printf buf,
timestamp, etc.

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827


signature.asc
Description: Digital signature
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] [PATCH v2 1/5] Make use of write enable flag for mem bo map

2014-10-24 Thread Yang, Rong R
This patchset LGTM, thanks.

 -Original Message-
 From: Beignet [mailto:beignet-boun...@lists.freedesktop.org] On Behalf Of
 Zhenyu Wang
 Sent: Thursday, October 23, 2014 15:19
 To: beignet@lists.freedesktop.org
 Subject: [Beignet] [PATCH v2 1/5] Make use of write enable flag for mem bo
 map
 
 Use drm/intel optimization for mem bo mapping in case of read or write.
 So we could be possibly waiting less.
 
 This also adds 'map_flags' check in
 clEnqueueMapBuffer/clEnqueueMapImage
 for actual read or write mapping.
 
 But currently leave clMapBufferIntel untouched which might break ABI/API.
 
 v2: Fix write_map flag in clEnqueueMapBuffer/clEnqueueMapImage.
 
 Signed-off-by: Zhenyu Wang zhen...@linux.intel.com
 ---
  src/cl_api.c   |  6 +-
  src/cl_command_queue.c |  2 +-
  src/cl_enqueue.c   | 18 +-
  src/cl_enqueue.h   |  1 +
  src/cl_mem.c   | 18 +-
  src/cl_mem.h   |  4 ++--
  6 files changed, 27 insertions(+), 22 deletions(-)
 
 diff --git a/src/cl_api.c b/src/cl_api.c index 8a2e999..05d3093 100644
 --- a/src/cl_api.c
 +++ b/src/cl_api.c
 @@ -2653,6 +2653,8 @@ clEnqueueMapBuffer(cl_command_queue
 command_queue,
data-size= size;
data-ptr = ptr;
data-unsync_map  = 1;
 +  if (map_flags  (CL_MAP_WRITE | CL_MAP_WRITE_INVALIDATE_REGION))
 +data-write_map = 1;
 
if(handle_events(command_queue, num_events_in_wait_list,
 event_wait_list,
 event, data, CL_COMMAND_MAP_BUFFER) ==
 CL_ENQUEUE_EXECUTE_IMM) {
 @@ -2735,6 +2737,8 @@ clEnqueueMapImage(cl_command_queue
 command_queue,
data-region[0]   = region[0];  data-region[1] = region[1];  
 data-region[2]
 = region[2];
data-ptr = ptr;
data-unsync_map  = 1;
 +  if (map_flags  (CL_MAP_WRITE | CL_MAP_WRITE_INVALIDATE_REGION))
 +data-write_map = 1;
 
if(handle_events(command_queue, num_events_in_wait_list,
 event_wait_list,
 event, data, CL_COMMAND_MAP_IMAGE) ==
 CL_ENQUEUE_EXECUTE_IMM) { @@ -3203,7 +3207,7 @@
 clMapBufferIntel(cl_mem mem, cl_int *errcode_ret)
void *ptr = NULL;
cl_int err = CL_SUCCESS;
CHECK_MEM (mem);
 -  ptr = cl_mem_map(mem);
 +  ptr = cl_mem_map(mem, 1);
  error:
if (errcode_ret)
  *errcode_ret = err;
 diff --git a/src/cl_command_queue.c b/src/cl_command_queue.c index
 48deba0..d07774f 100644
 --- a/src/cl_command_queue.c
 +++ b/src/cl_command_queue.c
 @@ -336,7 +336,7 @@ cl_fulsim_read_all_surfaces(cl_command_queue
 queue, cl_kernel k)
  assert(mem-bo);
  chunk_n = cl_buffer_get_size(mem-bo) / chunk_sz;
  chunk_remainder = cl_buffer_get_size(mem-bo) % chunk_sz;
 -to = cl_mem_map(mem);
 +to = cl_mem_map(mem, 1);
  for (j = 0; j  chunk_n; ++j) {
char name[256];
sprintf(name, dump%03i.bmp, curr); diff --git a/src/cl_enqueue.c
 b/src/cl_enqueue.c index af118ad..2e43122 100644
 --- a/src/cl_enqueue.c
 +++ b/src/cl_enqueue.c
 @@ -38,7 +38,7 @@ cl_int cl_enqueue_read_buffer(enqueue_data* data)
void* src_ptr;
struct _cl_mem_buffer* buffer = (struct _cl_mem_buffer*)mem;
 
 -  if (!(src_ptr = cl_mem_map_auto(data-mem_obj))) {
 +  if (!(src_ptr = cl_mem_map_auto(data-mem_obj, 0))) {
  err = CL_MAP_FAILURE;
  goto error;
}
 @@ -66,7 +66,7 @@ cl_int cl_enqueue_read_buffer_rect(enqueue_data*
 data)
   mem-type == CL_MEM_SUBBUFFER_TYPE);
struct _cl_mem_buffer* buffer = (struct _cl_mem_buffer*)mem;
 
 -  if (!(src_ptr = cl_mem_map_auto(mem))) {
 +  if (!(src_ptr = cl_mem_map_auto(mem, 0))) {
  err = CL_MAP_FAILURE;
  goto error;
}
 @@ -112,7 +112,7 @@ cl_int cl_enqueue_write_buffer(enqueue_data
 *data)
struct _cl_mem_buffer* buffer = (struct _cl_mem_buffer*)mem;
void* dst_ptr;
 
 -  if (!(dst_ptr = cl_mem_map_auto(data-mem_obj))) {
 +  if (!(dst_ptr = cl_mem_map_auto(data-mem_obj, 1))) {
  err = CL_MAP_FAILURE;
  goto error;
}
 @@ -140,7 +140,7 @@ cl_int cl_enqueue_write_buffer_rect(enqueue_data
 *data)
   mem-type == CL_MEM_SUBBUFFER_TYPE);
struct _cl_mem_buffer* buffer = (struct _cl_mem_buffer*)mem;
 
 -  if (!(dst_ptr = cl_mem_map_auto(mem))) {
 +  if (!(dst_ptr = cl_mem_map_auto(mem, 1))) {
  err = CL_MAP_FAILURE;
  goto error;
}
 @@ -188,7 +188,7 @@ cl_int cl_enqueue_read_image(enqueue_data *data)
const size_t* origin = data-origin;
const size_t* region = data-region;
 
 -  if (!(src_ptr = cl_mem_map_auto(mem))) {
 +  if (!(src_ptr = cl_mem_map_auto(mem, 0))) {
  err = CL_MAP_FAILURE;
  goto error;
}
 @@ -231,7 +231,7 @@ cl_int cl_enqueue_write_image(enqueue_data *data)
cl_mem mem = data-mem_obj;
CHECK_IMAGE(mem, image);
 
 -  if (!(dst_ptr = cl_mem_map_auto(mem))) {
 +  if (!(dst_ptr = cl_mem_map_auto(mem, 1))) {
  err = CL_MAP_FAILURE;
  goto error;
}
 @@ -260,7 +260,7 @@ cl_int cl_enqueue_map_buffer(enqueue_data *data)
  //because using unsync map in clEnqueueMapBuffer, so force use
 map_gtt 

Re: [Beignet] [PATCH 3/5] Remove intel_gpgpu_check_binded_buf_address()

2014-10-24 Thread Zhigang Gong
This assertion is just to make sure we will not get a NULL pointer
for a normal buffer. The OpenCL spec doesn't give a very specific
statement about a NULL buffer object. But it does allow to pass
a NULL to a buffer object. Thus some one may implement the following
kernel:

__kernel foo( global uint * input, global uint *output1, global uint *output2)
{
  ...
  if (output1)
output1[get_global_id(0)] = result0;
  if (output2)
output2[get_global_id(1)] = result1;
}

If we pass in a NULL output1 which should be a normal allocated buffer,
it breaks the above code.

Without PPGTT, this assertion works fine till now. If with PPGTT,
we may hit this assertion, but we can't just simply remove this assertion.

Do you have good suggestion to always avoid allocate 0 offset for a
valid buffer object?

All the other patches in this series LGTM, I will push them latter.
Let's defer this one to next week.

Thanks,
Zhigang Gong.

On Thu, Oct 23, 2014 at 03:19:24PM +0800, Zhenyu Wang wrote:
 On recent kernel with full PPGTT support, we can possibly bind buffer
 offset with 0, but intel_gpgpu_check_binded_buf_address() always thinks
 it's invalid, which is not true. So simply remove the check.
 
 Signed-off-by: Zhenyu Wang zhen...@linux.intel.com
 ---
  src/intel/intel_gpgpu.c | 9 -
  1 file changed, 9 deletions(-)
 
 diff --git a/src/intel/intel_gpgpu.c b/src/intel/intel_gpgpu.c
 index 6cd73d6..b7958d5 100644
 --- a/src/intel/intel_gpgpu.c
 +++ b/src/intel/intel_gpgpu.c
 @@ -694,14 +694,6 @@ intel_gpgpu_batch_reset(intel_gpgpu_t *gpgpu, size_t sz)
  {
return intel_batchbuffer_reset(gpgpu-batch, sz);
  }
 -/* check we do not get a 0 starting address for binded buf */
 -static void
 -intel_gpgpu_check_binded_buf_address(intel_gpgpu_t *gpgpu)
 -{
 -  uint32_t i;
 -  for (i = 0; i  gpgpu-binded_n; ++i)
 -assert(gpgpu-binded_buf[i]-offset != 0);
 -}
  
  static void
  intel_gpgpu_flush_batch_buffer(intel_batchbuffer_t *batch)
 @@ -717,7 +709,6 @@ intel_gpgpu_flush(intel_gpgpu_t *gpgpu)
if (!gpgpu-batch || !gpgpu-batch-buffer)
  return;
intel_gpgpu_flush_batch_buffer(gpgpu-batch);
 -  intel_gpgpu_check_binded_buf_address(gpgpu);
  }
  
  static int
 -- 
 2.1.1
 
 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] Performances of beignet ... macbook pro 13

2014-10-24 Thread Zhigang Gong
Jérôme,

Thanks for sharing the very interesting performance data.
We will look into the large image issue.

Thanks,
Zhigang Gong.
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] johntheripper/OpenCL clGetEventProfilingInfo issue

2014-10-24 Thread Zhigang Gong
This should be an application bug, according to OpenCL 1.2 spec:

  CL_PROFILING_INFO_NOT_AVAILABLE if the CL_QUEUE_PROFILING_ENABLE
  flag is not set for the command-queue, if the execution status of
the command identified
  by event is not CL_COMPLETE or if event is a user event object.

To make sure an event's state to be CL_COMPLETE, you need to call
clWaitForEvents()
rather than clFinish().

According to spec, clFinish() is used to :
  blocks until all previously queued OpenCL commands in command_queue are issued
  to the associated device and have completed.

It is not to update all the related event's state. And it is too
heavy, as it will wait for the command
to be completed. The event's CL_COMPLETE state means the command has
been flushed into
the GPU's command buffer and may haven't completed. It's used to do
GPU command queue
side synchronization. clFinish() is to synchronize with host CPU.

I would recommend you to call clWaitForEvents before you call the
clGetEventProfilingInfo().
If you still met problems with that change, please let us know.

Thanks,
Zhigang Gong.


On Sat, Oct 25, 2014 at 6:13 AM, Oleksii Shevchuk
public.ava...@gmail.com wrote:
 Hi list!

 I'm trying to use beignet (67650189c145a65addffdcd4d8ff452709bd4149)
 with johntheripper (ceb8aa1899817965afd26ffb651932e475a7bbc7 at
 https://github.com/magnumripper/JohnTheRipper) on i7/HD4000.
 Test suite reports that tests are passed:

 .

 compiler_abs_diff_int()Instruction (#42) src too large pooloffset 11
 Instruction (#20) src too large pooloffset 11
 [SUCCESS]

 .


 [SUCCESS]
 double_precision_check()
   - WARN: GPU doesn't have correct double precision. Got 9.995699E-05, 
 expected 0.000101

 summary:
 --
   total: 421
   run: 421
   pass: 419
   fail: 0
   pass rate: 1.00

 But with john I ran into the next problem:

  john -format=rar-opencl -te
 Will run 4 OpenMP threads
 Device 0: Intel(R) HD Graphics IvyBridge M GT2
 OpenCL error (CL_PROFILING_INFO_NOT_AVAILABLE) in file (common-opencl.c) at 
 line (1210) - (Failed in clGetEventProfilingInfo I)

 Here is this place:
 https://github.com/magnumripper/JohnTheRipper/blob/bleeding-jumbo/src/common-opencl.c#L1205

 According to the cl_api.c, in the clGetEventProfilingInfo, status is not
 equals to CL_COMPLETE, and all other conditions were ok.

 I add this to cl_api.c / clGetEventProfilingInfo
 fprintf(stderr, ER: %d? %d? %d?\n,.
   event-type == CL_COMMAND_USER, !(event-queue-props  
 CL_QUEUE_PROFILING_ENABLE),.
   event-status != CL_COMPLETE);

 And result was:
 LD_PRELOAD=/home/avatar/Software/beignet-build/src/libcl.so ../run/john 
 -format=encfs-opencl -te
 Will run 4 OpenMP threads
 Device 0: Intel(R) HD Graphics IvyBridge M GT2
 ER: 0? 0? 1?
 OpenCL error (CL_PROFILING_INFO_NOT_AVAILABLE) in file (common-opencl.c) at 
 line (1211) - (Failed in clGetEventProfilingInfo I)

 According to the documentation, before requesting
 clGetEventProfilingInfo, the queue should be created with
 CL_QUEUE_PROFILING_ENABLE. So I add clFinish after all queues creation
 with this flag. Result was the same.

 So, the question is, are there problems in the beignet implementation,
 or in their code?

 Thanks.

 // wbr
 // alxchk
 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet
___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet


Re: [Beignet] beignet not working with Wolfram Mathematica

2014-10-24 Thread Zhigang Gong
As you are using 3.16.x kernel, the most possible solution is in the
README.md's known issue section:

Almost all unit tests fail on Linux kernel 3.15/3.16. There is a known
issue in some versions of linux kernel which enable register whitelist
feature but miss some necessary registers which are required for
beignet. The problematic version are around 3.15 and 3.16 which have
commit f0a346b... but haven't commit c9224f... If it is the case, you
can apply c9224f... manually and rebuild the kernel or just disable
the parse command by invoke the following command (use Ubuntu as an
example):
# echo 0  /sys/module/i915/parameters/enable_cmd_parser

And I strongly recommend you to use the latest git master version
beignet. If you still have problem, please feel free to report here.
Thanks.

On Sun, Oct 5, 2014 at 2:22 AM, Lorenzo Pistone blaffabla...@gmail.com wrote:
 Hello,
 I'm trying to run beignet from Wolfram Mathematica 10.0.1, I have beignet
 0.9.2, Fedora 20 x86_64, kernel 3.16.3, i5-3230M.
 First, there is an issue which is probably to be blamed on Mathematica (you
 need to manually load /usr/lib64/beignet/libcl.so with LibraryLoad), I
 suppose that Mathematica doesn't know about the icd system.

 Anyway, after that, this test program

 src = __kernel void myKernel( __global mint * global0Id, __global mint *
 global1Id, mint width, mint height) {
   int xIndex = get_global_id(0);
   int yIndex = get_global_id(1);
   int index = xIndex + yIndex*width;
   if (xIndex  width  yIndex  height) {
  global0Id[index] = get_local_id(0);
  global1Id[index] = get_local_id(1);
   }
 };


 which is suggested in the examples  of the Mathematica implementation
 (http://reference.wolfram.com/language/OpenCLLink/tutorial/Programming.html)
 produces all zeroes. If instead the pocl driver is loaded, the kernel is
 executed correctly. Also I could run the LuxMark benchmarks (even though on
 some tests I see yellow spots that I believe are glitches).

 I am sorry if the question is a bit vague, but I don't know much about
 OpenCL and I indeed wanted to start learning while using Mathematica, which
 is a tool that I already need for other reasons.

 Lorenzo

 ___
 Beignet mailing list
 Beignet@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/beignet

___
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet