Sorry for the late since I'm on travel. But I can also trigger this bug ......
Thanks --Sun, Yi > -----Original Message----- > From: [email protected] > [mailto:[email protected]] On Behalf Of > Xing, Homer > Sent: Wednesday, September 25, 2013 10:34 PM > To: Zhigang Gong; Yang, Rong R > Cc: [email protected] > Subject: Re: [Beignet] [PATCH] Remove global offset need divide by local size > restriction. > > I can take this bug. But currently I don't have enough information. > > Zhigang, do you know more details about how to find dependency bug in code > path? > > -----Original Message----- > From: [email protected] > [mailto:[email protected]] On > Behalf Of Zhigang Gong > Sent: Thursday, September 26, 2013 10:43 AM > To: Yang, Rong R > Cc: [email protected] > Subject: Re: [Beignet] [PATCH] Remove global offset need divide by local size > restriction. > > Hi guys, > > I just pushed this patch, and it causes a regression as below: > > compiler_abs_diff_long16: > compiler_abs_diff_long16() [FAILED] > Error: !memcmp(buf_data[2], cpu_diff, sizeof(T) * n) > at file /home/gongzg/git/fdo/beignet/utests/compiler_abs_diff.cpp, function > compiler_abs_diff_with_type, line 177 > > I took a quick look at it and found the root cause of this regression is not > in this > patch. As when I disable the OCL_PRE_ALLOC_INSN, it works fine. > For me, the most possible root cause is there is a dependency bug within the > abs_diff_long code path. > > I will take vacation from tomorrow and will back to work at 8/10. If anybody > in > the list can take this bug, I will appreciate. Thanks > > - Zhigang > > On Mon, Sep 23, 2013 at 02:04:08PM +0800, Yang Rong wrote: > > Set to global offset to 0 in walker, and add global offset when > > get_global_id. > > > > Signed-off-by: Yang Rong <[email protected]> > > --- > > backend/src/ocl_stdlib.tmpl.h | 2 +- > > src/cl_api.c | 7 ------- > > src/intel/intel_gpgpu.c | 6 +++--- > > 3 files changed, 4 insertions(+), 11 deletions(-) > > > > diff --git a/backend/src/ocl_stdlib.tmpl.h > > b/backend/src/ocl_stdlib.tmpl.h index 9b76ba1..4921ee4 100644 > > --- a/backend/src/ocl_stdlib.tmpl.h > > +++ b/backend/src/ocl_stdlib.tmpl.h > > @@ -588,7 +588,7 @@ DECL_PUBLIC_WORK_ITEM_FN(get_num_groups, 1) > #undef > > DECL_PUBLIC_WORK_ITEM_FN > > > > INLINE uint get_global_id(uint dim) { > > - return get_local_id(dim) + get_local_size(dim) * get_group_id(dim); > > + return get_local_id(dim) + get_local_size(dim) * get_group_id(dim) > > + + get_global_offset(dim); > > } > > > > > > ////////////////////////////////////////////////////////////////////// > > /////// diff --git a/src/cl_api.c b/src/cl_api.c index > > b60160b..c19b80a 100644 > > --- a/src/cl_api.c > > +++ b/src/cl_api.c > > @@ -2264,19 +2264,12 @@ clEnqueueNDRangeKernel(cl_command_queue > command_queue, > > goto error; > > } > > > > - /* Check offset values. We add a non standard restriction. The offsets > must > > - * also be evenly divided by the local sizes > > - */ > > if (global_work_offset != NULL) > > for (i = 0; i < work_dim; ++i) { > > if (UNLIKELY(~0LL - global_work_offset[i] > global_work_size[i])) { > > err = CL_INVALID_GLOBAL_OFFSET; > > goto error; > > } > > - if (UNLIKELY(local_work_size != NULL && global_work_offset[i] % > local_work_size[i])) { > > - err = CL_INVALID_GLOBAL_OFFSET; > > - goto error; > > - } > > } > > > > /* Local sizes must be non-null and divide global sizes */ diff > > --git a/src/intel/intel_gpgpu.c b/src/intel/intel_gpgpu.c index > > 7b82b76..44f44ef 100644 > > --- a/src/intel/intel_gpgpu.c > > +++ b/src/intel/intel_gpgpu.c > > @@ -886,11 +886,11 @@ intel_gpgpu_walker(intel_gpgpu_t *gpgpu, > > OUT_BATCH(gpgpu->batch, (1 << 30) | (thread_n-1)); /* SIMD16 | > thread max */ > > else > > OUT_BATCH(gpgpu->batch, (0 << 30) | (thread_n-1)); /* SIMD8 | > > thread max */ > > - OUT_BATCH(gpgpu->batch, global_wk_off[0]); > > + OUT_BATCH(gpgpu->batch, 0); > > OUT_BATCH(gpgpu->batch, global_wk_dim[0]); > > - OUT_BATCH(gpgpu->batch, global_wk_off[1]); > > + OUT_BATCH(gpgpu->batch, 0); > > OUT_BATCH(gpgpu->batch, global_wk_dim[1]); > > - OUT_BATCH(gpgpu->batch, global_wk_off[2]); > > + OUT_BATCH(gpgpu->batch, 0); > > OUT_BATCH(gpgpu->batch, global_wk_dim[2]); > > OUT_BATCH(gpgpu->batch, right_mask); > > OUT_BATCH(gpgpu->batch, ~0x0); /* we > always set height as 1, so set bottom mask as all 1*/ > > -- > > 1.8.1.2 > > > > _______________________________________________ > > Beignet mailing list > > [email protected] > > http://lists.freedesktop.org/mailman/listinfo/beignet > _______________________________________________ > Beignet mailing list > [email protected] > http://lists.freedesktop.org/mailman/listinfo/beignet > _______________________________________________ > Beignet mailing list > [email protected] > http://lists.freedesktop.org/mailman/listinfo/beignet _______________________________________________ Beignet mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/beignet
