Thanks for your comment, I'll send out a new version to fix this error.

-----Original Message-----
From: Zhigang Gong [mailto:zhigang.g...@linux.intel.com]
Sent: Friday, February 13, 2015 10:18
To: Weng, Chuanbo
Cc: beignet@lists.freedesktop.org
Subject: Re: [Beignet] [PATCH] Optimization of clEnqueueCopyImageToBuffer for 16 aligned case.

On Fri, Feb 13, 2015 at 12:37:03AM +0800, Chuanbo Weng wrote:
> We can change the image_channel_order to CL_RGBA and
> image_channel_data_type to CL_UNSIGNED_INT32 for some special case,
> thus 16 bytes can be read by one work item. Bandwidth is fully used.
>
> Signed-off-by: Chuanbo Weng <chuanbo.w...@intel.com>
> ---
>  src/CMakeLists.txt | 2 +-
>  src/cl_context.h | 1 +
>  src/cl_mem.c | 44 ++++++++++++++++++----
>  .../cl_internal_copy_image_2d_to_buffer_align16.cl | 19 ++++++++++
>  4 files changed, 57 insertions(+), 9 deletions(-)
>  create mode 100644 src/kernels/cl_internal_copy_image_2d_to_buffer_align16.cl
>
> diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
> index 939f58d..d4181d8 100644
> --- a/src/CMakeLists.txt
> +++ b/src/CMakeLists.txt
> @@ -49,7 +49,7 @@ cl_internal_copy_image_3d_to_2d
> cl_internal_copy_image_2d_to_3d cl_internal_copy
> cl_internal_copy_image_2d_to_2d_array
> cl_internal_copy_image_1d_array_to_1d_array
> cl_internal_copy_image_2d_array_to_2d_array
> cl_internal_copy_image_2d_array_to_2d
> cl_internal_copy_image_2d_array_to_3d
> cl_internal_copy_image_3d_to_2d_array
> -cl_internal_copy_image_2d_to_buffer
> cl_internal_copy_image_3d_to_buffer
> +cl_internal_copy_image_2d_to_buffer
> +cl_internal_copy_image_2d_to_buffer_align16
> +cl_internal_copy_image_3d_to_buffer
> cl_internal_copy_buffer_to_image_2d
> cl_internal_copy_buffer_to_image_3d
> cl_internal_fill_buf_align8 cl_internal_fill_buf_align4
> cl_internal_fill_buf_align2 cl_internal_fill_buf_unalign
> diff --git a/src/cl_context.h b/src/cl_context.h
> index 2ea0a73..fdbfd2a 100644
> --- a/src/cl_context.h
> +++ b/src/cl_context.h
> @@ -60,6 +60,7 @@ enum _cl_internal_ker_type {
>    CL_ENQUEUE_COPY_IMAGE_2D_ARRAY_TO_3D, //copy image 2d array to image 3d
>    CL_ENQUEUE_COPY_IMAGE_3D_TO_2D_ARRAY, //copy image 3d to image 2d array
>    CL_ENQUEUE_COPY_IMAGE_2D_TO_BUFFER, //copy image 2d to buffer
> +  CL_ENQUEUE_COPY_IMAGE_2D_TO_BUFFER_ALIGN16,
>    CL_ENQUEUE_COPY_IMAGE_3D_TO_BUFFER, //copy image 3d tobuffer
>    CL_ENQUEUE_COPY_BUFFER_TO_IMAGE_2D, //copy buffer to image 2d
>    CL_ENQUEUE_COPY_BUFFER_TO_IMAGE_3D, //copy buffer to image 3d
> diff --git a/src/cl_mem.c b/src/cl_mem.c
> index e58a183..e9cf539 100644
> --- a/src/cl_mem.c
> +++ b/src/cl_mem.c
> @@ -1714,6 +1714,10 @@ cl_mem_copy_image_to_buffer(cl_command_queue queue, struct _cl_mem_image* image,
>    uint32_t intel_fmt, bpp;
>    cl_image_format fmt;
>    size_t origin0, region0;
> +  size_t kn_dst_offset;
> +  int align16 = 0;
> +  size_t align_size = 1;
> +  size_t w_saved;
>
>    if(region[1] == 1) local_sz[1] = 1;
>    if(region[2] == 1) local_sz[2] = 1;
> @@ -1724,24 +1728,48 @@ cl_mem_copy_image_to_buffer(cl_command_queue queue, struct _cl_mem_image* image,
>    /* We use one kernel to copy the data. The kernel is lazily created. */
>    assert(image->base.ctx == buffer->ctx);
>
> -  fmt.image_channel_order = CL_R;
> -  fmt.image_channel_data_type = CL_UNSIGNED_INT8;
>    intel_fmt = image->intel_fmt;
>    bpp = image->bpp;
> -  image->intel_fmt = cl_image_get_intel_format(&fmt);
> -  image->w = image->w * image->bpp;
> -  image->bpp = 1;
> +  w_saved = image->w;
>    region0 = region[0] * bpp;
> -  origin0 = src_origin[0] * bpp;
> +  kn_dst_offset = dst_offset;
> +  if( ((image->w * image->bpp) % 16 == 0) && ((src_origin[0] * bpp) % 16 == 0) &&
> +      (region0 % 16 == 0) && (dst_offset % 16 == 0) ){
> +    fmt.image_channel_order = CL_RGBA;
> +    fmt.image_channel_data_type = CL_UNSIGNED_INT32;
> +    align16 = 1;
> +    align_size = 16;

The above logic will break the IMAGE3D code path, as you haven't made the
corresponding change in the normal kernel cl_internal_copy_image_3d_to_buffer.cl.

_______________________________________________
Beignet mailing list
Beignet@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/beignet
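
To make the review comment concrete, here is a minimal sketch of the 16-byte alignment test from the quoted hunk, with one possible way to keep the 3D path on the byte-wise kernel. This is an illustration only, not the actual fix: the helper name copy_bytes_per_item and the is_2d flag are hypothetical and do not exist in Beignet, and restricting the fast path to 2D images is an assumption about how the issue might be addressed (the alternative would be to update cl_internal_copy_image_3d_to_buffer.cl to match).

```c
#include <stddef.h>

/* Hypothetical helper, not part of Beignet. Returns how many bytes one
 * work item may read: 16 when the copy qualifies for the RGBA/UINT32
 * reinterpretation, otherwise 1 (the original CL_R/UINT8 path). */
static size_t copy_bytes_per_item(size_t width_px, size_t bpp,
                                  size_t src_origin_x_px, size_t region_x_px,
                                  size_t dst_offset, int is_2d)
{
    size_t row_bytes    = width_px * bpp;        /* image->w * image->bpp */
    size_t origin_bytes = src_origin_x_px * bpp; /* src_origin[0] * bpp   */
    size_t region_bytes = region_x_px * bpp;     /* region[0] * bpp       */

    /* Assumption: only the 2D kernel has an align16 variant, so 3D
     * copies must stay on the byte-wise path. */
    if (is_2d &&
        row_bytes % 16 == 0 && origin_bytes % 16 == 0 &&
        region_bytes % 16 == 0 && dst_offset % 16 == 0)
        return 16;

    return 1;
}
```

With this shape, the caller would divide the width, origin, region and destination offset by the returned value only when it is 16, and would leave the 3D kernel arguments in bytes as before.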