Re: [ViennaCL-devel] Custom OpenCL kernel confusion

Charles Determan Fri, 10 Jun 2016 11:59:56 -0700

Karl,

I believe I figured it out. Your comment about the global sizes made me
realize that the defaults don't account for a second dimension.  Once I
set the second dimension explicitly, I was able to get the kernel to work
properly.  Thank you for listening and directing me to different points to
check.
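
For anyone hitting the same problem, here is a minimal sketch in plain C of
the sizing arithmetic for a 2D launch. It assumes OpenCL 1.x rules, where
each global size must be a multiple of the matching local size, so the
global range can be larger than the matrix and the kernel needs a bounds
check. The names round_up, work_sizes_2d and make_work_sizes_2d are
hypothetical helpers for illustration, not part of ViennaCL.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of `multiple`. */
size_t round_up(size_t n, size_t multiple)
{
    assert(multiple > 0);
    return ((n + multiple - 1) / multiple) * multiple;
}

/* Hypothetical container for a 2D launch configuration. */
typedef struct {
    size_t global[2];
    size_t local[2];
} work_sizes_2d;

/* One work-item per element of a rows-by-cols result, using square
   work-groups of local_edge x local_edge work-items. */
work_sizes_2d make_work_sizes_2d(size_t rows, size_t cols, size_t local_edge)
{
    work_sizes_2d ws;
    ws.local[0]  = local_edge;
    ws.local[1]  = local_edge;
    ws.global[0] = round_up(rows, local_edge);  /* dimension 0: rows    */
    ws.global[1] = round_up(cols, local_edge);  /* dimension 1: columns */
    return ws;
}
```

With ViennaCL, values like these would be passed to the kernel object's
global_work_size(index, value) and local_work_size(index, value) setters
for both index 0 and index 1 before enqueueing the kernel.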


Regards,
Charles

On Fri, Jun 10, 2016 at 7:42 AM, Charles Determan <cdeterma...@gmail.com>
wrote:

> I neglected one further question you had:
>
> "Which local and global work sizes do you use?"
>
> I am not setting any local/global work sizes as I thought the defaults
> specified by ViennaCL were supposed to be sufficient as noted in the
> documentation (
> http://viennacl.sourceforge.net/doc/manual-custom-kernels.html) - 'The
> default work sizes suffice for most cases'.
>
> Regards,
> Charles
>
> On Fri, Jun 10, 2016 at 7:35 AM, Charles Determan <cdeterma...@gmail.com>
> wrote:
>
>> Karl,
>>
>> I am trying to adapt from a previous kernel I knew worked on an unpadded
>> matrix.
>>
>> __kernel void iMatMult(const int Mdim, const int Pdim,
>>                        __global const int *A, __global const int *B,
>>                        __global int *C) {
>>
>>     int k;
>>
>>     // Get the index of the elements to be processed
>>     const int globalRow = get_global_id(0); // C Row ID
>>     const int globalCol = get_global_id(1); // C Col ID
>>     int tmp = 0;
>>
>>     // Do the operation
>>     for(k=0; k < Pdim; k++){
>>         tmp += A[k*Mdim+globalRow] * B[globalCol*Pdim+k];
>>     }
>>     C[globalCol*Mdim+globalRow] = tmp;
>> }
>>
>> So when you ask - "where is the third dimension? Are you assuming C to be
>> M-by-M?"
>>
>> I haven't passed a third dimension as Mdim is the number of columns and
>> Pdim is the number of rows in matrix 'A'.
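
The kernel above takes its third dimension implicitly from the extent of
get_global_id(1). A host-side C reference makes that explicit. This is only
a sketch: it reads the kernel's indexing as column-major storage (an
assumption, since the thread does not state the storage order), and Ndim is
a hypothetical name for the size of global dimension 1.

```c
#include <stddef.h>

/* CPU reference for the unpadded kernel, using the same index expressions.
   Read as column-major: A is Mdim x Pdim, B is Pdim x Ndim, C is Mdim x Ndim.
   Ndim is hypothetical: in the kernel it is the size of global dimension 1. */
void imatmult_ref(int Mdim, int Pdim, int Ndim,
                  const int *A, const int *B, int *C)
{
    for (int globalCol = 0; globalCol < Ndim; ++globalCol) {      /* get_global_id(1) */
        for (int globalRow = 0; globalRow < Mdim; ++globalRow) {  /* get_global_id(0) */
            int tmp = 0;
            for (int k = 0; k < Pdim; ++k)
                tmp += A[k * Mdim + globalRow] * B[globalCol * Pdim + k];
            C[globalCol * Mdim + globalRow] = tmp;
        }
    }
}
```

If only dimension 0 of the global range is set, the second loop effectively
runs once, so at most one column of C is written.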
>>
>> "Which values do you pass to the kernel? Which local and global work sizes
>> do you use?"
>>
>> Right now I am passing Mdim, Pdim, MdimPad (padded number of columns),
>> PdimPad (padded number of rows), and three matrices.
>>
>> "I'm confused with your use of MdimPad and PdimPad here. As currently
>> written, A has Mdim columns, and B has Pdim columns. But this doesn't agree
>> with the if-check above, where C is assumed Mdim-by-Mdim."
>>
>> I am using MdimPad and PdimPad to index the matrix elements because the
>> matrices are padded (handling padding is new to me when writing OpenCL
>> kernels).  C is intended to be square, but I can't even get it to work
>> with a square matrix.  For that line, I actually intended to have:
>>
>> if (globalRow > MdimPad || globalCol > PdimPad)
>>         return;
>>
>> but that still doesn't fix the problem for me.
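
A note on the guard itself: work-item IDs are zero-based, so a `>`
comparison still admits an index equal to the bound, and comparing against
the padded size admits rows and columns that hold only padding. Below is a
minimal host-side C sketch of a guarded work-item for the square case. It
assumes row-major storage with a padded row stride; the names
imatmult_item, imatmult_launch, n, ld and global are hypothetical.

```c
#include <stddef.h>

/* Emulates one work-item of a guarded integer matrix product.
   Square n x n matrices, row-major, with padded row stride ld >= n. */
void imatmult_item(int row, int col, int n, int ld,
                   const int *A, const int *B, int *C)
{
    /* IDs are zero-based, so the first invalid index is n itself:
       the guard uses >= and compares against n rather than ld. */
    if (row >= n || col >= n)
        return;

    int tmp = 0;
    for (int k = 0; k < n; ++k)
        tmp += A[row * ld + k] * B[k * ld + col];
    C[row * ld + col] = tmp;
}

/* Emulates a 2D launch whose global range (global x global) exceeds n. */
void imatmult_launch(int global, int n, int ld,
                     const int *A, const int *B, int *C)
{
    for (int row = 0; row < global; ++row)
        for (int col = 0; col < global; ++col)
            imatmult_item(row, col, n, ld, A, B, C);
}
```

The padding entries of C are left untouched, which makes it easy to check
that no work-item wrote outside the logical n x n block.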
>>
>> "The last line assumes C to be M-by-M. Is this the case?"
>>
>> Again, I am trying to base this on the previous kernel, which I thought
>> worked for non-square matrices, but I could very well be mistaken.  The
>> entire goal here is just to get a basic working integer gemm kernel for
>> square or rectangular matrices.  I really didn't think it would be
>> difficult, but I think I have fallen down a rabbit hole at this point and
>> am likely just confusing myself.
>>
>> Regards,
>> Charles
>>
>>
>> On Fri, Jun 10, 2016 at 3:40 AM, Karl Rupp <r...@iue.tuwien.ac.at> wrote:
>>
>>> Hi Charles,
>>>
>>>> Here is the current kernel
>>>> with all the different attempts commented out (where MdimPad and PdimPad
>>>> are the padded dimensions).
>>>>
>>>
>>> where is the third dimension? Are you assuming C to be M-by-M?
>>>
>>>
>>>
>>>> If I don't have a size condition check, the
>>>> device quickly runs out of resources (Error: ViennaCL: FATAL ERROR:
>>>> CL_OUT_OF_RESOURCES).  Any thoughts?  I feel like I must be missing
>>>> something simple at this point.
>>>>
>>>
>>> Which values do you pass to the kernel? Which local and global work
>>> sizes do you use?
>>>
>>>
>>>
>>>
>>>> __kernel void iMatMult(const int Mdim, const int MdimPad,
>>>>                         const int Pdim, const int PdimPad,
>>>>                         __global const int *A, __global const int *B,
>>>>                         __global int *C) {
>>>>
>>>>      // Get the index of the elements to be processed
>>>>      const int globalRow = get_global_id(0); // C Row ID
>>>>      const int globalCol = get_global_id(1); // C Col ID
>>>>      int tmp = 0;
>>>>
>>>>      if (globalRow > MdimPad || globalCol > MdimPad)
>>>>          return;
>>>>
>>>
>>> Here it should be enough to check against Mdim.
>>>
>>>>      printf("globalCol = %d\n", globalCol);
>>>>      printf("globalRow = %d\n", globalRow);
>>>>
>>>>      // Do the operation
>>>>      for(int k=0; k < Pdim; k++){
>>>>          tmp += A[globalRow * MdimPad + k] * B[globalCol+PdimPad*k];
>>>>
>>>
>>> I'm confused with your use of MdimPad and PdimPad here. As currently
>>> written, A has Mdim columns, and B has Pdim columns. But this doesn't agree
>>> with the if-check above, where C is assumed Mdim-by-Mdim.
>>>
>>>>      }
>>>>
>>>>      C[globalCol+MdimPad*globalRow] = tmp;
>>>>
>>>
>>> The last line assumes C to be M-by-M. Is this the case?
>>>
>>> Best regards,
>>> Karli
>>>
>>>
>>
>


