I am at a loss as to how I can use a custom kernel that operates on a
component of a matrix.
I cannot seem to get a device pointer to the internal buffer in any
obvious way.  (Maybe it's somewhere in the docs, but I can't find it.)
I notice that you can do strange things like this, from the
custom-kernels.cpp example:

viennacl::ocl::enqueue(my_kernel_mul(vec1, vec2, result_mul,
                                     static_cast<cl_uint>(vec1.size())));

In theory, vec1 and vec2 should be float * pointers to device buffers,
according to the type signature of my_kernel_mul().
However, vec1, vec2, and result_mul are all viennacl vectors. The same
pattern works for matrices in custom kernels.
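My best guess at the mechanism is that the vector type carries a device
handle and converts to it implicitly when passed as a kernel argument. A
minimal stand-alone sketch of that pattern (device_vector, device_handle,
and launch are all hypothetical names, not ViennaCL API):

```cpp
#include <cstddef>

// Hypothetical stand-in for a raw device memory handle (cl_mem in OpenCL).
typedef void* device_handle;

// Toy wrapper mimicking what I assume the vector type does: own a handle
// and expose an implicit conversion to it, so the object itself can be
// passed wherever the raw handle is expected.
class device_vector {
public:
  device_vector(device_handle h, std::size_t n) : handle_(h), size_(n) {}

  // Implicit conversion: kernel-launch code can take the wrapper directly.
  operator device_handle() const { return handle_; }

  std::size_t size() const { return size_; }

private:
  device_handle handle_;
  std::size_t size_;
};

// Fake "kernel launch" that only accepts raw handles, like clSetKernelArg
// ultimately does; it just returns the handle it was given.
device_handle launch(device_handle arg) { return arg; }
```

If this is roughly what happens, it would explain why the vectors can be
handed straight to my_kernel_mul() even though the kernel signature wants
raw pointers.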


Unfortunately, I cannot work on a sub-block of the matrix.  This doesn't
work:

float *ptr = &(matrix(i,j));

In fact, I can't even do this:

float *ptr = matrix;

So I'm not really understanding the internals of how this works. I'm
more familiar with CUDA, where you can work with device pointers and
pointer arithmetic.
My kernel needs to fill in sub-blocks of the output matrix, so I need
to pass in a pointer that has an offset.  Is there any mechanism for this?
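To be concrete, what I want is the equivalent of the following offset
arithmetic, assuming row-major storage where each row is padded out to
some internal width (the helper names here are mine, purely for
illustration):

```cpp
#include <cstddef>

// Flat offset of element (i, j) in a row-major matrix whose rows are
// padded to internal_cols entries (internal_cols >= logical cols).
std::size_t element_offset(std::size_t i, std::size_t j,
                           std::size_t internal_cols) {
  return i * internal_cols + j;
}

// Offset of the top-left corner of a sub-block starting at (row0, col0).
// In CUDA I would simply add this to the device pointer before the
// kernel launch; that is the capability I am looking for here.
std::size_t subblock_offset(std::size_t row0, std::size_t col0,
                            std::size_t internal_cols) {
  return element_offset(row0, col0, internal_cols);
}
```

Either a way to form an offset device pointer, or a way to pass the
offset as a kernel argument alongside the base buffer, would solve my
problem.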

My other problem with ViennaCL is that the zero padding and the current
API seem to force at least three copies to get dense matrix data into
and out of the device buffer and a flat float host array with no zero
padding.  It's such a common use case that I'm surprised there isn't a
more efficient method or function to support it.
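To spell out what I mean, here is the host-side repacking I currently
end up doing before every transfer: copying a flat rows x cols array
into a zero-padded staging buffer matching the device layout (and the
mirror-image loop on read-back).  The function name and layout
assumptions are mine, for illustration only:

```cpp
#include <cstddef>
#include <vector>

// Copy a dense row-major host array (rows x cols, no padding) into a
// zero-padded staging buffer (internal_rows x internal_cols), matching
// the padded layout the device buffer uses.  This staging copy, plus
// the host->device transfer, plus the reverse on the way back, is
// where the extra copies come from.
std::vector<float> pad_for_device(const std::vector<float>& host,
                                  std::size_t rows, std::size_t cols,
                                  std::size_t internal_rows,
                                  std::size_t internal_cols) {
  std::vector<float> staged(internal_rows * internal_cols, 0.0f);
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      staged[i * internal_cols + j] = host[i * cols + j];
  return staged;
}
```

A transfer routine that performed this repacking on the fly, straight
between the unpadded host array and the padded device buffer, would cut
the copies down considerably.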

Thanks in advance for any help.



_______________________________________________
ViennaCL-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/viennacl-support
