I am at a loss as to how I can use a custom kernel that operates on a component of a matrix. I cannot seem to get a device pointer to the internal buffer in any obvious way (maybe it's somewhere in the docs, but I can't find it). I notice that you can do strange things like this, from the custom-kernels.cpp example:
    viennacl::ocl::enqueue(my_kernel_mul(vec1, vec2, result_mul,
                                         static_cast<cl_uint>(vec1.size())));

In theory vec1 and vec2 should be float* pointers to a device buffer, according to the type signature of my_kernel_mul(). However, vec1, vec2, and result_mul are all ViennaCL vectors. This pattern works for matrices with custom kernels as well. Unfortunately, I cannot work on a sub-block of the matrix. This doesn't work:

    float *ptr = &(matrix(i,j));

In fact, I can't even do this:

    float *ptr = matrix;

So I'm not really understanding the internals of how this works. I'm more familiar with CUDA, where you can work with device pointers and pointer arithmetic. My kernel needs to fill in sub-blocks of the output matrix, so I need to pass in a pointer that carries an offset. Is there any mechanism for this?

My other problem with ViennaCL is that the zero padding and the current API seem to force at least three copies to get dense matrix data into and out of the device buffer from a flat float host array with no zero padding. This is such a common use case that I'm surprised there isn't a more efficient method or function to support it.

Thanks in advance for any help.

_______________________________________________
ViennaCL-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/viennacl-support
