Hi,
Perhaps that *GetSource method should also return an opaque device "Mat"
pointer that the user is responsible for shepherding into the kernel
from which they call the device MatSetValues?
This is easy if the OpenCL management is within PETSc (i.e. context,
buffers and command queues managed by us). I expect that a bunch of
users will want to provide their own context and command queues, which
would require us to offer something like
MatAttachOpenCLEnvironment(Mat,cl_context,cl_command_queue);
for all the matrix and vector objects involved. Note that this needs to
be attached before the matrix is created. I think this is doable.
Okay, but why do they need to provide their own "Mat" data?
If the context and queue are not attached to objects, then they would
essentially represent global state, which is something I want to avoid.
What if, for example, a user wants to split the matrix across multiple
OpenCL contexts (e.g. an AMD GPU and a Xeon Phi)?
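To make the per-object idea concrete, here is a minimal sketch in plain C.
Everything in it is hypothetical: DemoMat, DemoContext, DemoQueue,
DemoMatAttachEnvironment and DemoMatAllocate are stand-ins invented for
illustration, not PETSc or OpenCL API. It also assumes that "before the
matrix is created" means "before device buffers are allocated".

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for cl_context and cl_command_queue so the sketch builds
   without OpenCL headers. Hypothetical types, not OpenCL API. */
typedef struct { int id; } DemoContext;
typedef struct { int id; } DemoQueue;

/* Hypothetical matrix object that carries its own environment
   instead of relying on global state. */
typedef struct {
  const DemoContext *context;
  const DemoQueue   *queue;
  int                buffers_allocated;
} DemoMat;

/* Attach a user-provided environment to one matrix.
   Returns 0 on success, 1 for a missing handle, and 2 if device
   buffers already exist (assumption: attaching is then too late). */
int DemoMatAttachEnvironment(DemoMat *A, const DemoContext *ctx,
                             const DemoQueue *q)
{
  assert(A != NULL);
  if (ctx == NULL || q == NULL) return 1;
  if (A->buffers_allocated) return 2;
  A->context = ctx;
  A->queue   = q;
  return 0;
}

/* Stand-in for whatever step allocates device memory.
   Returns 1 if no environment has been attached yet. */
int DemoMatAllocate(DemoMat *A)
{
  assert(A != NULL);
  if (A->context == NULL || A->queue == NULL) return 1;
  A->buffers_allocated = 1;
  return 0;
}
```

With this layout, two matrix parts can each hold a different context and
queue (one for the GPU, one for the Xeon Phi) without either of them
touching shared global state.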
I envision the user doing all the job launching so they have control
over everything. The Mat "implements" its MatSetValuesOpenCL (called on
the device) so it needs to provide the device handle.
It needs to provide the command queue(s) (rather than the device(s)) and
access to its memory handles.
What am I missing?
I think you were referring to the 'Mat' on the device, while I was
referring to the plain PETSc Mat. The difficulty for a 'Mat' on the
device is a limitation of OpenCL in defining opaque types: It is not
possible to have something like
typedef struct OpenCLMat {
  __global int   *row_indices;
  __global int   *col_indices;
  __global float *entries;
} PetscMat;
and pass this as a single kernel argument.
(cf. OpenCL standard or
http://stackoverflow.com/questions/17635898/passing-struct-with-pointer-members-to-opencl-kernel-using-pyopencl)
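A common workaround is to pass each buffer as its own kernel argument.
The sketch below shows what a device-side helper could look like under
that approach. The name mat_add_value is hypothetical, and I am assuming
a CSR layout in which row_indices holds the row start offsets. The
__global qualifier is defined away when compiling as plain C, so the same
function can be tried on the host.

```c
/* On the host (plain C) the OpenCL address-space qualifier is defined
   away; an OpenCL compiler defines __OPENCL_VERSION__ and skips this. */
#ifndef __OPENCL_VERSION__
#include <assert.h>
#define __global
#endif

/* Hypothetical device-side helper: adds v to entry (row, col) of a
   CSR matrix whose three arrays arrive as separate arguments.
   Assumption: row_indices[row] .. row_indices[row + 1] delimits the
   entries of that row.
   Returns 1 on success, 0 if (row, col) is not in the sparsity pattern.
   The update is not atomic, so no two work-items may write the same
   entry concurrently. */
int mat_add_value(__global const int *row_indices,
                  __global const int *col_indices,
                  __global float     *entries,
                  int row, int col, float v)
{
  int k;
  for (k = row_indices[row]; k < row_indices[row + 1]; ++k) {
    if (col_indices[k] == col) {
      entries[k] += v;
      return 1;
    }
  }
  return 0;
}
```

A kernel would then take the same three pointers as three kernel
arguments and forward them to the helper, so nothing with pointer members
has to cross the host/device boundary.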
If we only had to color the interfaces between thread blocks, we should
have fewer colors.
If colors are assigned in a red-black fashion, then yes. :-)
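As a small illustration of what I mean by red-black, assume the thread
blocks form a structured 2D grid and only blocks sharing an edge
conflict. Both function names below are made up for this sketch. Two
colors then suffice:

```c
#include <assert.h>

/* Checkerboard (red-black) color of block (bx, by): 0 or 1. */
int block_color(int bx, int by)
{
  assert(bx >= 0 && by >= 0);
  return (bx + by) % 2;
}

/* Returns 1 if no two edge-adjacent blocks in an nx-by-ny grid of
   blocks share a color, 0 otherwise. */
int coloring_is_valid(int nx, int ny)
{
  int bx, by;
  for (by = 0; by < ny; ++by) {
    for (bx = 0; bx < nx; ++bx) {
      /* Compare with the right and the lower neighbor, if present. */
      if (bx + 1 < nx && block_color(bx, by) == block_color(bx + 1, by))
        return 0;
      if (by + 1 < ny && block_color(bx, by) == block_color(bx, by + 1))
        return 0;
    }
  }
  return 1;
}
```

On an unstructured partition the block adjacency graph need not be
bipartite, in which case more than two colors are required.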
Best regards,
Karli