Jerome Kieffer <jerome.kief...@esrf.fr> writes:
> LogicError: clEnqueueFillBuffer failed: INVALID_OPERATION
>
> The same "bug" occurs in the PoCL driver when addressing nvidia GPU,
> since the corresponding low-level primitive is absent in NVVM.
>
> I wonder if we should best address this issue within our code or it
> could be addressed at a higher level. Getting from nvidia that they fix
> their code to conform for the specification is an illusion. But does it
> make sense to address this as part of pyopencl ?

Huh, that's not ideal. I don't have Fermi-gen hardware around any more,
so I didn't notice it. There is a fallback path for the thing you
mention, and PyOpenCL tries to be careful about selecting it [0].
According to the CL spec, clEnqueueFillBuffer is unconditionally
available if the device advertises CL 1.2. So this looks like an Nvidia
bug to me, but that realization likely won't buy us much, since I'm
pretty sure Nvidia isn't going to fix it.

We *could* complicate the fallback logic to mop up after Nvidia. I'd be
open to reviewing a patch.

As for pocl, my PhD student Isuru recently submitted a PR that might
help [1]. You could try pocl master to see if that makes things better.
As an added bonus, pocl master also contains significant performance
fixes for CUDA POCL [2], also due to Isuru.

Hope that helps,
Andreas

[0] https://github.com/inducer/pyopencl/blob/fc5847239728d67cad157c1bbe50431e331553be/pyopencl/array.py#L1231-L1240
[1] https://github.com/pocl/pocl/pull/834
[2] https://github.com/pocl/pocl/issues/830
_______________________________________________
PyOpenCL mailing list -- pyopencl@tiker.net
To unsubscribe send an email to pyopencl-le...@tiker.net