Hello, I would like to work on extending Cython with a way to offload Cython code to a GPU. I found two related CEPs (https://github.com/cython/cython/wiki/enhancements-opencl and https://github.com/cython/cython/wiki/enchancements-metadefintions).
My current thinking is that a solution along the lines of the OpenCL CEP is the most effective: although it does require many code changes, it seems to provide a good tradeoff between usability and efficiency. I would like to suggest a few modifications to this approach, such as:

* using SYCL instead of OpenCL, to follow the existing parallel/prange semantics more closely and more easily
* selecting the device (CPU, GPU) per region rather than per file
* maybe allowing calls to appropriately annotated and written external functions

I would be very grateful for any thoughts on this topic in general, and for any advice on how to approach it so that the solution is as broadly useful and as Cythonic as possible.

Cheers,

frank