Hello-

I'm working on porting NVIDIA's example separable convolution kernel to PyCUDA and have hit a snag. The kernel uses recursive C++ function templates to do loop unrolling. When I compile, I get "error: this declaration may not have extern "C" linkage". I assume this comes from the 'extern "C" {...}' block that PyCUDA wraps around the kernel source before sending it to the compiler.
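For illustration, here is a minimal sketch of the kind of source layout involved, along with one possible way around the wrapper. It relies on the `no_extern_c=True` argument of `pycuda.compiler.SourceModule`, which skips the automatic 'extern "C"' wrapping so that only the kernel entry point is wrapped by hand. The names `unrolled_sum` and `sum_kernel` are hypothetical stand-ins, not NVIDIA's code, and the compile step is untested here since it needs a CUDA device.

```python
# Hypothetical stand-in for the recursive unrolling templates (not NVIDIA's code).
# Templates cannot have C linkage, so they stay outside any extern "C" block.
TEMPLATE_HELPERS = """
template<int i> __device__ float unrolled_sum(const float *data)
{
    return data[i] + unrolled_sum<i - 1>(data);
}

template<> __device__ float unrolled_sum<-1>(const float *data)
{
    return 0.0f;
}
"""

# Only the __global__ entry point gets C linkage, so its name is not mangled
# and can still be looked up with get_function("sum_kernel").
KERNEL = """
extern "C" {
__global__ void sum_kernel(float *out, const float *in)
{
    out[threadIdx.x] = unrolled_sum<3>(in);
}
}
"""


def build_source():
    """Concatenate the template helpers and the hand-wrapped kernel."""
    return TEMPLATE_HELPERS + KERNEL


def compile_module(source):
    """Compile without PyCUDA's automatic wrapper (needs a CUDA context,
    e.g. created by `import pycuda.autoinit`)."""
    from pycuda.compiler import SourceModule
    return SourceModule(source, no_extern_c=True)


source = build_source()
```

With this layout, `compile_module(source).get_function("sum_kernel")` would be the usual way to fetch the kernel, assuming the compile succeeds.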

My questions:
1) Is there some trivial workaround for this that will leave the template code intact?
2) If I just rip out the templates and do the loop unrolling in Python, are there any Pythonic examples of loop unrolling out there? (I could roll my own, but I'm a novice, so it might be ugly.)
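As a rough sketch of what the second option could look like: the loop body is written once with a placeholder and repeated from Python, one line per index. The helper `unroll` and the names `sum`, `data`, `pos`, and `d_Kernel` in the generated C are hypothetical and only serve the illustration.

```python
from string import Template


def unroll(body, start, stop, var="i"):
    """Return `body` repeated once per index in range(start, stop),
    with $var replaced by the index each time."""
    tmpl = Template(body)
    return "\n".join(tmpl.substitute({var: k}) for k in range(start, stop))


KERNEL_RADIUS = 2  # hypothetical value, chosen to keep the output short

# One multiply-accumulate per filter tap, from -radius to +radius inclusive.
unrolled = unroll(
    "    sum += data[pos + ($i)] * d_Kernel[KERNEL_RADIUS - ($i)];",
    -KERNEL_RADIUS,
    KERNEL_RADIUS + 1,
)
print(unrolled)
```

The resulting string can then be spliced into the kernel source before compiling. The parentheses around `$i` keep negative indices from producing expressions like `pos + -2`, which are legal C but harder to read.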

I'm already using string.Template() to substitute values that were #defined in the kernel but are also needed in the calling function.
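A minimal sketch of that substitution pattern, assuming the shared constants live in a Python dict. The macro names and values here are hypothetical placeholders, not taken from the actual kernel.

```python
from string import Template

# Hypothetical kernel header: the values come from Python instead of being
# hard-coded, so host code and device code cannot drift apart.
KERNEL_SOURCE = Template("""
#define KERNEL_RADIUS $kernel_radius
#define KERNEL_W      $kernel_w
#define ROW_TILE_W    $row_tile_w
""")

kernel_radius = 8
params = {
    "kernel_radius": kernel_radius,
    "kernel_w": 2 * kernel_radius + 1,  # filter width derived on the host
    "row_tile_w": 128,
}

# substitute() raises KeyError if a placeholder has no value, which catches
# typos early; a literal dollar sign in the source must be written as $$.
source = KERNEL_SOURCE.substitute(params)
```

The same `params` dict can then be reused on the Python side, for example when computing grid and block dimensions.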

Thanks!
Drew

_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
