[PyCUDA] Porting nvidia's separable convolution example to pycuda: C++ templates, loop unrolling

Andrew Wagner Fri, 12 Jun 2009 14:47:15 -0700

Hello-

I'm working on porting NVIDIA's example separable convolution kernel to pycuda, and am running into a little snag. The kernel uses some sort of recursive C++ function templates to do loop unrolling. When I compile, I get "error: this declaration may not have extern "C" linkage". I assume this has something to do with the 'extern "C" {...}' that pycuda is wrapping around the kernel before sending it to the compiler.


My questions:

1) Is there some trivial workaround for this that will leave the template stuff un-molested?
2) If I just rip out the templates and do the loop unrolling in python, are there some pythonic examples of doing the loop unrolling out there? (I could roll my own, but I'm a novice so it might be ugly)

I'm already using string.Template() to replace some stuff that was #defined in the kernel but is needed in the calling function.
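
To make question 2 concrete, here is a minimal sketch of the kind of Python-side unrolling I have in mind, built on the same string.Template() approach. All names in it (KERNEL_TEMPLATE, unroll_sum, build_source, convolve_row, and the $KERNEL_RADIUS / $UNROLLED_SUM placeholders) are made up for illustration and are not taken from NVIDIA's sample. It only generates the source text; compiling it with pycuda is left out.

```python
from string import Template

# Hypothetical kernel fragment. The placeholder names $KERNEL_RADIUS and
# $UNROLLED_SUM are assumptions made for this sketch only.
KERNEL_TEMPLATE = Template("""
#define KERNEL_RADIUS $KERNEL_RADIUS
__device__ float convolve_row(const float *data, const float *kernel)
{
    float sum = 0.0f;
$UNROLLED_SUM
    return sum;
}
""")


def unroll_sum(radius):
    """Return one C statement per filter tap (2 * radius + 1 taps)."""
    taps = 2 * radius + 1
    lines = []
    for i in range(taps):
        # Pair data element i with the mirrored kernel element.
        lines.append("    sum += data[%d] * kernel[%d];" % (i, taps - 1 - i))
    return "\n".join(lines)


def build_source(radius):
    """Fill in the #define and the unrolled loop body for a given radius."""
    return KERNEL_TEMPLATE.substitute(
        KERNEL_RADIUS=radius,
        UNROLLED_SUM=unroll_sum(radius),
    )


source = build_source(2)
```

The loop that the C++ templates would have unrolled at compile time is instead expanded in Python before the source ever reaches the compiler, so the resulting kernel text contains no templates at all.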


Thanks!
Drew

