Re: [PyCUDA] Porting nvidia's separable convolution example to pycuda: C++ templates, loop unrolling

Andreas Klöckner Fri, 12 Jun 2009 15:26:06 -0700

On Friday 12 June 2009, Andrew Wagner wrote:
> Hello-
>
> I'm working on porting NVIDIA's example separable convolution kernel
> to pycuda, and am running into a little snag.  The kernel uses some
> sort of recursive C++ function templates to do loop unrolling.  When I
> compile, I get "error: this declaration may not have extern "C"
> linkage".  I assume this has something to do with the 'extern
> "C" {...}' that pycuda is wrapping around the kernel before sending it
> to the compiler.


SourceModule supports a no_extern_c keyword argument. Once you pass
no_extern_c=True, however, PyCUDA no longer wraps your source in
extern "C" {...}, so every name in the CUDA module becomes "mangled" [1]
and can no longer be looked up by its plain name. If you can live with just
a few entry points that you manually declare extern "C", this is likely the
way to go.

[1] http://en.wikipedia.org/wiki/Name_mangling
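
A minimal sketch of that approach, in case it helps. The kernel name
(dot_taps), the tap_sum template and the tap count are made up for
illustration and are not taken from NVIDIA's sample. The import path
assumes a PyCUDA version that provides pycuda.compiler; older releases
exposed SourceModule elsewhere. The template stays outside any extern "C"
block, and only the entry point gets C linkage:

```python
# Sketch only: dot_taps, tap_sum and the 5-tap size are hypothetical names
# and values chosen for illustration.
KERNEL_SOURCE = r"""
// Templates must stay outside any extern "C" block.
template <int I>
__device__ float tap_sum(const float *data, const float *coeffs)
{
    // Recursive instantiation unrolls the loop at compile time.
    return data[I] * coeffs[I] + tap_sum<I - 1>(data, coeffs);
}

// Full specialization that ends the recursion.
template <>
__device__ float tap_sum<0>(const float *data, const float *coeffs)
{
    return data[0] * coeffs[0];
}

// Only the entry point is declared extern "C", so its name is not mangled
// and can be looked up by its plain name.
extern "C" __global__ void dot_taps(float *out, const float *data,
                                    const float *coeffs)
{
    out[threadIdx.x] = tap_sum<4>(data + threadIdx.x, coeffs);
}
"""


def compile_and_get_kernel():
    """Compile KERNEL_SOURCE without PyCUDA's automatic extern "C" wrapper."""
    # Imported lazily so the source string can be inspected without a GPU.
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    from pycuda.compiler import SourceModule  # assumes a recent PyCUDA

    mod = SourceModule(KERNEL_SOURCE, no_extern_c=True)
    return mod.get_function("dot_taps")
```

Calling compile_and_get_kernel() needs a working CUDA setup. Any other
function you want to reach from Python has to be marked extern "C" in the
same way.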

> My questions:
> 1) Is there some trivial workaround for this that will leave the
> template stuff un-molested?

See above: no_extern_c leaves the template code alone.

> 2) If I just rip out the templates and do the loop unrolling in
> python, are there some pythonic examples of doing the loop unrolling
> out there? (I could roll my own, but I'm a novice so it might be ugly)

http://documen.tician.de/pycuda/metaprog.html

http://is.gd/10clQ
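
For a rough idea of what unrolling in Python can look like, here is a
minimal sketch that builds the kernel source with plain string formatting.
The pages linked above cover templating engines, which this sketch does not
use. The function name, kernel name and argument layout are made up for
illustration and do not come from NVIDIA's sample:

```python
def unrolled_row_kernel(radius):
    """Return CUDA source for a 1D row convolution with the tap loop unrolled.

    Hypothetical helper: one "sum += ..." statement is emitted per filter tap.
    """
    taps = []
    for k in range(-radius, radius + 1):
        # coeffs is indexed from 0, so shift the tap offset by the radius.
        taps.append("    sum += src[i + (%d)] * coeffs[%d];" % (k, k + radius))
    body = "\n".join(taps)

    return """
__global__ void convolve_row(float *dst, const float *src,
                             const float *coeffs, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Skip threads whose filter window would fall outside the array.
    if (i < %(radius)d || i >= n - %(radius)d)
        return;
    float sum = 0.0f;
%(body)s
    dst[i] = sum;
}
""" % {"radius": radius, "body": body}


if __name__ == "__main__":
    # Print the generated source to check it before compiling.
    print(unrolled_row_kernel(2))
```

The resulting string contains no templates, so it can be handed to
SourceModule with the default extern "C" wrapping, and the kernel can then
be fetched with get_function("convolve_row").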

Andreas


_______________________________________________
PyCUDA mailing list
[email protected]
http://tiker.net/mailman/listinfo/pycuda_tiker.net
