After testing, the proper syntax is:
template squareCuda(bpg, tpb: int, y: var GpuArray, x: GpuArray) =
  ## Compute the square of x and store it in y
  ## bpg: BlocksPerGrid
  ## tpb: ThreadsPerBlock
  ## Output square<<<bpg, tpb>>>(y,x)
  {.emit:
    ["""square<<<""", bpg.cint, """,""", tpb.cint, """>>>(""", y.data[], """,""", x.data[], """);"""].}
The triple quotes are a bit verbose. I also tried backticks: "kernel<<<bpg,`tpb`>>>(y,`x`);"
