DCompute: First kernels run successfully

Nicholas Wilson via Digitalmars-d-announce Mon, 11 Sep 2017 05:26:56 -0700

I'm pleased to announce that I have run the first dcompute kernel and it was a success!

The driver still needs a fair bit of polish to make the API sane and more complete, not to mention more similar to the (untested) OpenCL driver API. But it works!

(Contributions are of course greatly welcomed)


The kernel:
```
@compute(CompileFor.deviceOnly)
module dcompute.tests.dummykernels;

import ldc.dcompute;
import dcompute.std.index;

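@kernel void saxpy(GlobalPointer!(float) res,
                   float alpha,
                   GlobalPointer!(float) x,
                   GlobalPointer!(float) y,
                   size_t N)
{
    // One thread per element: i is this thread's global index along x.
    auto i = GlobalIndex.x;
    // Threads past the end of the arrays do nothing.
    if (i >= N) return;
    res[i] = alpha*x[i] + y[i];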
}
```
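
For anyone without a GPU to hand, here is a rough CPU sketch of what the kernel computes, in plain D. It is not part of dcompute: `saxpyReference` is a name made up for illustration, and the loop variable simply stands in for `GlobalIndex.x`, with one loop iteration per GPU thread. It uses the same inputs as the host code in this post, so it can be used to cross-check the first and last values of the result.

```d
import std.stdio : writeln;

// Hypothetical CPU stand-in for the GPU kernel (not a dcompute API).
// Each loop iteration plays the role of one GPU thread.
void saxpyReference(float[] res, float alpha,
                    const(float)[] x, const(float)[] y, size_t n)
{
    foreach (i; 0 .. res.length)
    {
        if (i >= n) continue; // same bounds guard as the kernel
        res[i] = alpha * x[i] + y[i];
    }
}

void main()
{
    enum size_t N = 128;
    float[N] res = 0, x, y;
    foreach (i; 0 .. N)
    {
        x[i] = N - i;
        y[i] = i * i;
    }

    saxpyReference(res[], 5.0f, x[], y[], N);

    // 5*128 + 0, 5*127 + 1 and 5*1 + 127*127, all exact in float.
    assert(res[0] == 640 && res[1] == 636 && res[N - 1] == 16134);
    writeln(res[0], " ", res[1], " ", res[N - 1]);
}
```

The bounds guard is redundant on the CPU when the array length equals `n`; it is kept only to mirror the kernel, where the launch may start more threads than there are elements.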

The host code:
```
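import std.exception : enforce;
import std.experimental.allocator : theAllocator;
import std.stdio : writeln;

import dcompute.driver.cuda;
import dcompute.tests.dummykernels : saxpy;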

Platform.initialise();

auto devs   = Platform.getDevices(theAllocator);
auto ctx    = Context(devs[0]); scope(exit) ctx.detach();

// Change the file to match your GPU.
Program.globalProgram = Program.fromFile("./.dub/obj/kernels_cuda210_64.ptx");

auto q = Queue(false);

enum size_t N = 128;
float alpha = 5.0;
float[N] res, x, y;
foreach (i; 0 .. N)
{
    x[i] = N - i;
    y[i] = i * i;
}
Buffer!(float) b_res, b_x, b_y;
b_res      =  Buffer!(float)(res[]); scope(exit) b_res.release();
b_x        =  Buffer!(float)(x[]);   scope(exit) b_x.release();
b_y        =  Buffer!(float)(y[]);   scope(exit) b_y.release();

b_x.copy!(Copy.hostToDevice); // not quite sold on this interface yet.
b_y.copy!(Copy.hostToDevice);

q.enqueue!(saxpy)  // <-- the main magic happens here
    ([N, 1, 1], [1, 1, 1])        // the grid
    (b_res, alpha, b_x, b_y, N);  // the kernel arguments

b_res.copy!(Copy.deviceToHost);
foreach(i; 0 .. N)
    enforce(res[i] == alpha * x[i] + y[i]);
writeln(res[]); // [640, 636, ... 16134]
```

Simple as that!

Dcompute, as always, is at https://github.com/libmir/dcompute and on dub.

To successfully run the dcompute CUDA test you will need a very recent LDC (less than two days old) with the NVPTX backend* enabled, along with a CUDA environment and an Nvidia GPU.


*Or wait for the LDC 1.4 release, real soon(™).

Thanks to the LDC folks for putting up with me ;)

Have fun GPU programming,
Nic
