Hi all,

Here's what I think is a bug in the NVIDIA OpenCL implementation. The
assertion at the end of the script below fails on my GTX 260, but passes
just about anywhere else.

8< --------------------------------------------------------
#! /usr/bin/env python

import pyopencl as cl
import pyopencl.array
import numpy as np


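ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

# One element past a power of two; the failure is sensitive to this size.
n = 2**20 + 1
dtype = np.int32
host_data = np.random.randint(0, 10, n).astype(dtype)

# Copy to the device, then straight back to the host.
dev_data = cl.array.to_device(queue, host_data)
host_data_2 = dev_data.get()

# The round trip should return exactly what was sent.
assert (host_data == host_data_2).all()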
8< --------------------------------------------------------

Yes, you read that right--it seems they messed up just transferring a
bit of memory. As is to be expected, the bug is very sensitive to the
size (n).
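
Since the failure depends on n, one way to narrow it down is to repeat
the round trip over a handful of sizes and note which ones come back
wrong. The following is only a rough sketch: find_failing_sizes and its
roundtrip argument are hypothetical names, not part of PyOpenCL, and the
sizes probed are just a guess at where to look.

```python
import numpy as np


def find_failing_sizes(roundtrip, sizes, dtype=np.int32, seed=0):
    """Return the sizes n for which roundtrip(host_data) != host_data.

    `roundtrip` is a hypothetical callable: it takes a 1-D NumPy array,
    sends it to the device, and returns whatever comes back.
    """
    rng = np.random.RandomState(seed)
    failing = []
    for n in sizes:
        host_data = rng.randint(0, 10, n).astype(dtype)
        returned = roundtrip(host_data)
        if not np.array_equal(host_data, returned):
            failing.append(n)
    return failing


# Sizes clustered around the power of two used in the script above.
sizes_near_2_20 = [2**20 + k for k in range(-2, 3)]

# With PyOpenCL, the round trip from the script above would be:
#
#     def cl_roundtrip(host_data):
#         return cl.array.to_device(queue, host_data).get()
#
#     print(find_failing_sizes(cl_roundtrip, sizes_near_2_20))
```

Passing the transfer in as a callable keeps the sweep independent of
the OpenCL setup, so the same loop can be checked against a plain NumPy
copy first.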

Infuriatingly, this happened in the unit test for parallel scan, so for
a very long time I hunted for a scan bug that didn't exist. :(

I hope I'm not crazy. In any case, I thought this might be important
enough to warn you guys about. It happens for me with the 290.x and 295.x
drivers on a GTX 260. No problem on Fermi.
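
To make results from other cards and driver versions easier to compare,
it may help to report where the returned data differs rather than only
that the assertion failed. A minimal sketch, assuming both arrays are
1-D NumPy arrays of the same shape; describe_mismatch is a hypothetical
helper, not something PyOpenCL provides.

```python
import numpy as np


def describe_mismatch(expected, actual):
    """Summarize where two equally shaped 1-D arrays differ.

    Returns None if they match, otherwise a dict with the number of
    differing elements and the first and last differing index.
    """
    expected = np.asarray(expected)
    actual = np.asarray(actual)
    if expected.shape != actual.shape:
        raise ValueError(
            "shape mismatch: %s vs %s" % (expected.shape, actual.shape)
        )
    # Indices at which the two arrays disagree.
    bad = np.flatnonzero(expected != actual)
    if bad.size == 0:
        return None
    return {
        "count": int(bad.size),
        "first": int(bad[0]),
        "last": int(bad[-1]),
    }
```

In the script above this would be called as
describe_mismatch(host_data, host_data_2) just before the assertion.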

Andreas


_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
