Hi all,

Here's what I think is a bug in the Nvidia CL implementation. The assertion fails on my GTX 260, but is fine just about anywhere else.
8< --------------------------------------------------------
#! /usr/bin/env python
import pyopencl as cl
import pyopencl.array
import numpy as np

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

n = 2**20 + 1
dtype = np.int32

host_data = np.random.randint(0, 10, n).astype(dtype)
dev_data = cl.array.to_device(queue, host_data)
host_data_2 = dev_data.get()

assert (host_data == host_data_2).all()
8< --------------------------------------------------------

Yes, you read that right--it seems they messed up just transferring a bit of memory. As is to be expected, the bug is very sensitive to the size (n).

Infuriatingly, this happened in the unit test for parallel scan, so for a very long time I hunted for a scan bug that didn't exist. :(

I hope I'm not crazy. In any case, I thought this might be important enough to warn you guys about. Happens for me with 290.x and 295.x drivers on a GTX260. No problem on Fermi.

Andreas
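
Since the failure depends so strongly on n, it may help to sweep a few sizes around 2**20 and record which ones come back corrupted. The following is only a minimal sketch, not part of the original report: the helper names (make_roundtrip_check, failing_sizes, main) are made up for illustration, and the range of sizes is an arbitrary guess at where the problem might show up.

```python
def make_roundtrip_check(queue, dtype="int32", seed=0):
    """Build a check(n) that copies n values host -> device -> host.

    Hypothetical helper: it repeats the transfer from the test case above
    for an arbitrary size n and reports whether the data survived.
    """
    # Imported lazily so failing_sizes() below works without OpenCL installed.
    import numpy as np
    import pyopencl.array as cl_array

    rng = np.random.RandomState(seed)

    def check(n):
        host_data = rng.randint(0, 10, n).astype(dtype)
        dev_data = cl_array.to_device(queue, host_data)
        # True only if every element came back unchanged.
        return bool((host_data == dev_data.get()).all())

    return check


def failing_sizes(sizes, check):
    """Return the sizes n for which check(n) reports a bad transfer."""
    return [n for n in sizes if not check(n)]


def main():
    import pyopencl as cl

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    base = 2**20
    # Assumed window of interest: a few sizes on either side of 2**20.
    sizes = range(base - 4, base + 5)
    print("failing sizes:", failing_sizes(sizes, make_roundtrip_check(queue)))
```

Calling main() on an affected card and driver should list the sizes that trigger the problem; an empty list means every transfer in the window round-tripped cleanly.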
_______________________________________________ PyOpenCL mailing list [email protected] http://lists.tiker.net/listinfo/pyopencl
