Dear PyCUDA mailing list readers, I am aware that this question is more directly CUDA related than PyCUDA related, but since I use PyCUDA, I wanted to ask whether someone has run into similar problems already.
The error I get:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/peter/PhD/img_fac/pycuda/complementarity/complementarity.py", line 240, in getPocketASAProfile
        A=getSurfaceGPU(pocket, r, nbPtPerUnitSphere, pocket_ref, ligand, xyz_name="pocket.xyz")
      File "/home/peter/PhD/img_fac/pycuda/complementarity/complementarity.py", line 132, in getSurfaceGPU
        m=cu.getSurfacePts(test_points, reference.xyz, reference["vdw"]+probe+correct, atom_idx, ligref.xyz, mol.xyz)
      File "cudm.py", line 239, in getSurfacePts
        npy.uint16(lig_ref.shape[0]), block=(bX,bY,1), grid=(gX,gY))
      File "/usr/local/lib64/python2.6/site-packages/pycuda-0.93-py2.6-linux-x86_64.egg/pycuda/driver.py", line 138, in function_call
        Context.synchronize()
    pycuda._driver.LaunchError: cuCtxSynchronize failed: launch failed

    terminate called after throwing an instance of 'cuda::error'
      what():  cuMemFree failed: launch failed

What I am doing: I am calculating whether a point on the surface of a sphere (an atom) is exposed or occluded by other atoms, using a 3-D array of coordinates. The length of the coordinate vector can differ from run to run, so I have written a function that gives me "possible but not optimal" values for the block size and grid size to use when running the calculation on the GPU in 1-D. Thus, I only use the x dimension of blocks and grids here.
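For reference, here is a minimal sketch of how such a 1-D launch-configuration helper could look; the function names, the default block size of 256, and the hard-coded limits are my own assumptions, not the code from the traceback. Instead of factoring the element count exactly (which degenerates to tiny blocks and huge grids when the count has no nice divisors), it rounds the grid up and validates the result against the device limits before launching, so an impossible configuration fails with a clear Python exception rather than a `cuCtxSynchronize failed: launch failed`:

```python
# Hypothetical 1-D launch-configuration sketch (names and defaults are
# assumptions, not the poster's actual code).

# Limits of a compute-capability 1.x card such as the 9800 GTX+; the
# real values can be queried at runtime with
#   pycuda.driver.Context.get_device().get_attributes()
MAX_THREADS_PER_BLOCK = 512
MAX_GRID_DIM_X = 65535


def check_launch(block, grid,
                 max_threads=MAX_THREADS_PER_BLOCK,
                 max_grid_x=MAX_GRID_DIM_X):
    """Raise ValueError for an impossible (block, grid) configuration."""
    if block[0] * block[1] * block[2] > max_threads:
        raise ValueError("too many threads per block: %r" % (block,))
    if grid[0] > max_grid_x:
        raise ValueError("grid x dimension too large: %r" % (grid,))


def launch_config_1d(n, block_x=256):
    """Return ((bX, bY, bZ), (gX, gY)) for an n-element 1-D problem.

    The grid is rounded up, so the last block is only partially used.
    """
    grid_x = (n + block_x - 1) // block_x      # ceiling division
    block, grid = (block_x, 1, 1), (grid_x, 1)
    check_launch(block, grid)
    return block, grid
```

Because the grid is rounded up rather than matched exactly, the kernel would then need a guard at the top, something like `unsigned int i = blockIdx.x*blockDim.x + threadIdx.x; if (i >= n) return;`, so the threads in the padded tail do nothing.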
Here is an output of the blockSize.x, blockSize.y, gridSize.x, gridSize.y values that I calculate (each of these configurations subsequently runs successfully on the GPU, except where noted):

    (4, 1, 4363, 1)
    (499, 1, 40, 1)
    (499, 1, 40, 1)
    (458, 1, 50, 1)
    (458, 1, 50, 1)
    (380, 1, 68, 1)
    (380, 1, 68, 1)
    (297, 1, 97, 1)
    (297, 1, 97, 1)
    (3, 1, 10607, 1)
    (3, 1, 10607, 1)
    (3, 1, 11731, 1)
    (3, 1, 11731, 1)
    (63, 1, 619, 1)
    (63, 1, 619, 1)
    (33, 1, 1297, 1)
    (33, 1, 1297, 1)
    (3, 1, 15559, 1)
    (3, 1, 15559, 1)
    (411, 1, 123, 1)
    (411, 1, 123, 1)
    (21, 1, 2609, 1)
    (21, 1, 2609, 1)
    (487, 1, 122, 1)
    (487, 1, 122, 1)
    (475, 1, 135, 1)
    (475, 1, 135, 1)
    (26, 1, 2647, 1)
    (26, 1, 2647, 1)
    (2, 1, 36781, 1)
    (2, 1, 36781, 1)
    (74, 1, 1063, 1)
    (74, 1, 1063, 1)
    (473, 1, 178, 1)
    (473, 1, 178, 1)
    (493, 1, 182, 1)
    (493, 1, 182, 1)
    (1, 1, 95287, 1)  <-- interesting here, because 95287 is bigger than the
                          maximum x dimension of a grid (65535); thus I subtract
                          3, or 7, or some other odd number from the data and
                          run the same thing again, until we end up with
                          something possible
    (7, 1, 1, 1)
    (7, 1, 1, 1)
    (397, 1, 240, 1)  <-- here it crashes with the error shown above; this is
                          the major bunch of the previous data

My card, a 9800 GTX+, supports up to 512 threads per block. Does someone have an idea how I can track down the problem? Furthermore, after this operation, when I close the Python interpreter, PyCUDA is not able to free the memory on the device.

Ah yes, my CUDA call looks like this:

    isASAPoint(drv.Out(mask), drv.In(x1), drv.In(x2),
               drv.In(y1), drv.In(y2), drv.In(z1), drv.In(z2),
               drv.In(radii), npy.uint16(molxyz.shape[0]),
               drv.In(excluded_idx), drv.In(pocket), drv.In(lig_ref),
               npy.uint16(lig_ref.shape[0]),
               block=(bX,bY,1), grid=(gX,gY))

So I use the drv.In and drv.Out functions from PyCUDA.

Thanks in advance.

--
Peter Schmidtke

PhD Student at the Molecular Modeling and Bioinformatics Group
Dep.
Physical Chemistry
Faculty of Pharmacy
University of Barcelona

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://tiker.net/mailman/listinfo/pycuda_tiker.net