I'm learning CUDA and decided to use Python with ctypes to call the CUDA
functions, but I'm running into memory issues. I've boiled it down to the
simplest scenario: I use ctypes to call a CUDA function that allocates
memory on the device and then frees it. This works fine, but if I then call
np.dot on a completely unrelated array declared in Python, I get a
segmentation fault. Note this only happens if the numpy array is
sufficiently large. If I change the cudaMalloc calls to plain C mallocs, all
the problems go away, but that's not really helpful. Any ideas what's going
on here?

CUDA CODE (debug.cu): 

#include <stdio.h>
#include <cuda_runtime.h>

extern "C" {
void all_together(size_t N)
{
    void *d;
    size_t size = N * sizeof(float);
    cudaError_t err;

    err = cudaMalloc(&d, size);
    if (err != cudaSuccess)
        printf("cuda malloc error: %s\n", cudaGetErrorString(err));

    err = cudaFree(d);
    if (err != cudaSuccess)
        printf("cuda free error: %s\n", cudaGetErrorString(err));
}
}

PYTHON CODE (master.py):

import numpy as np
import ctypes
from ctypes import c_size_t

dll = ctypes.CDLL('./cuda_lib.so', mode=ctypes.RTLD_GLOBAL)

def build_all_together_f(dll):
    func = dll.all_together
    func.argtypes = [c_size_t]
    func.restype = None  # the C function returns void
    return func

__pycu_all_together = build_all_together_f(dll)


if __name__ == '__main__':
    N = 5001  # if this is smaller, the error doesn't show up

    a = np.random.randn(N).astype('float32')

    __pycu_all_together(N)

    # toggle this line on/off to get the error
    #np.dot(a, a)

    print('end of python')

COMPILE: nvcc -Xcompiler -fPIC -shared -o cuda_lib.so debug.cu

RUN: python master.py



--
View this message in context: 
http://numpy-discussion.10968.n7.nabble.com/numpy-dot-causes-segfault-after-ctypes-call-to-cudaMalloc-tp35910.html
Sent from the Numpy-discussion mailing list archive at Nabble.com.
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion