[PyCUDA] Stopping Criterion in for loops

slegrand Mon, 17 Oct 2016 03:13:10 -0700

Hello everybody,

I'm currently using PyCUDA and scikit-cuda to parallelize a simple code. Basically, I repeat this structure inside a for loop:


1-matrix/vector product (cublas.cublasDgemv)

2-elementwise division(cumisc.divide)

3-matrix/vector product

4-elementwise division

5-Error calculation

and I leave the loop when the error is small enough (you can see the code at the end of the mail). I want to calculate the error on the GPU and check with an if condition whether it is small enough before breaking the loop. error_dev and error_min_dev are both (1,) arrays, but when I try to compare them in the if condition, I get the following error:

File "./lib/Solvers/IPFP_GPU/functionsGPU.py", line 109, in solve_IPFP_simple_gpu

    if(error_dev < error_min_dev):
TypeError: an integer is required

and if I try to access the only element of these arrays:

File "./lib/Solvers/IPFP_GPU/functionsGPU.py", line 129, in solve_IPFP_simple_gpu

    if(error_dev[0] < error_min_dev[0]):

File "/home/slegrand/miniconda/lib/python2.7/site-packages/pycuda/gpuarray.py", line 838, in __getitem__

    array_shape = self.shape[array_axis]
IndexError: tuple index out of range
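
A possible explanation for the IndexError, sketched on the host with NumPy only: np.array(1e-3) is zero-dimensional (shape (), not (1,)), and assuming the GPU array keeps the shape of its NumPy source, there is no axis for [0] to index. The names error_min, threshold and error_min_1d below are illustrative and do not appear in the original code.

```python
import numpy as np

# Same construction as error_min_dev in the code below, but kept on the host.
error_min = np.array(1e-3).astype(np.float64)
print(error_min.shape)  # zero-dimensional: the shape tuple is empty

indexing_error = None
try:
    error_min[0]  # a 0-d array has no axis 0 to index
except IndexError as exc:
    indexing_error = exc

# A 0-d array converts directly to a Python float, which compares normally.
threshold = float(error_min)

# If indexing with [0] is wanted, the array needs an explicit (1,) shape.
error_min_1d = np.atleast_1d(error_min)
```

If this is indeed the cause, building the threshold from np.array([1e-3]) or converting the fetched value with float() would sidestep the indexing problem, although the comparison would still happen on the host.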

The only solution I found was to use get_async() and compare both arrays on the CPU, but I guess this is not the best solution... I wonder whether there is a way to compare these arrays without sending them back to the CPU.
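
For reference, the five-step loop with a host-side stopping test can be sketched in plain NumPy. This is a CPU illustration under assumptions, not the GPU code: ipfp_cpu is a hypothetical name, K is taken to have shape (m, n) with mu of length m and nu of length n, and the error uses the sum of absolute differences rather than the signed sum in the code at the end of the mail, so that a negative value cannot trigger an early exit.

```python
import numpy as np

def ipfp_cpu(K, mu, nu, error_min=1e-3, niter_max=1000):
    """CPU sketch: two products, two divisions, one scalar error per pass."""
    a = nu.copy()
    for it in range(niter_max):
        b = mu / K.dot(a)            # steps 1-2: product, then division
        a_new = nu / K.T.dot(b)      # steps 3-4: product, then division
        # Step 5: relative change of a (absolute differences, an assumption).
        error = np.abs(a_new - a).sum() / a.sum()
        a = a_new
        # The stopping test only needs one scalar on the host.
        if float(error) < error_min:
            break
    return a, b, it + 1

# Small synthetic problem with uniform marginals (made-up data).
rng = np.random.RandomState(0)
K = np.exp(-rng.rand(5, 4))
mu = np.full(5, 1.0 / 5)
nu = np.full(4, 1.0 / 4)
a, b, n_iter = ipfp_cpu(K, mu, nu)
```

In the GPU version, the analogous step would be fetching the single error value each iteration and comparing it as a Python float, which transfers only one number rather than the full arrays.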

On the other hand, I wonder how the for loop is controlled. How are the iterations synchronized with the GPU calculations?


Thanks for your time!

Best regards,

Simon Legrand


def solve_IPFP_simple_gpu(Mu, Nu, epsilon):

    dtype = np.float64
    mu = np.reshape(Mu.values,(1,np.size(Mu.values)))
    nu = np.reshape(Nu.values,(np.size(Nu.values),1))
    a = np.copy(nu)

    C = quad_cost_matrix(Mu.vertices, Nu.vertices)
    K = np.exp(-C/epsilon).astype(dtype)

    handle = cublas.cublasCreate()
    m = np.shape(K)[0]
    n = np.shape(K)[1]
    alpha = np.float64(1.0)
    beta = np.float64(0.0)

    mu_dev = gpuarray.to_gpu(mu)
    s1_dev = gpuarray.empty(mu.T.shape,dtype)
    nu_dev = gpuarray.to_gpu(nu)
    s2_dev = gpuarray.empty(nu.shape,dtype)
    K_dev = gpuarray.to_gpu(K)
    a_dev = gpuarray.to_gpu(a)
    an_dev = gpuarray.empty(a.shape,dtype)
    b_dev = gpuarray.to_gpu(mu)

    error_min_dev = gpuarray.to_gpu(np.array(1e-3).astype(np.float64))
    niter_max = 1000

    culinalg.init()

    for i in xrange(0, niter_max):

        cublas.cublasDgemv(handle, 't', m, n, alpha, K_dev.gpudata, m,
                           a_dev.gpudata, 1, beta, s1_dev.gpudata, 1)

        b_dev = cumisc.divide(mu_dev,culinalg.transpose(s1_dev))

        cublas.cublasDgemv(handle, 'n', n, m, alpha, K_dev.gpudata, n,
                           b_dev.gpudata, 1, beta, s2_dev.gpudata, 1)

        an_dev = cumisc.divide(nu_dev, s2_dev)

        error_dev = cumisc.divide(cumisc.sum(cumisc.subtract(an_dev, a_dev)),
                                  cumisc.sum(a_dev))

        a_dev = an_dev

        print(error_dev.get_async(), error_min_dev.get_async())
        if(error_dev < error_min_dev):
            break

    a = a_dev.get()
    b = b_dev.get()
    psi = np.reshape(epsilon*np.log(a),(np.size(a),))
    phi = np.reshape(epsilon*np.log(b),(np.size(b),))
    Gamma = K*a*b
    cublas.cublasDestroy(handle)
    return Gamma, phi, psi


