slegrand <> writes:

> Hello everybody,
> I'm currently using pycuda and scikit-cuda to parallelize a simple code. 
> Basically I repeat this structure inside a for loop:
> 1-matrix/vector product (cublas.cublasDgemv)
> 2-elementwise division(cumisc.divide)
> 3-matrix/vector product
> 4-elementwise division
> 5-Error calculation
> and I leave the loop when the error is small enough (you can see the
> code at the end of the mail). I want to calculate the error on the GPU
> and check with an if condition whether it is small enough before
> breaking the loop. error_dev and error_min_dev are both (1,) arrays, but
> when I try to compare them in the if condition, I get the following error:
> File "./lib/Solvers/IPFP_GPU/", line 109, in solve_IPFP_simple_gpu
>      if(error_dev < error_min_dev):
> TypeError: an integer is required
> and if I try to access the only element of these arrays:
> File "./lib/Solvers/IPFP_GPU/", line 129, in solve_IPFP_simple_gpu
>      if(error_dev[0] < error_min_dev[0]):
> File "/home/slegrand/miniconda/lib/python2.7/site-packages/pycuda/", line 838, in __getitem__
>      array_shape = self.shape[array_axis]
> IndexError: tuple index out of range
> The only solution I found was to use get_async() and compare both
> arrays on the CPU, but I guess this is not the best solution. Is there
> a way to compare these arrays without sending them back to the CPU?
> Separately, I wondered how the for loop is controlled. How are
> the iterations synchronized with the GPU calculations?

This code does something similar:

Looking through that may be helpful.
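
As a rough illustration of the control flow (not the code referred to above), here is a minimal sketch in plain Python. A Python `if` needs a host-side value, so the error has to be read back at some point. Kernel launches are queued asynchronously, and a blocking read such as `error_dev.get()` is what makes the host wait for the queued work. One common way to limit that cost is to read the error only every few iterations. Everything in the sketch is an assumption made for illustration: the function name `solve_ipfp_sketch`, the parameter `check_every`, the particular update formulas, and the use of Python lists as a stand-in for GPU arrays.

```python
def matvec(rows, x):
    """Dense matrix/vector product on plain lists (stand-in for cublasDgemv)."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def solve_ipfp_sketch(A, p, q, tol=1e-10, max_iter=10000, check_every=10):
    """Hypothetical CPU stand-in for the five-step loop in the question.

    The update formulas are an assumed scaling iteration, not the poster's code.
    Returns (u, v, iterations_run, number_of_host_reads).
    """
    At = [list(col) for col in zip(*A)]  # transpose of A
    u = [1.0] * len(p)
    v = [1.0] * len(q)
    host_reads = 0
    it = 0
    for it in range(1, max_iter + 1):
        Atu = matvec(At, u)                           # 1: matrix/vector product
        v = [qj / s for qj, s in zip(q, Atu)]         # 2: elementwise division
        Av = matvec(A, v)                             # 3: matrix/vector product
        u_new = [pi / s for pi, s in zip(p, Av)]      # 4: elementwise division
        error = max(abs(a - b) for a, b in zip(u_new, u))  # 5: error
        u = u_new
        # On the GPU, `error` would live on the device. Reading it back
        # (for example error_dev.get().item() with PyCUDA) blocks the host,
        # so only do it every `check_every` iterations.
        if it % check_every == 0:
            host_reads += 1
            if error < tol:
                break
    return u, v, it, host_reads


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    u, v, it, reads = solve_ipfp_sketch(A, [0.4, 0.6], [0.5, 0.5])
    print("iterations:", it, "host reads:", reads)
```

The trade-off is that the loop may run up to `check_every - 1` iterations longer than strictly necessary, in exchange for fewer host/device synchronizations.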


PyCUDA mailing list
