slegrand <legrand.sim...@gmail.com> writes:

> Hello everybody,
>
> I'm currently using pycuda and scikit-cuda to parallelize some simple code.
> Basically, I repeat this structure inside a for loop:
>
> 1. Matrix/vector product (cublas.cublasDgemv)
> 2. Elementwise division (cumisc.divide)
> 3. Matrix/vector product
> 4. Elementwise division
> 5. Error calculation
>
> and I leave the loop when the error is small enough (you can see the
> code at the end of the mail). I want to calculate the error on the GPU
> and check with an if condition whether it is small enough before
> breaking out of the loop. error_dev and error_min_dev are both (1,)
> arrays, but when I try to compare them in the if condition, I get the
> following error:
>
> File "./lib/Solvers/IPFP_GPU/functionsGPU.py", line 109, in solve_IPFP_simple_gpu
>      if(error_dev < error_min_dev):
> TypeError: an integer is required
>
> and if I try to access the only element of these arrays:
>
> File "./lib/Solvers/IPFP_GPU/functionsGPU.py", line 129, in solve_IPFP_simple_gpu
>      if(error_dev[0] < error_min_dev[0]):
>    File "/home/slegrand/miniconda/lib/python2.7/site-packages/pycuda/gpuarray.py", line 838, in __getitem__
>      array_shape = self.shape[array_axis]
> IndexError: tuple index out of range
>
> The only solution I found was to use get_async() and compare both
> arrays on the CPU, but I guess this is not the best solution. I
> wonder whether there is a way to compare these arrays without sending
> them back to the CPU.
>
> On the other hand, I wonder how the for loop is controlled. How are
> the iterations synchronized with the GPU calculations?

This code does something similar:

https://github.com/inducer/pycuda/blob/master/pycuda/sparse/cg.py

Looking through that may be helpful.
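
As far as I can tell, a Python if statement always needs a value on the
host, so one device-to-host copy per convergence test is hard to avoid.
What can be reduced is how often that copy happens. The IndexError quoted
above also suggests that error_dev has shape () rather than (1,); if so,
float(error_dev.get()) would be the way to read it. On synchronization, my
understanding is that kernel launches are queued asynchronously and run in
order on the same stream. The for loop therefore only enqueues work, and
the host waits only when it asks for a result, for example through get().

The sketch below is an illustration only, not code taken from cg.py. Plain
Python lists stand in for GPUArrays so that it runs without a GPU, and the
names solve_ipfp, matvec, transpose and check_interval are hypothetical.
It tests the error only every check_interval iterations:

```python
def matvec(A, x):
    """Dense matrix/vector product (stand-in for cublas.cublasDgemv)."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def solve_ipfp(A, p, q, tol=1e-10, max_iter=10000, check_interval=10):
    """Find u, v so that diag(u) * A * diag(v) has row sums p, column sums q.

    Returns (u, v, iterations, number_of_convergence_checks).
    """
    At = transpose(A)
    v = [1.0] * len(q)
    n_checks = 0
    for it in range(1, max_iter + 1):
        # Steps 1-2: matrix/vector product, then elementwise division.
        u = [pi / s for pi, s in zip(p, matvec(A, v))]
        # Steps 3-4: same with the transposed matrix.
        v = [qj / s for qj, s in zip(q, matvec(At, u))]

        # Step 5: only every check_interval iterations. On the GPU the
        # error would stay on the device until this point, and this is
        # where one blocking copy, e.g. float(error_dev.get()), would go.
        if it % check_interval == 0:
            row_sums = [ui * s for ui, s in zip(u, matvec(A, v))]
            error = max(abs(r - pi) for r, pi in zip(row_sums, p))
            n_checks += 1
            if error < tol:
                break
    return u, v, it, n_checks


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    u, v, it, n_checks = solve_ipfp(A, p=[0.4, 0.6], q=[0.5, 0.5])
    print("iterations:", it, "host-side checks:", n_checks)
```

The trade-off is that the loop may run up to check_interval - 1 iterations
longer than strictly needed, in exchange for far fewer points at which the
host has to wait for the GPU.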

Andreas

