slegrand <[email protected]> writes:
> Hello everybody,
>
> I'm currently using pycuda and scikit-cuda to parallelize a simple code.
> Basically I repeat this structure inside a for loop:
>
> 1-matrix/vector product (cublas.cublasDgemv)
>
> 2-elementwise division(cumisc.divide)
>
> 3-matrix/vector product
>
> 4-elementwise division
>
> 5-Error calculation
>
> and I leave the loop when the error is small enough (You can see the
> code at the end of the mail). I want to calculate the error on the GPU
> and check with a if condition if it's small enough before breaking the
> loop. error_dev and error_min_dev are both (1,) array but when I try to
> compare them in the if condition, I get the following error:
>
> File "./lib/Solvers/IPFP_GPU/functionsGPU.py", line 109, in
> solve_IPFP_simple_gpu
> if(error_dev < error_min_dev):
> TypeError: an integer is required
>
> and if I try to access to the only element of these arrays:
>
> File "./lib/Solvers/IPFP_GPU/functionsGPU.py", line 129, in
> solve_IPFP_simple_gpu
> if(error_dev[0] < error_min_dev[0]):
> File
> "/home/slegrand/miniconda/lib/python2.7/site-packages/pycuda/gpuarray.py",
> line 838, in __getitem__
> array_shape = self.shape[array_axis]
> IndexError: tuple index out of range
>
> The only solution I found was to use the get_async() and to compare both
> arrays on the CPU but I guess this is not the best solution... I
> wondered if there is a way to compare these arrays without sending them
> back to the CPU.
>
> On the other hand, I wondered how is controlled the for loop. How are
> the iterations synchronized with the GPU calculations?

This code does something similar:

https://github.com/inducer/pycuda/blob/master/pycuda/sparse/cg.py

Looking through that may be helpful.

Andreas

_______________________________________________
PyCUDA mailing list
[email protected]
https://lists.tiker.net/listinfo/pycuda
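
A note on the pattern being asked about, as a minimal sketch and not a copy of what cg.py does. A Python "if" needs a value on the host, so the error has to cross from the GPU to the CPU at some point. As far as I know, kernel launches are queued asynchronously, so the for loop runs ahead on the host and only waits when something blocks, such as a get() on a GPUArray. One common way to keep that cost low is to fetch the error only every few iterations. The names iterate_until_converged, make_halving_step and check_every below are hypothetical, and the step is a CPU stand-in so the sketch runs without a GPU. With PyCUDA, step would queue the cublasDgemv and divide calls and return the error GPUArray, and to_host could be something like lambda e: e.get().item() (an assumption about the setup, not tested here).

```python
def iterate_until_converged(step, to_host, tol, max_iter=1000, check_every=10):
    """Run step() repeatedly, copying the error to the host only occasionally.

    step     -- hypothetical callable that performs (or queues) one iteration
                and returns the current error in the backend's own form.
    to_host  -- callable turning that error into a Python float; for a
                device array this is the blocking transfer.
    Returns (iterations_run, last_error_seen_on_host).
    """
    it = 0
    err_host = float("inf")
    for it in range(1, max_iter + 1):
        err = step()
        # Only synchronize every check_every iterations (and on the last one),
        # so most iterations never wait for a device-to-host copy.
        if it % check_every == 0 or it == max_iter:
            err_host = to_host(err)
            if err_host < tol:
                break
    return it, err_host


def make_halving_step():
    """CPU stand-in for the GPU work: the 'error' halves on every call."""
    state = {"err": 1.0}

    def step():
        state["err"] /= 2.0
        return state["err"]

    return step


if __name__ == "__main__":
    # Checking every 5 iterations: converged at 7, noticed at 10.
    print(iterate_until_converged(make_halving_step(), float, tol=1e-2, check_every=5))
    # Checking every iteration: stops at 7, at the price of a transfer per step.
    print(iterate_until_converged(make_halving_step(), float, tol=1e-2, check_every=1))
```

The trade-off is that the loop may run up to check_every - 1 iterations more than strictly needed, in exchange for far fewer blocking transfers.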
