> If by 'working' you mean 'actually overlapping', here's an additional
> subtlety. If 'exec' includes any kind of memory allocations, those are
> implicitly synchronization points--so you might be synchronizing without
> even seeing it. A memory pool would be a good solution for that (but
> would only help on the second run through).
>
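To illustrate the point about a pool only helping on the second run through, here is a minimal sketch in plain Python (no GPU needed). CountingAllocator, SimplePool, run_once and count_allocator_calls are hypothetical names made up for this example; the counter stands in for the device allocations that would act as implicit synchronization points. PyCUDA ships a real pool as pycuda.tools.DeviceMemoryPool; this toy only mimics the reuse idea and is not its implementation.

```python
# Toy illustration only: plain Python, no GPU required.
# All class and function names here are hypothetical.


class CountingAllocator:
    """Stands in for a device allocator whose calls would synchronize."""

    def __init__(self):
        self.calls = 0

    def alloc(self, nbytes):
        self.calls += 1
        return bytearray(nbytes)


class SimplePool:
    """Keeps released blocks, keyed by size, and hands them out again."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.free_blocks = {}  # size in bytes -> list of reusable blocks

    def allocate(self, nbytes):
        blocks = self.free_blocks.get(nbytes)
        if blocks:
            # Reuse a cached block: the underlying allocator is not called.
            return blocks.pop()
        return self.allocator.alloc(nbytes)

    def release(self, block):
        self.free_blocks.setdefault(len(block), []).append(block)


def run_once(pool, sizes):
    """One 'exec' pass: grab every buffer, then give them all back."""
    blocks = [pool.allocate(n) for n in sizes]
    for block in blocks:
        pool.release(block)


def count_allocator_calls(passes, sizes=(1024, 4096, 1024)):
    """Return how many real allocations each pass triggered."""
    allocator = CountingAllocator()
    pool = SimplePool(allocator)
    counts = []
    for _ in range(passes):
        before = allocator.calls
        run_once(pool, sizes)
        counts.append(allocator.calls - before)
    return counts
```

With the default sizes, the first pass goes to the underlying allocator for every buffer, and later passes are served entirely from the pool. That is why the benefit only shows up from the second run onward.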
pyFFT (and my toy code) only allocate memory at the start. Otherwise we
would not see overlap in "Working.py".

> If however 'not working' means 'wrong results', then something's even
> more fishy.
>
> Andreas

By "working" I mean overlapping exec and mem-copy.

-Magnus

_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda