Freddie Witherden <[email protected]> writes:
> I am wondering if anyone has worked up a class to automatically select a
> suitable thread block dimension given a function, nrow and ncol. I know
> using OccupancyRecord I can determine the occupancy for a given number
> of threads but it does not appear to be able to solve the inverse problem.
>
> While I know there is more to performance than just occupancy it does
> often correlate with performance.

I know of no such thing, but I do see the usefulness. Whether it should
be a class or a function and many of the details are of course up in the
air, but I could imagine accepting something like this into PyCUDA, if
you've got time to work on it.

Andreas

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
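
For illustration only, the "inverse problem" from the quoted message could be
approached by brute force: evaluate the occupancy for every warp-multiple
thread count and keep the best one. The sketch below is not part of PyCUDA.
The names best_block_size and toy_occupancy are hypothetical, and the device
limits in toy_occupancy (1536 threads and 8 blocks per multiprocessor) are
made-up numbers so that the example runs without a GPU. It also ignores the
nrow/ncol shaping of the block, which would need a further step.

```python
from typing import Callable, Tuple


def best_block_size(
    occupancy_for: Callable[[int], float],
    max_threads: int = 1024,
    warp_size: int = 32,
) -> Tuple[int, float]:
    """Return (threads, occupancy) for the best warp-multiple block size.

    occupancy_for maps a thread count to an occupancy in [0, 1].
    Ties are resolved in favour of the smaller block size.
    """
    if max_threads < warp_size:
        raise ValueError("max_threads must be at least one warp")

    best_threads, best_occ = 0, -1.0
    for threads in range(warp_size, max_threads + 1, warp_size):
        occ = occupancy_for(threads)
        if occ > best_occ:  # strict '>' keeps the smallest block on ties
            best_threads, best_occ = threads, occ
    return best_threads, best_occ


def toy_occupancy(threads: int) -> float:
    """Hypothetical occupancy model, limited only by threads and blocks per MP."""
    max_threads_per_mp = 1536  # assumed device limit, for illustration
    max_blocks_per_mp = 8      # assumed device limit, for illustration
    blocks = min(max_blocks_per_mp, max_threads_per_mp // threads)
    return blocks * threads / max_threads_per_mp


# With PyCUDA, occupancy_for could presumably wrap OccupancyRecord instead,
# along these lines (assumption, not run here; 'func' is a compiled kernel):
#   from pycuda.tools import DeviceData, OccupancyRecord
#   devdata = DeviceData()
#   occupancy_for = lambda t: OccupancyRecord(
#       devdata, t, shared_mem=func.shared_size_bytes,
#       registers=func.num_regs).occupancy

if __name__ == "__main__":
    print(best_block_size(toy_occupancy))
```

As the quoted message notes, occupancy is only one factor in performance, so
a result like this is at most a starting point for timing actual kernel runs.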
