Hi all,

I have a problem that can be split into pieces of different sizes, and essentially the larger each piece is, the more efficiently it runs. However, on Windows (and I understand something similar happens on Linux), a single GPU kernel launch cannot run for more than about 5 seconds on XP or 2 seconds on Vista/Win7, or the Timeout Detection and Recovery (TDR) system will terminate it and raise an error (also causing the screen to flash).

My problem is that I want each kernel launch to run for as long as possible for maximum efficiency, but I don't know in advance how long a launch will take as a function of problem size. I could profile my functions and work out something that would probably work, but this is for a software package that will be used by third parties, so I'd like it to be handled automatically, and preferably without the screen flashes, which will disturb users.
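In case it helps the discussion: the "profile and work something out" idea could be automated at runtime by timing one deliberately small pilot launch and extrapolating, assuming runtime grows roughly linearly with problem size. A minimal sketch of that idea (the names `pick_chunk_size` and `run_chunk` and the linearity assumption are mine, not anything PyCUDA provides; note that with real PyCUDA launches you would need to synchronize, e.g. `pycuda.driver.Context.synchronize()`, before reading the clock, since launches are asynchronous):

```python
import time

TDR_BUDGET = 2.0   # seconds: the Vista/Win7 default mentioned above
SAFETY = 0.5       # only spend half the budget, since scaling is not exactly linear

def pick_chunk_size(run_chunk, pilot_size=10_000):
    """Time a small pilot launch of `run_chunk` and extrapolate a chunk
    size expected to stay inside the watchdog budget, assuming runtime
    is roughly proportional to problem size."""
    t0 = time.perf_counter()
    run_chunk(pilot_size)           # small enough that it cannot time out
    elapsed = time.perf_counter() - t0
    per_element = elapsed / pilot_size
    # Extrapolate, but never go below the pilot size itself.
    return max(pilot_size, int(TDR_BUDGET * SAFETY / per_element))

# Toy usage with a fake "kernel" that just sleeps proportionally to its size:
chunk = pick_chunk_size(lambda n: time.sleep(n * 1e-6))
print(chunk)
```

The obvious weakness is that a pilot run measures launch overhead as well as per-element cost, so the extrapolation is conservative for large chunks; the safety factor is there to absorb nonlinearity rather than fix it.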

Has anyone worked out a good way of dealing with this?

One option is to increase the TDR window as detailed in:

http://www.microsoft.com/whdc/device/display/wddm_timeout.mspx

This might have adverse effects though, and I'm not sure all users of my package would be happy editing these values (it's also not something I can do automatically for them).
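For anyone who does go down that route, the Microsoft page above documents registry values under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`. A sketch of a .reg file that raises the timeout to 10 seconds (value name from that page; applying it needs admin rights, takes effect after a reboot, and affects every application on the machine, so it's very much an opt-in tweak rather than something a package should do silently):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
"TdrDelay"=dword:0000000a
```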

Another option is to have two GPUs, one of which is not attached to a monitor and is only used for compute (as discussed at http://forums.nvidia.com/index.php?showtopic=171630). Again, fine for me (I have two), but not so good for users, who I guess will in many cases only have one.

A final option I thought of would be to check for a launch-timeout failure after each kernel launch and, if one occurs, divide my problem size by two and try again, repeating until the launches succeed. The trouble with this approach is that I'll get multiple failures and screen flashes before it settles on a size that works, wasting a little time but, more importantly, being quite alarming. It also doesn't feel very elegant... ;-)

Any other ideas or experiences dealing with this problem?

Dan

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
