Noah Young <[email protected]> writes:
> I'm trying to run jobs on several GPUs at the same time using multiple
> threads, each with its own context. Sometimes this works flawlessly, but
> ~75% of the time I get a cuModuleLoadDataEx error telling me the context
> has been destroyed. What's frustrating is that nothing changes between
> failed and successful runs of the code. From what I can tell it's down to
> luck whether or not the error comes up:

"Context destroyed" is akin to a segmentation fault on the CPU. You should
find evidence in the kernel log that your code performed an illegal access,
e.g., by running 'dmesg'. (If you see a message "NVRM Xid ...", that points
to the problem.) My first suspicion would be a bug in your code.

Andreas
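
For convenience, the kernel-log check suggested above could be scripted roughly as follows. This is only a sketch: the helper names `read_kernel_log` and `find_xid_lines` are made up for illustration, and it assumes that 'dmesg' is on the PATH and readable by the current user (on some systems it needs root).

```python
import subprocess


def read_kernel_log():
    """Return the kernel ring buffer as text by running 'dmesg'.

    Assumption: 'dmesg' is available and the current user may read the log.
    """
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    )
    return result.stdout


def find_xid_lines(log_text):
    """Return the log lines that mention both "NVRM" and "Xid"."""
    return [
        line
        for line in log_text.splitlines()
        if "NVRM" in line and "Xid" in line
    ]
```

Running `find_xid_lines(read_kernel_log())` right after a failed run and getting a non-empty list would suggest that the driver logged a GPU fault around that time.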
_______________________________________________ PyCUDA mailing list [email protected] https://lists.tiker.net/listinfo/pycuda
