Hi,
I have 4 Tesla K80 GPUs on my system and want to run a different process on
each of them at the same time. Basically, each process trains a network with
different parameters, so that I can train 4 different networks at the same time.
I have libgpuarray and pygpu installed. The first job, started
as THEANO_FLAGS=device=cuda0 python training_run.py, runs just fine and uses
the first GPU.
But when I try to use the second GPU with THEANO_FLAGS=device=cuda1 python
training_run.py, it gives the error below and falls back to the CPU.
--------------------------
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/theano/gpuarray/__init__.py", line 164, in <module>
    use(config.device)
  File "/usr/local/lib/python2.7/dist-packages/theano/gpuarray/__init__.py", line 151, in use
    init_dev(device)
  File "/usr/local/lib/python2.7/dist-packages/theano/gpuarray/__init__.py", line 60, in init_dev
    sched=config.gpuarray.sched)
  File "pygpu/gpuarray.pyx", line 634, in pygpu.gpuarray.init (pygpu/gpuarray.c:9417)
  File "pygpu/gpuarray.pyx", line 584, in pygpu.gpuarray.pygpu_init (pygpu/gpuarray.c:9108)
  File "pygpu/gpuarray.pyx", line 1060, in pygpu.gpuarray.GpuContext.__cinit__ (pygpu/gpuarray.c:13470)
GpuArrayException: cuMemAllocHost: CUDA_ERROR_MAP_FAILED: mapping of buffer object failed: 1
---------------------------
*Is there a solution for this?*
I also tried *using the older cuda backend* with THEANO_FLAGS=device=gpu0 python
training_run.py. In this case I am able to successfully use the first (gpu0)
and second (gpu1) GPUs, but when I try running the 3rd job on gpu2
with THEANO_FLAGS=device=gpu2 python training_run.py, it gives the following
error:
Traceback (most recent call last):
  File "training_run.py", line 1, in <module>
    import lasagne.layers as layers
  File "/usr/local/lib/python2.7/dist-packages/lasagne/__init__.py", line 12, in <module>
    import theano
  File "/usr/local/lib/python2.7/dist-packages/theano/__init__.py", line 108, in <module>
    import theano.sandbox.cuda
  File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/__init__.py", line 728, in <module>
    use(device=config.device, force=config.force_device, test_driver=False)
  File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/__init__.py", line 518, in use
    cuda_initialization_error_message))
*EnvironmentError: You forced the use of gpu device gpu0, but CUDA initialization failed with error:*
*Unable to get the number of gpus available: OS call failed or operation not supported on this OS*
This is very strange. By the way, all my GPUs work just fine, so there is
no problem with gpu2 itself. In fact, if I run the first 2 jobs on gpu1 and
gpu2 and then try using gpu0, it gives the same error.
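To make the setup easier to reproduce, here is a minimal sketch of how the jobs could be started from one Python script instead of separate shells. It is only an illustration, not the exact way the jobs above were run: the helper names theano_env and launch_jobs are made up, and the only thing it does is set THEANO_FLAGS=device=... per process, as in the commands above, so it would presumably run into the same errors.

```python
import os
import subprocess
import sys


def theano_env(device, base_env=None):
    """Return a copy of the environment with THEANO_FLAGS selecting `device`.

    Hypothetical helper. Note that it overwrites any THEANO_FLAGS value
    already present in the base environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["THEANO_FLAGS"] = "device=" + device
    return env


def launch_jobs(devices, script="training_run.py"):
    """Start one process per device name and return the Popen handles.

    Hypothetical helper. Device names are passed through unchanged, e.g.
    cuda0..cuda3 for the new backend or gpu0..gpu3 for the old one.
    """
    procs = []
    for device in devices:
        # Each child gets its own environment, so the device choice of one
        # job does not leak into the others.
        procs.append(
            subprocess.Popen([sys.executable, script], env=theano_env(device))
        )
    return procs
```

Calling launch_jobs(["cuda0", "cuda1", "cuda2", "cuda3"]) and then wait() on each returned handle would correspond to the four jobs described above.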
Any help is much appreciated.
Thanks,
Anurag