+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:05:00.0     Off |                    0 |
| N/A   68C    P0   134W / 149W |     82MiB / 11439MiB |     93%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 0000:06:00.0     Off |                    0 |
| N/A   51C    P0   148W / 149W |     82MiB / 11439MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 0000:84:00.0     Off |                    0 |
| N/A   26C    P8    26W / 149W |     82MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 0000:85:00.0     Off |                    0 |
| N/A   24C    P8    29W / 149W |     82MiB / 11439MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     18956    C   nvidia-cuda-mps-server                          80MiB |
|    1     18956    C   nvidia-cuda-mps-server                          80MiB |
|    2     18956    C   nvidia-cuda-mps-server                          80MiB |
|    3     18956    C   nvidia-cuda-mps-server                          80MiB |
+-----------------------------------------------------------------------------+

On Thursday, April 27, 2017 at 5:48:15 PM UTC-4, nouiz wrote:
>
> What is the output of "nvidia-smi"?
>
> On Thu, Apr 27, 2017 at 3:53 PM anurag kumar <[email protected]> wrote:
>
>> By the way, I am aware of another, similar question
>> (https://groups.google.com/forum/#!topic/theano-users/l9FlhYIiWMo), but I
>> could not find a definitive answer in it.
>>
>> On Thursday, April 27, 2017 at 3:51:31 PM UTC-4, anurag kumar wrote:
>>>
>>> Hi,
>>> I have 4 Tesla K80 GPUs on my system and I want to run a different
>>> process on each of them at the same time: basically the same network with
>>> different parameters, so that I can train 4 different networks
>>> simultaneously.
>>>
>>> I have libgpuarray and pygpu installed. The first job, run as
>>> THEANO_FLAGS=device=cuda0 python training_run.py, works fine and uses the
>>> first GPU.
>>>
>>> But when I try to use the second GPU with THEANO_FLAGS=device=cuda1
>>> python training_run.py, it gives the error below and falls back to the
>>> CPU.
>>>
>>> --------------------------
>>> ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
>>> Traceback (most recent call last):
>>>   File "/usr/local/lib/python2.7/dist-packages/theano/gpuarray/__init__.py", line 164, in <module>
>>>     use(config.device)
>>>   File "/usr/local/lib/python2.7/dist-packages/theano/gpuarray/__init__.py", line 151, in use
>>>     init_dev(device)
>>>   File "/usr/local/lib/python2.7/dist-packages/theano/gpuarray/__init__.py", line 60, in init_dev
>>>     sched=config.gpuarray.sched)
>>>   File "pygpu/gpuarray.pyx", line 634, in pygpu.gpuarray.init (pygpu/gpuarray.c:9417)
>>>   File "pygpu/gpuarray.pyx", line 584, in pygpu.gpuarray.pygpu_init (pygpu/gpuarray.c:9108)
>>>   File "pygpu/gpuarray.pyx", line 1060, in pygpu.gpuarray.GpuContext.__cinit__ (pygpu/gpuarray.c:13470)
>>> GpuArrayException: cuMemAllocHost: CUDA_ERROR_MAP_FAILED: mapping of
>>> buffer object failed: 1
>>> ---------------------------
>>>
>>> *Is there a solution for this?*
>>>
>>> I tried *using the older CUDA backend* with THEANO_FLAGS=device=gpu0
>>> python training_run.py.
>>> In this case I can successfully use the first
>>> (gpu0) and second (gpu1) GPUs, but when I try running the 3rd job on gpu2
>>> with THEANO_FLAGS=device=gpu2 python training_run.py, it gives the
>>> following error:
>>>
>>> Traceback (most recent call last):
>>>   File "training_run.py", line 1, in <module>
>>>     import lasagne.layers as layers
>>>   File "/usr/local/lib/python2.7/dist-packages/lasagne/__init__.py", line 12, in <module>
>>>     import theano
>>>   File "/usr/local/lib/python2.7/dist-packages/theano/__init__.py", line 108, in <module>
>>>     import theano.sandbox.cuda
>>>   File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/__init__.py", line 728, in <module>
>>>     use(device=config.device, force=config.force_device, test_driver=False)
>>>   File "/usr/local/lib/python2.7/dist-packages/theano/sandbox/cuda/__init__.py", line 518, in use
>>>     cuda_initialization_error_message))
>>> *EnvironmentError: You forced the use of gpu device gpu0, but CUDA
>>> initialization failed with error:*
>>> *Unable to get the number of gpus available: OS call failed or operation
>>> not supported on this OS*
>>>
>>> This is very strange. All my GPUs work just fine on their own, so there is
>>> no problem with gpu2 itself. In fact, if I run the first 2 jobs on gpu1
>>> and gpu2 and then try using gpu0, it gives the same error.
>>>
>>> Any help is much appreciated.
>>>
>>> Thanks,
>>> Anurag
>>>
>> --
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "theano-users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
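For anyone trying the same workflow, the launch pattern described above (one training process per device, selected via THEANO_FLAGS) can be sketched as a small shell script. This is a minimal sketch, not from the thread itself: it assumes the training_run.py script mentioned by the poster, and the log file names are illustrative.

```shell
#!/bin/sh
# Launch one training process per K80, each pinned to its own device
# through THEANO_FLAGS, with per-GPU log files; then wait for all four.
for i in 0 1 2 3; do
    THEANO_FLAGS="device=cuda$i" python training_run.py > "train_gpu$i.log" 2>&1 &
done
wait   # blocks until all four background jobs have exited
```

Each job inherits its own THEANO_FLAGS value, so the four processes never share a device context; the per-GPU log files make it easy to see which run (if any) fell back to the CPU.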
