Re: [theano-users] Re: Different processes on different gpus

Richard Hankins Thu, 04 May 2017 02:06:08 -0700

Hi,

Sorry for the late reply; I haven't looked at this in a while. I just
checked my setup: I've got device = cpu in .theanorc, and I'm not
setting force_device.
According to nvidia-smi, both my GPUs are in the default compute mode. To
select a device I use the following, with X replaced by the GPU number:


import theano.sandbox.cuda
theano.sandbox.cuda.use("gpuX")
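
A different approach, not taken from this thread, is to hide the other
card from each process with the CUDA_VISIBLE_DEVICES environment variable
before Theano is imported. The following is only a minimal sketch: the
names gpu_env and launch are made up for illustration, and it assumes the
old CUDA back-end, where device=gpu takes the first visible GPU.

```python
import os
import subprocess
import sys


def gpu_env(gpu_index, base_env=None):
    """Return a copy of the environment that exposes only one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # CUDA renumbers the visible devices, so the chosen card appears
    # as device 0 inside the child process.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    # Keep any existing Theano flags, but replace the device setting.
    # Assumption: old CUDA back-end, where "device=gpu" means
    # "use the first visible GPU".
    flags = [f for f in env.get("THEANO_FLAGS", "").split(",")
             if f and not f.startswith("device=")]
    flags.append("device=gpu")
    env["THEANO_FLAGS"] = ",".join(flags)
    return env


def launch(script, gpu_index):
    """Start a script in its own Python process tied to one GPU."""
    return subprocess.Popen([sys.executable, script],
                            env=gpu_env(gpu_index))
```

With this, something like launch("experiment.py", 0) and
launch("test_code.py", 1) (hypothetical script names) would start one
process per card, and neither process should be able to see the other GPU.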

Hope this helps,

Richard

On Thu, Apr 27, 2017 at 8:33 PM, anurag kumar <[email protected]> wrote:

> Is there a final solution to this problem? I am having a similar problem.
>
> Best,
> Anurag
>
> On Sunday, May 1, 2016 at 9:24:59 AM UTC-4, RHankins wrote:
>>
>> Sorry I meant "One on gpu0 and one on gpu1 (It begins by running a
>> process of gpu1 then starts another on gpu0)".
>>
>> On Sunday, May 1, 2016 at 2:19:58 PM UTC+1, RHankins wrote:
>>>
>>> Right. I know why it was throwing an error when I had device=cpu: I
>>> also had force_device=True. Setting device=cpu and not setting
>>> force_device allows me to select different gpus using
>>> theano.sandbox.cuda.use().
>>>
>>> But I'm still having the problem with it running on multiple gpus. If I
>>> select gpu0 (Titan X) it runs a single process on the correct gpu. If I
>>> select gpu1 (GTX 980) to run exactly the same code it runs 2 processes. One
>>> on gpu0 and one on gpu1 (It begins by running a process of gpu0 then starts
>>> another on gpu1). It doesn't matter whether they are run simultaneously or
>>> not, or whether something was already running on gpu0 or gpu1 at the
>>> time. Should I use nvidia-smi to force the code to run on a single
>>> gpu using
>>>
>>> nvidia-smi --compute-mode=EXCLUSIVE_PROCESS?
>>>
>>>
>>> My only concern is that if I want to run different programs at the same
>>> time I will end up having multiple processes running on the same gpu. Will
>>> they interfere with each other if they are importing the same modules? Is it
>>> okay to run multiple processes on the same gpu? Will it affect the results?
>>> Or does it not matter?
>>>
>>> Cheers,
>>>
>>> R
>>>
>>> On Friday, April 29, 2016 at 9:55:33 PM UTC+1, nouiz wrote:
>>>>
>>>> You can ignore those 2 errors. It is just that those tests seem too
>>>> sensitive.
>>>>
>>>> If you set a device in your theanorc file that isn't 'cpu' and call
>>>> use() on another one, it is normal that Theano doesn't like this, as only 1
>>>> GPU is supported in the current back-end. The new one supports multiple GPUs.
>>>>
>>>> Does it work if you try to use gpu0? Was something already running
>>>> on gpu1?
>>>>
>>>> Fred
>>>>
>>>> On Fri, Apr 29, 2016 at 11:52 AM, RHankins <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Fred,
>>>>>
>>>>> Thanks for your response. I'm using cuda back-end so I didn't install
>>>>> libgpuarray. Or am I supposed to install libgpuarray as well? When I say
>>>>> tests, I just mean testing some new code out not running theano tests (as
>>>>> in nose tests).
>>>>>
>>>>> I'm using Lasagne as I saw that you suggested to someone else to use
>>>>>
>>>>> import theano.sandbox.cuda
>>>>>
>>>>> theano.sandbox.cuda.use("gpu1")
>>>>>
>>>>>
>>>>> If in .theanorc device = gpu0 I get the following message
>>>>>
>>>>> WARNING (theano.sandbox.cuda): Ignoring call to use(1), GPU number 0
>>>>> is already in use.
>>>>>
>>>>>
>>>>> If in .theanorc device = cpu I get the following message
>>>>>
>>>>> WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu1 is
>>>>> not available (error: cuda unavailable)
>>>>>
>>>>>
>>>>> I updated Theano and Lasagne to the latest versions, (0.9.0dev0) and
>>>>> (0.2.dev1) respectively, but I've still got the same problem. In
>>>>> addition, theano.test() now fails to pass. I get the
>>>>> following errors
>>>>>
>>>>>
>>>>> ======================================================================
>>>>> ERROR: test_grad (theano.tensor.tests.test_basic.ArctanhInplaceTester)
>>>>> ----------------------------------------------------------------------
>>>>> Traceback (most recent call last):
>>>>> File 
>>>>> "/usr/local/lib/python2.7/dist-packages/theano/tensor/tests/test_basic.py",
>>>>> line 483, in test_grad
>>>>> eps=_grad_eps)
>>>>> File 
>>>>> "/usr/local/lib/python2.7/dist-packages/theano/tests/unittest_tools.py",
>>>>> line 91, in verify_grad
>>>>> T.verify_grad(op, pt, n_tests, rng, *args, **kwargs)
>>>>> File "/usr/local/lib/python2.7/dist-packages/theano/gradient.py",
>>>>> line 1709, in verify_grad
>>>>> abs_tol, rel_tol)
>>>>> GradientError: GradientError: numeric gradient and analytic gradient
>>>>> exceed tolerance:
>>>>> At position 4 of argument 0,
>>>>> abs. error = 3.537018, abs. tolerance = 0.010000
>>>>> rel. error = 0.013429, rel. tolerance = 0.010000
>>>>> Exception args:
>>>>> The error happened with the following inputs:, [array([[ 0.28898013,
>>>>> 0.98691875, -0.37341487],
>>>>> [-0.83661169, -0.99454761, -0.57619613]], dtype=float32)],
>>>>> The value of eps is:, None,
>>>>> The out_type is:, None, Test arctanh_inplace::normal: Error occurred
>>>>> while computing the gradient on the following inputs: [array([[ 
>>>>> 0.28898013,
>>>>> 0.98691875, -0.37341487],
>>>>> [-0.83661169, -0.99454761, -0.57619613]], dtype=float32)]
>>>>>
>>>>> ======================================================================
>>>>> ERROR: test_grad (theano.tensor.tests.test_basic.ArctanhTester)
>>>>> ----------------------------------------------------------------------
>>>>> Traceback (most recent call last):
>>>>> File 
>>>>> "/usr/local/lib/python2.7/dist-packages/theano/tensor/tests/test_basic.py",
>>>>> line 483, in test_grad
>>>>> eps=_grad_eps)
>>>>> File 
>>>>> "/usr/local/lib/python2.7/dist-packages/theano/tests/unittest_tools.py",
>>>>> line 91, in verify_grad
>>>>> T.verify_grad(op, pt, n_tests, rng, *args, **kwargs)
>>>>> File "/usr/local/lib/python2.7/dist-packages/theano/gradient.py",
>>>>> line 1709, in verify_grad
>>>>> abs_tol, rel_tol)
>>>>> GradientError: GradientError: numeric gradient and analytic gradient
>>>>> exceed tolerance:
>>>>> At position 4 of argument 0,
>>>>> abs. error = 3.537018, abs. tolerance = 0.010000
>>>>> rel. error = 0.013429, rel. tolerance = 0.010000
>>>>> Exception args:
>>>>> The error happened with the following inputs:, [array([[ 0.28898013,
>>>>> 0.98691875, -0.37341487],
>>>>> [-0.83661169, -0.99454761, -0.57619613]], dtype=float32)],
>>>>> The value of eps is:, None,
>>>>> The out_type is:, None, Test Elemwise{arctanh,no_inplace}::normal:
>>>>> Error occurred while computing the gradient on the following inputs:
>>>>> [array([[ 0.28898013, 0.98691875, -0.37341487],
>>>>> [-0.83661169, -0.99454761, -0.57619613]], dtype=float32)]
>>>>>
>>>>> ----------------------------------------------------------------------
>>>>> Ran 3028 tests in 1688.020s
>>>>>
>>>>> FAILED (SKIP=108, errors=2)
>>>>>
>>>>>
>>>>> On Thursday, April 28, 2016 at 2:54:36 AM UTC+1, nouiz wrote:
>>>>>>
>>>>>> Did you install the new gpu back-end libgpuarray? If so, we know
>>>>>> there is a problem like the one you describe, but I only saw it in
>>>>>> Theano tests. When you say tests, do you mean running your own job
>>>>>> test or Theano tests?
>>>>>>
>>>>>> Fred
>>>>>>
>>>>>> On Wed, Apr 27, 2016 at 12:34 PM, RHankins <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Update: Even with no experiment running on gpu0, running some test
>>>>>>> code with gpu1 selected as the default device in .theanorc still
>>>>>>> launches two separate processes, on gpu0 and gpu1, according to
>>>>>>> nvidia-smi.
>>>>>>>
>>>>>>> Any thoughts? Appreciate everyone's help.
>>>>>>>
>>>>>>> Richard
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Monday, April 25, 2016 at 9:54:59 PM UTC+1, RHankins wrote:
>>>>>>>>
>>>>>>>> Hi guys,
>>>>>>>>
>>>>>>>> I have two gpus and want to be able to run different processes on
>>>>>>>> each one so I can experiment with different model parameters etc. I am
>>>>>>>> currently running experiments on gpu0 whilst testing out new code using
>>>>>>>> gpu1. Gpu1 is selected as the default device in .theanorc. When I want
>>>>>>>> to run experiments on gpu0 I've been using the following code in my
>>>>>>>> programs.
>>>>>>>>
>>>>>>>> import os
>>>>>>>> # THEANO_FLAGS must be set before theano is first imported
>>>>>>>> os.environ["THEANO_FLAGS"]="device=gpu0"
>>>>>>>> import theano
>>>>>>>>
>>>>>>>> I thought this was working. However, whilst inspecting nvidia-smi
>>>>>>>> recently I noticed that when I started testing some new code on gpu0 it
>>>>>>>> started running processes on both gpu0 and gpu1. An experiment was
>>>>>>>> already running on gpu0. And both the code for the experiment and the
>>>>>>>> test code import shared modules which also import theano.
>>>>>>>>
>>>>>>>> Am I selecting the gpus in the wrong manner? Also since it appears
>>>>>>>> that the test code was running on both gpus would it invalidate the
>>>>>>>> results of the experiment? Would they interfere with each other?
>>>>>>>>
>>>>>>>> Thanks in advance.
>>>>>>>>
>>>>>>>> --
>>>>>>>
>>>>>>> ---
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "theano-users" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>

