I am attempting to run some reinforcement learning code on the GPU. (The 
code is https://github.com/openai/imitation if it matters, running 
`scripts/run_rl_mj.py`.)

I converted the code to run in float32 by changing the way the data is 
supplied via numpy. Unfortunately, with the new GPU backend, I am getting 
an out-of-memory error, despite having 12GB of memory on my Titan X Pascal 
GPU. Here are my settings:

$ cat ~/.theanorc 
[global] 
device = cuda 
floatX = float32 

[gpuarray] 
preallocate = 1 

[cuda] 
root = /usr/local/cuda-8.0


Theano seems to be importing correctly:

$ ipython
Python 2.7.13 |Anaconda custom (64-bit)| (default, Dec 20 2016, 23:09:15)  
Type "copyright", "credits" or "license" for more information. 
IPython 5.3.0 -- An enhanced Interactive Python. 
?         -> Introduction and overview of IPython's features. 
%quickref -> Quick reference. 
help      -> Python's own help system. 
object?   -> Details about 'object', use 'object??' for extra details. 

In [1]: import theano 
Using cuDNN version 5105 on context None 
Preallocating 11576/12186 Mb (0.950000) on cuda 
Mapped name None to device cuda: TITAN X (Pascal) (0000:01:00.0) 

In [2]: 
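As a quick sanity check on the numbers in that log line, here is the 
arithmetic as a small sketch. The constants are copied from the log; the 
helper names and the `n_processes` parameter are my own and purely 
hypothetical. I have not verified how many processes actually try to 
reserve GPU memory when the script runs.

```python
# Numbers copied from the log line:
#   Preallocating 11576/12186 Mb (0.950000) on cuda
TOTAL_MB = 12186
PREALLOCATED_MB = 11576
FRACTION = 0.95


def preallocated_mb(total_mb, fraction):
    """Whole megabytes covered by `fraction` of `total_mb`."""
    return int(total_mb * fraction)


def left_over_mb(total_mb, fraction, n_processes=1):
    """Megabytes left if `n_processes` each reserve the same fraction.

    `n_processes` is a hypothetical knob for illustration only; a negative
    result means the combined requests would not fit on the device.
    """
    return total_mb - n_processes * preallocated_mb(total_mb, fraction)


print(preallocated_mb(TOTAL_MB, FRACTION))               # matches the log
print(left_over_mb(TOTAL_MB, FRACTION))                  # one process
print(left_over_mb(TOTAL_MB, FRACTION, n_processes=2))   # two processes
```

With a single process there is only a few hundred megabytes of headroom 
left on the card, and a second process asking for the same amount would 
not fit.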



Unfortunately, running `python scripts/run_rl_mj.py --env_name CartPole-v0 
--log trpo_logs/CartPole-v0` on the very low-dimensional CartPole setting 
(the state is just four numbers and the action is a single number) gives me 
the following, after a bit of setup:


Traceback (most recent call last):
  File "scripts/run_rl_mj.py", line 116, in <module>
    main()
  File "scripts/run_rl_mj.py", line 109, in main
    iter_info = opt.step()
  File "/home/daniel/imitation_noise/policyopt/rl.py", line 280, in step
    cfg=self.sim_cfg)
  File "/home/daniel/imitation_noise/policyopt/__init__.py", line 411, in sim_mp
    traj = job.get()
  File "/home/daniel/anaconda2/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
pygpu.gpuarray.GpuArrayException: Out of memory
Apply node that caused the error: GpuFromHost<None>(obsfeat_B_Df)
Toposort index: 4
Inputs types: [TensorType(float32, matrix)]
Inputs shapes: [(1, 4)]
Inputs strides: [(16, 4)]
Inputs values: [array([[ 0.04058,  0.00428,  0.03311, -0.02898]], dtype=float32)]
Outputs clients: [[GpuElemwise{Composite{((i0 - i1) / i2)}}[]<gpuarray>(GpuFromHost<None>.0, /GibbsPolicy/obsnorm/Standardizer/mean_1_D, GpuElemwise{Composite{(i0 + sqrt((i1 * (Composite{(i0 - sqr(i1))}(i2, i3) + Abs(Composite{(i0 - sqr(i1))}(i2, i3))))))}}[]<gpuarray>.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Closing remaining open files:trpo_logs/CartPole-v0...done


Two things confuse me:

   - This happens right at the beginning of the reinforcement learning, so 
   it's not as if the algorithm has been running a long time and then ran out 
   of memory.
   - The input is tiny: its shape is (1, 4), and the (16, 4) in the message 
   is its strides in bytes, not a second shape. In addition, the output is 
   supposed to do normalization and several other element-wise operations. 
   None of this suggests high memory usage.
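To make the size argument concrete, here is a minimal sketch of how the 
byte strides and total size follow from the shape and dtype reported in the 
traceback. It assumes the array is C-contiguous float32; the helper names 
are mine and are not part of Theano or numpy.

```python
import struct

FLOAT32_SIZE = struct.calcsize("<f")  # size of one float32 in bytes


def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-contiguous (row-major) array."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)   # bytes to move one step along this axis
        step *= dim
    return tuple(reversed(strides))


def total_bytes(shape, itemsize):
    """Bytes occupied by the array's elements."""
    total = itemsize
    for dim in shape:
        total *= dim
    return total


shape = (1, 4)  # "Inputs shapes" from the traceback
print(c_contiguous_strides(shape, FLOAT32_SIZE))  # compare with "Inputs strides"
print(total_bytes(shape, FLOAT32_SIZE))           # size of the transfer in bytes
```

So the array that `GpuFromHost` is trying to move to the GPU holds four 
float32 values, which is why the out-of-memory error is so surprising.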

I tried `optimizer = fast_compile` and re-ran this, but the error message 
was actually less informative (it contains a subset of the above error 
message). Running with `exception_verbosity = high` results in a different 
error message:


Max traj len: 200
Traceback (most recent call last):
  File "scripts/run_rl_mj.py", line 116, in <module>
    main()
  File "scripts/run_rl_mj.py", line 109, in main
    iter_info = opt.step()
  File "/home/daniel/imitation_noise/policyopt/rl.py", line 280, in step
    cfg=self.sim_cfg)
  File "/home/daniel/imitation_noise/policyopt/__init__.py", line 411, in sim_mp
    traj = job.get()
  File "/home/daniel/anaconda2/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
pygpu.gpuarray.GpuArrayException: initialization error

Closing remaining open files:trpo_logs/CartPole-v0...done

This time it did not even reach the same point in the code: the failure is 
an initialization error rather than the out-of-memory error above.
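For anyone trying to reproduce this, the two flags can also be set per run 
through the `THEANO_FLAGS` environment variable instead of `~/.theanorc`. 
A minimal sketch, where the helper name `theano_flags` is mine and not part 
of Theano:

```python
import os


def theano_flags(**flags):
    """Build a comma-separated key=value string for THEANO_FLAGS."""
    return ",".join(
        "{}={}".format(key, value) for key, value in sorted(flags.items())
    )


# The variable has to be set before Theano is imported for it to take effect.
os.environ["THEANO_FLAGS"] = theano_flags(
    optimizer="fast_compile",
    exception_verbosity="high",
)
print(os.environ["THEANO_FLAGS"])

# import theano  # left commented out so the sketch runs without Theano
```

The same string can be put in front of the command in the shell, which 
avoids editing the config file between runs.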

I noticed a similar issue 
here: https://github.com/costapt/vess2ret/issues/5 which seems to suggest 
that the problem is not limited to just this script. What do you suggest I 
do? Thanks.
