When trying to compile a (rather large) neural net with device=cuda, at
some point I get the following error:
ImportError: ('libamdlibm.so: cannot open shared object file: No such file
or directory', '[Elemwise{pow,no_inplace}(<TensorType(float32, scalar)>,
<TensorType(float32, scalar)>)]')
Now, I don't mind the error itself, as it's probably just caused by not having
run ldconfig, but it suggests that the optimizer may fall back to running the
op on the CPU instead of the GPU. I suspect this is why my net trains so
slowly (presumably it also implies redundant copies between CPU and GPU
memory). I also observe that training keeps the processor at 100% (or rather
a single core at 100%, since Python is not multi-threaded).
How can I tell whether any ops are still running on the CPU after a successful
compilation (I have already set assert_no_cpu_op='raise'), and how can I force
those ops to be executed on the GPU instead?
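For context, the only check I know of is the graph-inspection pattern from the
Theano "using the GPU" tutorial, sketched below on a toy function (`f` stands
in for my real compiled net), but it feels indirect for a large graph:

import theano
import theano.tensor as T

x = T.fmatrix('x')
f = theano.function([x], (T.exp(x) ** 2).sum())

# Walk the optimized graph and report where each Op is implemented; with the
# old backend, GPU ops live under theano.sandbox.cuda, so anything outside
# that module is being executed on the CPU.
for node in f.maker.fgraph.toposort():
    print(type(node.op).__module__, node.op)

# theano.printing.debugprint(f) prints the same graph with more detail.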