When trying to compile a (rather large) neural net with device=cuda, at 
some point I get the following error:

ImportError: ('libamdlibm.so: cannot open shared object file: No such file 
or directory', '[Elemwise{pow,no_inplace}(<TensorType(float32, scalar)>, 
<TensorType(float32, scalar)>)]')

Now, I don't mind the error itself, as it's probably caused by not running 
ldconfig. However, it suggests that the optimizer wants to run this op on the 
cpu instead of the gpu, and I suspect this is why my net trains so slowly 
(presumably it also implies some redundant copies between cpu and gpu 
memory). I also observe that the training process uses 100% of the CPU (or 
rather, of a single core, since Python is not multi-threaded).
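
For reference, this is roughly how I set things up. It's a minimal sketch 
rather than my actual training script, and floatX=float32 is an assumption 
here (inferred from the float32 scalars in the error above); the pow 
expression just stands in for the op named in the error:

import os

# Flags have to be set before theano is imported; device=cuda and
# assert_no_cpu_op=raise are what I actually use, floatX=float32 is assumed
# to match the float32 scalars in the error message.
os.environ['THEANO_FLAGS'] = 'device=cuda,floatX=float32,assert_no_cpu_op=raise'

import numpy as np
import theano
import theano.tensor as T

# Stand-in for the real net: a float32 scalar power, i.e. the
# Elemwise{pow,no_inplace} op that the error message mentions.
base = theano.shared(np.float32(2.0), name='base')
expo = T.fscalar('expo')
f = theano.function([expo], base ** expo)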

How can I tell whether any ops are running on the cpu after a successful 
compilation (I have already set assert_no_cpu_op='raise'), and how can I 
force those ops to be executed on the gpu instead?
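
To make the first half of the question concrete, this is the kind of check I 
have in mind after compilation, based on the graph-inspection idiom from the 
Theano docs. The name-based test for GPU ops is just my own heuristic, and 
`f` stands for any compiled function (here the tiny one from the sketch 
above):

import numpy as np
import theano
import theano.tensor as T

# Tiny compiled function standing in for my real training function.
base = theano.shared(np.float32(2.0), name='base')
expo = T.fscalar('expo')
f = theano.function([expo], base ** expo)

# Walk the optimized graph in topological order and list every apply node
# whose op name does not start with 'Gpu' (this also catches transfers such
# as HostFromGpu, which helps in spotting redundant cpu<->gpu copies).
cpu_nodes = [node for node in f.maker.fgraph.toposort()
             if not type(node.op).__name__.startswith('Gpu')]
for node in cpu_nodes:
    print('possibly on cpu: %s' % node)

# For the full picture of where each apply node ends up:
theano.printing.debugprint(f)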

