Hi,

My guess is that:

- without cnmem, allocation and deallocation of intermediate results
force synchronization of the GPU more often, so overall execution is
slower

- with cnmem and borrow=False, there is no synchronization at all, and
what is measured is just the time to launch the GPU kernels, not the
time to actually execute them.

- with cnmem and borrow=True, there seems to be one synchronization
forced after each function call; I'm not sure why.
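If that is what's happening, the borrow=False numbers with cnmem aren't measuring kernel execution at all. Here is a generic Python sketch (not Theano code; the queue-and-worker "GPU" is a made-up stand-in) showing why stopping the timer before synchronizing only captures launch overhead:

```python
import threading
import time
import queue

# Toy model of asynchronous kernel execution: "kernels" are queued and
# run on a worker thread, so launching one returns immediately.
work = queue.Queue()

def worker():
    while True:
        job = work.get()
        job()            # actually execute the "kernel"
        work.task_done()

threading.Thread(target=worker, daemon=True).start()

def launch_kernel():
    # Each simulated kernel takes ~5 ms of real work.
    work.put(lambda: time.sleep(0.005))

# Naive timing: the clock stops right after launching, so it measures
# only launch overhead, not execution time.
start = time.time()
for _ in range(100):
    launch_kernel()
launch_only = time.time() - start

work.join()  # drain leftover work before the next measurement

# Timing with synchronization: wait for all queued kernels to finish
# (analogous to forcing the result back to the host) before stopping.
start = time.time()
for _ in range(100):
    launch_kernel()
work.join()
synced = time.time() - start

print("launch only: %.4f s, synchronized: %.4f s" % (launch_only, synced))
```

In the real benchmark, forcing a device-to-host transfer of the function's output (e.g. wrapping the result in numpy.asarray) before stopping the timer should trigger the same kind of synchronization and give comparable numbers in all three configurations.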

On Sun, Oct 09, 2016, Chris Hanning wrote:
> Testing the following code from:
> 
> http://deeplearning.net/software/theano/tutorial/aliasing.html#borrowfunction
> 
> copy : https://paste.pound-python.org/show/vGCQlEMIoOPWZuUPo2DJ/
> 
> I found that running it on an iMac, i5, GeForce GT 640M gave significant 
> gains when enabling lib.cnmem.
> 
> With CNMeM disabled:
> 
> $ THEANO_FLAGS='device=gpu0,lib.cnmem=0' python borrow_test.py
> 
> Looping 1000 times took 0.49251699447631836 seconds without borrow and 
> 0.34339094161987305 seconds with borrow
> 
> With CNMeM enabled:
> 
> $ THEANO_FLAGS='device=gpu0,lib.cnmem=0.3' python borrow_test.py
> 
> Looping 1000 times took 0.019893884658813477 seconds without borrow and 
> 0.3345789909362793 seconds with borrow
> 
> On this system, any value for lib.cnmem over 0.4 would crash the program due to 
> memory constraints.
> There was no significant difference in performance for values between 0.1 and 0.4.
> 
> -- 
> 
> --- 
> You received this message because you are subscribed to the Google Groups 
> "theano-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to theano-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.


-- 
Pascal

