I have an example of solving Laplace's equation on a 2D grid, which I'm 
trying to translate to Theano. I'm having trouble making it run fast: 
currently it's only about twice as fast as the numpy implementation. I wonder 
whether this is the limit for this task, or whether I'm doing something wrong 
or not using some Theano facilities. See the code in this gist:


- laplace.py: the reference implementation with numpy. Runs in 8.9s.
- laplace_theano.py: my attempt at a Theano-based implementation. Runs in 
  4.6s if prepare_function_subtensor() is used, and in 5.8s if 
  prepare_function_conv() is used.
- laplace.jl: optimized Julia code for comparison. Runs in 1.1s (for those 
  who are interested, you can also get this result just by using the 
  ParallelAccelerator library, which will do all the loop unrolling 
  automatically). Note that this is still a single-threaded time.

As far as I understand (and as indicated by the Julia implementation), this 
problem is sensitive to cache usage. That's why I tried to use 
theano.tensor.signal.conv.conv2d() instead of just combining subarrays, 
thinking that it would be better optimized for cache use, but for some 
reason it ran even slower.
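
For readers without the gist at hand, here is a minimal numpy sketch of what I 
mean by "combining subarrays". It is not the code from laplace.py; it assumes 
equal grid spacing in both directions and a plain Jacobi sweep, and the names 
jacobi_step() and solve() are made up for this illustration.

```python
import numpy as np


def jacobi_step(u):
    """Return a new grid after one Jacobi sweep (assumes dx == dy).

    Each interior point becomes the average of its four neighbours;
    the boundary rows and columns are copied over unchanged.
    """
    new = u.copy()
    new[1:-1, 1:-1] = 0.25 * (
        u[:-2, 1:-1]     # neighbour above
        + u[2:, 1:-1]    # neighbour below
        + u[1:-1, :-2]   # neighbour to the left
        + u[1:-1, 2:]    # neighbour to the right
    )
    return new


def solve(u, n_iter):
    """Apply n_iter Jacobi sweeps and return the resulting grid."""
    for _ in range(n_iter):
        u = jacobi_step(u)
    return u


if __name__ == "__main__":
    # Hypothetical setup: 50x50 grid, top edge held at 1, other edges at 0.
    grid = np.zeros((50, 50))
    grid[0, :] = 1.0
    result = solve(grid, 100)
    print(result[1, 25])
```

Each sweep reads four shifted views of the grid and writes a fresh array, so 
the whole grid passes through memory several times per iteration. I assume 
that is why a fused loop, as in the Julia version, is friendlier to the cache.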

Could anyone tell me if it's possible to get closer to Julia speeds for 
this problem? Are there any hints I can provide for the Theano optimizer?

