Hello,
I have an example of solving Laplace's equation on a 2D grid, which I'm
trying to translate to Theano. I'm having trouble making it run fast:
currently it's only about twice as fast as the numpy implementation. I wonder
whether that is the limit for this task, or whether I'm doing something wrong
or not using some Theano facilities. See the code in this gist:

https://gist.github.com/fjarri/6639d8d83b6f1e06070046de12b47e5e

laplace.py: the reference implementation with numpy. Runs in 8.9s.

laplace_theano.py: my attempt at a Theano-based implementation. Runs in 4.6s
if prepare_function_subtensor() is used, and in 5.8s if
prepare_function_conv() is used.

laplace.jl: optimized Julia code for comparison. Runs in 1.1s (for those
who are interested, you can also get this result just by using the
ParallelAccelerator library, which does all the loop unrolling
automatically). Note that this is still a single-threaded time.

As far as I understand (and as indicated by the Julia implementation), this
problem is sensitive to cache usage. That's why I tried to use
theano.tensor.signal.conv.conv2d() instead of just combining subarrays,
thinking that it would be optimized for that, but, for some reason, it ran
even slower.
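
To make "combining subarrays" concrete, here is a minimal numpy sketch of one Jacobi-style update step. It is not the code from the gist: the function names, the 6x6 grid, the boundary values and the unit grid spacing are all assumptions chosen for illustration. The slice version builds the update from four shifted views, and in plain numpy each addition allocates a temporary array, so memory is traversed several times per step. The explicit loop visits each point once, which is the access pattern a compiled loop can exploit.

```python
import numpy as np


def jacobi_step_slices(u):
    """One update step built from shifted subarrays (unit grid spacing assumed)."""
    new = u.copy()
    # Each interior point becomes the average of its four neighbours.
    new[1:-1, 1:-1] = 0.25 * (
        u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
    )
    return new


def jacobi_step_loops(u):
    """The same update written as an explicit loop over interior points."""
    new = u.copy()
    rows, cols = u.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i, j] = 0.25 * (
                u[i - 1, j] + u[i + 1, j] + u[i, j - 1] + u[i, j + 1]
            )
    return new


# Hypothetical test grid: zeros everywhere except a fixed value on one edge.
u = np.zeros((6, 6))
u[0, :] = 1.0

a = jacobi_step_slices(u)
b = jacobi_step_loops(u)
```

Both functions give the same result on this grid; in pure Python the loop version is of course far slower, and it is shown only to illustrate the single-pass access pattern.
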
Could anyone tell me if it's possible to get closer to Julia speeds for
this problem? Are there any hints I can provide for the Theano optimizer?