Hello, I have an example of solving Laplace's equation on a 2D grid, which I'm trying to translate to Theano. I'm having trouble making it run fast: currently it is only about twice as fast as the numpy implementation. I wonder whether that is the limit for this task, or whether I'm doing something wrong or not making use of some Theano facilities. See the code in this gist:
https://gist.github.com/fjarri/6639d8d83b6f1e06070046de12b47e5e

laplace.py: the reference implementation with numpy. Runs in 8.9s.

laplace_theano.py: my attempt at a Theano-based implementation. Runs in 4.6s if prepare_function_subtensor() is used, and in 5.8s if prepare_function_conv() is used.

laplace.jl: optimized Julia code for comparison. Runs in 1.1s (for those who are interested, you can also get this result just by using the ParallelAccelerator library, which will do all the loop unrolling automatically). Note that this is still a single-threaded time.

As far as I understand (and as the Julia implementation indicates), this problem is sensitive to cache usage. That's why I tried using theano.tensor.signal.conv.conv2d() instead of just combining subarrays, thinking that it would be optimized for this kind of access pattern, but for some reason it ran even slower.

Could anyone tell me if it's possible to get closer to Julia speeds for this problem? Are there any hints I can provide to the Theano optimizer?
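For readers who do not have the gist open, the "combining subarrays" approach can be illustrated with a minimal numpy sketch of one Jacobi sweep for Laplace's equation. This is not the code from the gist: the names `laplace_step` and `solve`, the `dx`/`dy` grid spacings, and the fixed-boundary treatment are all assumptions made for illustration.

```python
import numpy as np


def laplace_step(u, dx=1.0, dy=1.0):
    """One Jacobi sweep (hypothetical helper, not taken from the gist).

    Each interior point is replaced by the weighted average of its four
    neighbours; the boundary rows and columns are left unchanged.
    """
    dx2, dy2 = dx * dx, dy * dy
    new = u.copy()
    # Four shifted views of the same array supply the neighbours, so every
    # sweep creates several full-size temporaries. That memory traffic is
    # one plausible reason the problem is sensitive to cache usage.
    new[1:-1, 1:-1] = (
        (u[2:, 1:-1] + u[:-2, 1:-1]) * dy2
        + (u[1:-1, 2:] + u[1:-1, :-2]) * dx2
    ) / (2.0 * (dx2 + dy2))
    return new


def solve(u, n_iter, dx=1.0, dy=1.0):
    """Apply n_iter Jacobi sweeps (assumed driver loop)."""
    for _ in range(n_iter):
        u = laplace_step(u, dx, dy)
    return u
```

A Theano translation of the same idea would build the interior update from sliced symbolic tensors, which is presumably what prepare_function_subtensor() does; the convolution variant expresses the same four-neighbour stencil as a small kernel instead.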
