I was shown that in *numpy* I could speed it up in the following way:
result = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]
result = np.matmul(prob.transpose(2,0,1), cases.T).T
Both give me the expected speedup in *numpy*, but neither is implemented in
*Theano*. Is there a way to do the same in *Theano* on the *GPU*?
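For reference, here is a small sketch checking that both formulations reproduce the naive broadcast-and-sum from the original question (shapes shrunk so it runs instantly; Python 3 syntax):

```python
import numpy as np

# Same layout as in the question, just smaller:
# prob broadcasts along the first axis of cases.
prob = np.random.random((1, 10, 5))
cases = np.random.random((8, 10, 5))

# Naive method: elementwise multiply, then sum over axis 1.
naive = (cases * prob).sum(axis=1, keepdims=True)            # shape (8, 1, 5)

# The two faster formulations from above.
via_einsum = np.einsum('ijk,ijk->ik', prob, cases)[:, None, :]
via_matmul = np.matmul(prob.transpose(2, 0, 1), cases.T).T

assert via_einsum.shape == naive.shape == via_matmul.shape
assert np.allclose(naive, via_einsum)
assert np.allclose(naive, via_matmul)
```

All three agree numerically; the einsum/matmul versions avoid materialising the full elementwise product.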
On Friday, 5 May 2017 11:15:26 UTC+3, Šarūnas S. wrote:
>
> In my current theano script the bottleneck is equivalent to the following
> numpy code:
>
> import numpy as np
> import time
>
> # 3D example
> axis = 0
> prob = np.random.random( ( 1, 1000, 50 ) )
> cases = np.random.random( ( 1000, 1000, 50 ) )
>
> start = time.time( )
> for i in xrange( 1000 ):
>     result = ( cases * prob ).sum( axis=1-axis, keepdims=True )
> print '3D naive method took {} seconds'.format( time.time() - start )
> print result.shape
> print
>
> I had seen in the 2D case that replacing elementwise+sum with a dot product
> gave me a 5x speedup. Are there any Theano matrix operations that could
> help me out here?
>
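For context, the 2D replacement mentioned in the quoted question looks roughly like this (a sketch with illustrative shapes; the elementwise-multiply-then-sum collapses into a single `np.dot`):

```python
import numpy as np

prob2d = np.random.random((1, 1000))     # broadcasts over the rows of cases2d
cases2d = np.random.random((500, 1000))

# Naive 2D method: elementwise multiply, then sum over axis 1.
naive = (cases2d * prob2d).sum(axis=1, keepdims=True)   # shape (500, 1)

# Equivalent dot product: one GEMV call instead of a temporary array + sum.
fast = np.dot(cases2d, prob2d.T)                        # shape (500, 1)

assert np.allclose(naive, fast)
```

The speedup comes from the dot product being dispatched to BLAS and skipping the intermediate `(500, 1000)` product array.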
--
---
You received this message because you are subscribed to the Google Groups
"theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.