I was shown two ways to speed it up in *numpy*:

result = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]


result = np.matmul(prob.transpose(2,0,1), cases.T).T


Both give me the expected speedup in *numpy*, but neither is implemented in 
*Theano*. Is there a way to do the same in *Theano* on the *GPU*?
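For reference, the two one-liners above agree with the naive elementwise+sum from the quoted message below; here is a minimal numpy sketch (shapes shrunk from the original (1, 1000, 50) / (1000, 1000, 50) so it runs instantly):

```python
import numpy as np

# Small stand-ins for the original shapes; the contraction is over axis 1 (j).
rng = np.random.default_rng(0)
prob = rng.random((1, 6, 4))      # (1, J, K), broadcast over i
cases = rng.random((5, 6, 4))     # (I, J, K)

# Naive bottleneck: elementwise multiply, then sum over j.
naive = (cases * prob).sum(axis=1, keepdims=True)               # (I, 1, K)

# einsum: contract j, keep i and k, then restore the middle axis.
via_einsum = np.einsum('ijk,ijk->ik', prob, cases)[:, None, :]  # (I, 1, K)

# Batched matmul: one (1, J) @ (J, I) product per k, transposed back.
via_matmul = np.matmul(prob.transpose(2, 0, 1), cases.T).T      # (I, 1, K)
```

All three arrays should be numerically identical up to floating-point tolerance.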



On Friday, 5 May 2017 11:15:26 UTC+3, Šarūnas S. wrote:
>
> In my current theano script the bottleneck is equivalent to the following 
> numpy code:
>
> import time
> import numpy as np
>
> # 3D example
> axis = 0
> prob = np.random.random( ( 1, 1000, 50 ) )
> cases = np.random.random( ( 1000, 1000, 50 ) )
>
> start = time.time()
> for i in xrange( 1000 ):
>     result = ( cases * prob ).sum( axis=1-axis, keepdims=True )
> print '3D naive method took {} seconds'.format( time.time() - start )
> print result.shape
> print
>
> I had seen in the 2D case that replacing elementwise multiply + sum with a 
> dot product gave me a 5x speedup. Are there any theano matrix operations 
> that could help me out here? 
>
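For context, the 2D elementwise+sum-to-dot-product replacement mentioned in the quoted message can be sketched like this (a minimal numpy illustration with made-up shapes, not code from the original post):

```python
import numpy as np

rng = np.random.default_rng(1)
cases2d = rng.random((5, 6))   # (I, J)
prob1d = rng.random(6)         # (J,) weights over the summed axis

# Elementwise multiply + sum over j ...
slow = (cases2d * prob1d).sum(axis=1)

# ... is exactly a matrix-vector product, which dispatches to BLAS.
fast = cases2d.dot(prob1d)
```

The 3D question above is the batched generalisation of this identity.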

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"theano-users" group.