Currently, I have 3 approaches that are portable to theano:

# 3D example
axis = 0
prob = np.random.random( ( 1, 1000, 50 ) )
cases = np.random.random( ( 1000, 1000, 50 ) )
# Elementwise + sum (the 100 iterations are just for timing)
for i in xrange( 100 ):
    result = ( cases * prob ).sum( axis=1-axis, keepdims=True )

# Loop version
result = np.zeros( ( 1000, 1, 50 ) )
for i in xrange( 50 ):
    result[ :, :, i ] = np.dot( cases[ :, :, i ], prob[ :, :, i ].T )

# Block diagonal sparse dot version
prob_big = np.zeros( ( 1, 1000, 50, 50 ) )
cases_big = np.zeros( ( 1000, 1000, 50, 50 ) )
for i in xrange( 50 ):
    prob_big[ :, :, i, i ] = prob[ :, :, i ]
    cases_big[ :, :, i, i ] = cases[ :, :, i ]
# contract over the shared middle axis and the first block axis;
# only the diagonal blocks are non-zero
intermediate = np.tensordot( prob_big, cases_big, axes=[ [ 1, 2 ], [ 1, 2 ] ] )
result = np.zeros( ( 1000, 1, 50 ) )
for i in xrange( 50 ):
    result[ :, 0, i ] = intermediate[ 0, i, :, i ]

I think the block diagonal sparse version would work best, since I've seen some support for block sparse matrices in Theano. However, it looks like I would still need a loop for blocksparse to iterate over all the blocks. Is there a way to somehow do all the blocks at once and collect the diagonal without using scan?

On Saturday, 6 May 2017 10:41:06 UTC+3, Šarūnas S. wrote:
>
> I have tried that, but to no avail. The problem is that I have to multiply
> on 2 axes, but sum only on 1.
>
> On Friday, 5 May 2017 19:23:12 UTC+3, Jesse Livezey wrote:
>>
>> I think tensordot should do what you want:
>> http://deeplearning.net/software/theano/library/tensor/basic.html#theano.tensor.tensordot
>> Something like
>>
>> result = T.tensordot(prob, cases, axes=1)
>>
>> On Friday, May 5, 2017 at 3:17:14 AM UTC-7, Šarūnas S. wrote:
>>>
>>> I was shown that in *numpy* I could speed it up in the following way:
>>>
>>> result = np.einsum('ijk,ijk->ik', prob, cases)[:,None,:]
>>> result = np.matmul(prob.transpose(2,0,1), cases.T).T
>>>
>>> Both give me the expected speedup in *numpy*, but neither is implemented
>>> in *Theano*. Is there a way to do the same in *Theano* on the *GPU*?
>>>
>>> On Friday, 5 May 2017 11:15:26 UTC+3, Šarūnas S. wrote:
>>>>
>>>> In my current theano script the bottleneck is equivalent to the
>>>> following numpy code:
>>>>
>>>> import time
>>>> import numpy as np
>>>>
>>>> # 3D example
>>>> axis = 0
>>>> prob = np.random.random( ( 1, 1000, 50 ) )
>>>> cases = np.random.random( ( 1000, 1000, 50 ) )
>>>>
>>>> start = time.time( )
>>>> for i in xrange( 1000 ):
>>>>     result = ( cases * prob ).sum( axis=1-axis, keepdims=True )
>>>> print '3D naive method took {} seconds'.format( time.time() - start )
>>>> print result.shape
>>>> print
>>>>
>>>> I had seen in the 2D case that replacing elementwise+sum with a dot
>>>> product gave me a 5x speedup. Are there any theano matrix operations
>>>> that could help me out here?
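P.S. One idea I am toying with, in case it counts as doing all the blocks at once: move the 50 blocks to the leading axis and let a single T.batched_dot do one (1000, 1000) x (1000, 1) product per block. A minimal, untested sketch (prob_t / cases_t are just my names for the symbolic inputs):

import theano
import theano.tensor as T

prob_t = T.tensor3( 'prob' )    # (1, 1000, 50)
cases_t = T.tensor3( 'cases' )  # (1000, 1000, 50)

# batched_dot treats the first axis as the batch axis, so dimshuffle
# the block axis to the front; per block this computes
#   out[ k ] = np.dot( cases[ :, :, k ], prob[ :, :, k ].T )
out = T.batched_dot( cases_t.dimshuffle( 2, 0, 1 ),   # (50, 1000, 1000)
                     prob_t.dimshuffle( 2, 1, 0 ) )   # (50, 1000, 1)

# back to the (1000, 1, 50) layout of the loop version
result = out.dimshuffle( 1, 2, 0 )

f = theano.function( [ prob_t, cases_t ], result )

If batched_dot is implemented on the GPU this would avoid both scan and the diagonal extraction, but I have not benchmarked it.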

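P.P.S. For completeness, a quick sanity check that the numpy one-liners from earlier in the thread agree with the naive elementwise+sum version (assuming the shapes above):

import numpy as np

prob = np.random.random( ( 1, 1000, 50 ) )
cases = np.random.random( ( 1000, 1000, 50 ) )

naive = ( cases * prob ).sum( axis=1, keepdims=True )        # (1000, 1, 50)
ein = np.einsum( 'ijk,ijk->ik', prob, cases )[ :, None, : ]  # (1000, 1, 50)
mat = np.matmul( prob.transpose( 2, 0, 1 ), cases.T ).T      # (1000, 1, 50)

print np.allclose( naive, ein ), np.allclose( naive, mat )   # True True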