I have encountered a similar issue: https://groups.google.com/forum/#!searchin/theano-users/Gemv%7Csort:date/theano-users/UfPNnTI1pI4/2w48Gid_BwAJ
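In that thread the cause was Theano falling back to its Python implementation of Gemv because no usable BLAS was linked at compile time, which matches the `Py` type in your profile. A quick way to check (a sketch from memory; the exact flags and the OpenBLAS library name are assumptions for your setup):

```shell
# Print the BLAS linker flags Theano picked up; an empty value usually
# means Gemv/Gemm fall back to the slow Python implementation.
python -c "import theano; print(theano.config.blas.ldflags)"

# Benchmark the BLAS Theano is actually using (script ships with Theano).
python -m theano.misc.check_blas

# If nothing is linked, pointing Theano at a BLAS such as OpenBLAS may help.
# ("your_script.py" is a placeholder for your own entry point.)
THEANO_FLAGS='blas.ldflags=-lopenblas' python your_script.py
```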
On Tue, Jan 31, 2017 at 2:56 AM, Raphael Shu <[email protected]> wrote:

> Hi,
>
> It turns out the LSTMs run very slowly on CPU. The profiling results show
> that theano.tensor.blas.Gemv is the reason, and the type of Gemv is Py.
>
> Does this result imply that the Gemv operation runs at the Python level?
>
> Can anyone provide some tips on how to speed up the operation?
>
> Thanks!
>
> Raphael Shu
>
>
> Function profiling
> ==================
>   Message: /home/shu/research/deepy/deepy/networks/network.py:196
>   Time in 581 calls to Function.__call__: 7.022281e+01s
>   Time in Function.fn.__call__: 7.018702e+01s (99.949%)
>   Time in thunks: 7.015664e+01s (99.906%)
>   Total compile time: 1.668830e-01s
>     Number of Apply nodes: 49
>     Theano Optimizer time: 1.264119e-01s
>       Theano validate time: 1.095724e-02s
>     Theano Linker time (includes C, CUDA code generation/compiling): 2.585006e-02s
>       Import time 1.466990e-03s
>       Node make_thunk time 2.393484e-02s
>         Node Elemwise{Composite{((scalar_sigmoid((i0 + i1)) * i2) + (scalar_sigmoid((i3 + i4)) * tanh((i5 + i6))))}}[(0, 1)](LSTM_bf, Gemv{inplace}.0, Subtensor{int64:int64:}.0, LSTM_bi, Gemv{inplace}.0, LSTM_bc, Gemv{inplace}.0) time 1.141071e-03s
>         Node Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, Elemwise{Composite{tanh((i0 + i1))}}.0, attention_va, TensorConstant{0.0}) time 8.509159e-04s
>         Node Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, LSTM_wf.T, Join.0, TensorConstant{0.0}) time 8.258820e-04s
>         Node Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, attention_wa.T, Subtensor{:int64:}.0, TensorConstant{0.0}) time 7.920265e-04s
>         Node Elemwise{Composite{tanh((i0 + i1))}}(InplaceDimShuffle{x,0}.0, uah) time 7.741451e-04s
>
> Time in all call to theano.grad() 0.000000e+00s
> Time since theano import 335.098s
>
> Class
> ---
> <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Class name>
>   99.2%    99.2%      69.597s      1.20e-02s    Py    5810     10   theano.tensor.blas.Gemv
>    0.7%    99.9%       0.488s      2.10e-04s    C     2324      4   theano.tensor.elemwise.Elemwise
>    0.0%    99.9%       0.026s      4.48e-05s    C      581      1   theano.tensor.elemwise.Sum
>    0.0%   100.0%       0.013s      1.81e-06s    C     6972     12   theano.tensor.elemwise.DimShuffle
>    0.0%   100.0%       0.010s      8.33e-06s    C     1162      2   theano.tensor.basic.Join
>    0.0%   100.0%       0.007s      1.21e-05s    C      581      1   theano.tensor.subtensor.AdvancedSubtensor1
>    0.0%   100.0%       0.005s      2.95e-06s    C     1743      3   theano.tensor.subtensor.Subtensor
>    0.0%   100.0%       0.005s      1.35e-06s    C     3486      6   theano.compile.ops.Shape_i
>    0.0%   100.0%       0.003s      7.82e-07s    C     3486      6   theano.tensor.basic.AllocEmpty
>    0.0%   100.0%       0.002s      1.51e-06s    C     1162      2   theano.tensor.basic.Reshape
>    0.0%   100.0%       0.002s      2.60e-06s    C      581      1   theano.tensor.nnet.nnet.Softmax
>    0.0%   100.0%       0.001s      1.09e-06s    C      581      1   theano.compile.ops.Rebroadcast
>    ... (remaining 0 Classes account for 0.00%(0.00s) of the runtime)
>
> Ops
> ---
> <% time> <sum %> <apply time> <time per call> <type> <#call> <#apply> <Op name>
>   99.2%    99.2%      69.597s      1.20e-02s    Py    5810     10   Gemv{inplace}
>    0.5%    99.7%       0.375s      6.46e-04s    C      581      1   Elemwise{Composite{tanh((i0 + i1))}}
>    0.1%    99.8%       0.052s      8.99e-05s    C      581      1   Elemwise{Composite{((scalar_sigmoid((i0 + i1)) * i2) + (scalar_sigmoid((i3 + i4)) * tanh((i5 + i6))))}}[(0, 1)]
>    0.0%    99.9%       0.033s      5.60e-05s    C      581      1   Elemwise{Composite{(scalar_sigmoid((i0 + i1)) * tanh(i2))}}[(0, 1)]
>    0.0%    99.9%       0.028s      4.78e-05s    C      581      1   Elemwise{mul,no_inplace}
>    0.0%    99.9%       0.026s      4.48e-05s    C      581      1   Sum{axis=[0], acc_dtype=float64}
>    0.0%    99.9%       0.010s      1.96e-06s    C     5229      9   InplaceDimShuffle{1,0}
>    0.0%   100.0%       0.010s      8.33e-06s    C     1162      2   Join
>    0.0%   100.0%       0.007s      1.21e-05s    C      581      1   AdvancedSubtensor1
>    0.0%   100.0%       0.004s      1.46e-06s    C     2905      5   Shape_i{1}
>    0.0%   100.0%       0.003s      7.82e-07s    C     3486      6   AllocEmpty{dtype='float32'}
>    0.0%   100.0%       0.003s      4.41e-06s    C      581      1   Subtensor{int64:int64:}
>    0.0%   100.0%       0.002s      1.51e-06s    C     1162      2   Reshape{1}
>    0.0%   100.0%       0.002s      1.46e-06s    C     1162      2   InplaceDimShuffle{x,0}
>    0.0%   100.0%       0.002s      2.66e-06s    C      581      1   Subtensor{::, :int64:}
>    0.0%   100.0%       0.002s      2.60e-06s    C      581      1   Softmax
>    0.0%   100.0%       0.001s      1.78e-06s    C      581      1   Subtensor{:int64:}
>    0.0%   100.0%       0.001s      1.16e-06s    C      581      1   InplaceDimShuffle{1,x}
>    0.0%   100.0%       0.001s      1.09e-06s    C      581      1   Rebroadcast{1}
>    0.0%   100.0%       0.000s      8.04e-07s    C      581      1   Shape_i{0}
>    ... (remaining 0 Ops account for 0.00%(0.00s) of the runtime)
>
> Apply
> ------
> <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
>   22.3%    22.3%      15.677s      2.70e-02s   581   39   Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, LSTM_wi.T, Join.0, TensorConstant{0.0})
>   22.2%    44.5%      15.575s      2.68e-02s   581   38   Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, LSTM_wo.T, Join.0, TensorConstant{0.0})
>   22.0%    66.5%      15.409s      2.65e-02s   581   41   Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, LSTM_wc.T, Join.0, TensorConstant{0.0})
>   21.8%    88.4%      15.324s      2.64e-02s   581   40   Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, LSTM_wf.T, Join.0, TensorConstant{0.0})
>    2.2%    90.5%       1.523s      2.62e-03s   581   42   Gemv{inplace}(Gemv{inplace}.0, TensorConstant{1.0}, LSTM_uc.T, Subtensor{:int64:}.0, TensorConstant{1.0})
>    2.2%    92.7%       1.523s      2.62e-03s   581   43   Gemv{inplace}(Gemv{inplace}.0, TensorConstant{1.0}, LSTM_ui.T, Subtensor{:int64:}.0, TensorConstant{1.0})
>    2.2%    94.9%       1.514s      2.61e-03s   581   27   Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, attention_wa.T, Subtensor{:int64:}.0, TensorConstant{0.0})
>    2.2%    97.0%       1.513s      2.60e-03s   581   45   Gemv{inplace}(Gemv{inplace}.0, TensorConstant{1.0}, LSTM_uo.T, Subtensor{:int64:}.0, TensorConstant{1.0})
>    2.2%    99.2%       1.509s      2.60e-03s   581   44   Gemv{inplace}(Gemv{inplace}.0, TensorConstant{1.0}, LSTM_uf.T, Subtensor{:int64:}.0, TensorConstant{1.0})
>    0.5%    99.7%       0.375s      6.46e-04s   581   30   Elemwise{Composite{tanh((i0 + i1))}}(InplaceDimShuffle{x,0}.0, uah)
>    0.1%    99.8%       0.052s      8.99e-05s   581   46   Elemwise{Composite{((scalar_sigmoid((i0 + i1)) * i2) + (scalar_sigmoid((i3 + i4)) * tanh((i5 + i6))))}}[(0, 1)](LSTM_bf, Gemv{inplace}.0, Subtensor{int64:int64:}.0, LSTM_bi, Gemv{inplace}.0, LSTM_bc, Gemv{inplace}.0)
>    0.0%    99.8%       0.033s      5.60e-05s   581   47   Elemwise{Composite{(scalar_sigmoid((i0 + i1)) * tanh(i2))}}[(0, 1)](LSTM_bo, Gemv{inplace}.0, Elemwise{Composite{((scalar_sigmoid((i0 + i1)) * i2) + (scalar_sigmoid((i3 + i4)) * tanh((i5 + i6))))}}[(0, 1)].0)
>    0.0%    99.9%       0.031s      5.27e-05s   581   31   Gemv{inplace}(AllocEmpty{dtype='float32'}.0, TensorConstant{1.0}, Elemwise{Composite{tanh((i0 + i1))}}.0, attention_va, TensorConstant{0.0})
>    0.0%    99.9%       0.028s      4.78e-05s   581   35   Elemwise{mul,no_inplace}(InplaceDimShuffle{1,x}.0, Subtensor{::, :int64:}.0)
>    0.0%    99.9%       0.026s      4.48e-05s   581   36   Sum{axis=[0], acc_dtype=float64}(Elemwise{mul,no_inplace}.0)
>    0.0%    99.9%       0.007s      1.21e-05s   581   26   AdvancedSubtensor1(word_embed_embeddings, Rebroadcast{1}.0)
>    0.0%   100.0%       0.006s      1.03e-05s   581   48   Join(TensorConstant{0}, Elemwise{Composite{(scalar_sigmoid((i0 + i1)) * tanh(i2))}}[(0, 1)].0, Elemwise{Composite{((scalar_sigmoid((i0 + i1)) * i2) + (scalar_sigmoid((i3 + i4)) * tanh((i5 + i6))))}}[(0, 1)].0, TensorConstant{(1,) of 0.0})
>    0.0%   100.0%       0.004s      6.38e-06s   581   37   Join(TensorConstant{0}, Sum{axis=[0], acc_dtype=float64}.0, Reshape{1}.0)
>    0.0%   100.0%       0.003s      4.41e-06s   581    3   Subtensor{int64:int64:}(s, Constant{1000}, Constant{2000})
>    0.0%   100.0%       0.002s      2.87e-06s   581   16   InplaceDimShuffle{1,0}(LSTM_uo)
>    ... (remaining 29 Apply instances account for 0.04%(0.02s) of the runtime)

--
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
