Also, are you using OpenMP or a threaded BLAS? You might want to try
disabling OpenMP or varying the number of BLAS threads.
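A minimal sketch of how one might experiment with this; the environment-variable names below are the standard OpenMP/OpenBLAS/MKL knobs rather than anything specific to this thread, and the thread count of 1 is just an example value:

```python
import os

# These must be set before the BLAS/OpenMP runtime is loaded, i.e. before
# the first `import numpy` / `import theano` in the process.
os.environ["OMP_NUM_THREADS"] = "1"       # OpenMP threads (also read by many BLAS builds)
os.environ["OPENBLAS_NUM_THREADS"] = "1"  # OpenBLAS-specific override
os.environ["MKL_NUM_THREADS"] = "1"       # MKL-specific override

# Theano's own OpenMP use can also be toggled via its config,
# e.g. THEANO_FLAGS=openmp=False on the command line.
```

Rerunning the profile with a few different values (1, 2, number of physical cores) usually shows quickly whether thread oversubscription is the problem.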
On Tuesday, October 18, 2016 at 9:38:56 AM UTC-7, Jesse Livezey wrote:
>
> I think that nnet.conv2d() will have the best improvement when the
> inner-product dimension is large, i.e. when filter_x * filter_y *
> n_channels is large. For 3x3 filters with 1 channel, it may just be too
> small for the im2col/CorrMM version to show any improvement.
>
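To put rough numbers on that inner-product dimension (the channel counts here are illustrative, not taken from the thread):

```python
# im2col turns convolution into a GEMM whose inner dimension is
# filter_x * filter_y * n_channels, the quantity discussed above.
small = 3 * 3 * 1    # 3x3 filter, 1 input channel -> inner dimension of only 9
large = 3 * 3 * 256  # same filter with 256 channels -> inner dimension of 2304
print(small, large)
```

A GEMM with an inner dimension of 9 leaves the BLAS kernel little to amortize its overhead over, which is consistent with seeing no speedup in the single-channel case.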
> On Monday, October 17, 2016 at 10:30:13 PM UTC-7, Bogdan Opanchuk wrote:
>>
>> The difference in performance between nnet.conv2d() and
>> nnet.conv.conv2d() seems to be about the same for 100x100 matrices.
>>
>> The profile is as follows:
>>
>> Apply
>> ------
>> <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
>> 88.0% 88.0% 18.591s 1.84e-01s 101 5 CorrMM{valid, (1, 1), (1, 1)}(InplaceDimShuffle{x,x,0,1}.0, Subtensor{::, ::, ::int64, ::int64}.0)
>> 5.5% 93.5% 1.170s 1.16e-02s 101 11 IncSubtensor{Set;int64:int64:, int64:int64:}(u, Reshape{2}.0, Constant{1}, Constant{-1}, Constant{1}, Constant{-1})
>> 4.8% 98.3% 1.022s 1.01e-02s 101 12 Elemwise{Composite{sqr((i0 - i1))}}(IncSubtensor{Set;int64:int64:, int64:int64:}.0, u)
>> 1.7% 100.0% 0.352s 3.49e-03s 101 13 Sum{acc_dtype=float64}(Elemwise{Composite{sqr((i0 - i1))}}.0)
>> ...
>>
>> Making `u` shared does not change the timings much; I expect it will be
>> more important if I use the GPU backend.
>>
>
--
You received this message because you are subscribed to the Google Groups
"theano-users" group.