Also, are you using OpenMP or a threaded BLAS? You might want to try 
disabling OpenMP, or experimenting with different numbers of BLAS threads.
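
In case it helps, here is a minimal sketch of pinning the thread counts 
before running the benchmark. Which variable actually takes effect depends 
on the BLAS your NumPy/Theano is linked against, so setting all of them is 
the safe option; the script name is just a placeholder:

```shell
# Disable OpenMP threading and force the BLAS to a single thread.
# Sweep these values (1, 2, 4, ...) to find what is fastest on your machine.
export OMP_NUM_THREADS=1        # OpenMP (and some BLAS builds)
export OPENBLAS_NUM_THREADS=1   # if linked against OpenBLAS
export MKL_NUM_THREADS=1        # if linked against MKL

# Then rerun the profiling script, e.g.:
# python profile_conv.py
```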

On Tuesday, October 18, 2016 at 9:38:56 AM UTC-7, Jesse Livezey wrote:
>
> I think that nnet.conv2d() will show the biggest improvement when the 
> inner-product dimension is large, i.e. when filter_x * filter_y * 
> n_channels is large. For 3x3 filters with a single channel, that dimension 
> may simply be too small for the im2col/CorrMM version to show any improvement.
>
> On Monday, October 17, 2016 at 10:30:13 PM UTC-7, Bogdan Opanchuk wrote:
>>
>> The difference in performance between nnet.conv2d() and 
>> nnet.conv.conv2d() seems to be about the same for 100x100 matrices. 
>>
>> The profile is as follows:
>>
>> Apply
>> ------
>> <% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
>>   88.0%    88.0%      18.591s       1.84e-01s    101     5   CorrMM{valid, (1, 1), (1, 1)}(InplaceDimShuffle{x,x,0,1}.0, Subtensor{::, ::, ::int64, ::int64}.0)
>>    5.5%    93.5%       1.170s       1.16e-02s    101    11   IncSubtensor{Set;int64:int64:, int64:int64:}(u, Reshape{2}.0, Constant{1}, Constant{-1}, Constant{1}, Constant{-1})
>>    4.8%    98.3%       1.022s       1.01e-02s    101    12   Elemwise{Composite{sqr((i0 - i1))}}(IncSubtensor{Set;int64:int64:, int64:int64:}.0, u)
>>    1.7%   100.0%       0.352s       3.49e-03s    101    13   Sum{acc_dtype=float64}(Elemwise{Composite{sqr((i0 - i1))}}.0)
>> ...
>>
>> Making `u` shared does not change the timings much; I expect it will 
>> matter more once I switch to the GPU backend.
>>
>
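
To put rough numbers on the two points quoted above, here is a small 
sanity check of the inner-product size and the profile arithmetic. The 
64-channel figure is an arbitrary illustration, not something from this 
thread:

```python
# Inner-product dimension that the im2col/CorrMM GEMM reduces over:
# filter_x * filter_y * n_channels.
filter_x, filter_y, n_channels = 3, 3, 1
k_small = filter_x * filter_y * n_channels  # 9: a tiny GEMM inner dimension
k_large = 3 * 3 * 64                        # 576: where im2col can start to pay off
print(k_small, k_large)

# Profile arithmetic: total apply time / number of calls = time per call.
total_time, n_calls = 18.591, 101
per_call = total_time / n_calls             # ~1.84e-01 s, matching the profile
print(per_call)
```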
