It says "'-Wl,-framework -Wl,Accelerate'", so, I guess, my assumption was
correct.
--
Try
import theano
print(theano.config.blas.ldflags)
It will tell you which flags Theano will use. If it prints an empty string,
it means Theano wasn't able to find one.
On 19 Oct 2016 19:03, "Bogdan Opanchuk" wrote:
> None. I'm on OSX, so I just assumed Theano used Accelerate (it is also
> implied by the documentation). Is there a way to see which backend is used?
None. I'm on OSX, so I just assumed Theano used Accelerate (it is also
implied by the documentation). Is there a way to see which backend is used?
--
Which version of BLAS did you install? Different versions have different
performance.
On 18 Oct 2016 22:18, "Bogdan Opanchuk" wrote:
> > For 3x3 filters with 1 channel, it may just be too small for the
> im2col/CorrMM version to show any improvement.
>
> So the variant that I had with straightforward addition of submatrices is
> the best way to go?
> For 3x3 filters with 1 channel, it may just be too small for the
im2col/CorrMM version to show any improvement.
So the variant that I had with straightforward addition of submatrices is
the best way to go?
> Also, are you using openmp or a threaded BLAS? You might want to try
disabling openmp or trying different numbers of BLAS threads.
Also, are you using openmp or a threaded BLAS? You might want to try
disabling openmp or trying different numbers of BLAS threads.
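For example (a minimal sketch; `openmp` is a Theano config flag, and
OMP_NUM_THREADS / OPENBLAS_NUM_THREADS are honored by common threaded BLAS
builds such as OpenBLAS and MKL; adjust to whichever BLAS you actually link):

import os
# Set thread counts before importing theano/numpy so the BLAS picks them up.
os.environ['OMP_NUM_THREADS'] = '1'          # generic OpenMP thread count
os.environ['OPENBLAS_NUM_THREADS'] = '1'     # OpenBLAS-specific override
os.environ['THEANO_FLAGS'] = 'openmp=False'  # disable Theano's own OpenMP paths
import theano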
On Tuesday, October 18, 2016 at 9:38:56 AM UTC-7, Jesse Livezey wrote:
>
> I think that nn.conv2d() will have the best improvement when the
> inner-product dimension is large, i.e. when filter_x * filter_y *
> n_channels is large.
I think that nn.conv2d() will have the best improvement when the
inner-product dimension is large, i.e. when filter_x * filter_y *
n_channels is large. For 3x3 filters with 1 channel, it may just be too
small for the im2col/CorrMM version to show any improvement.
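To put a number on it (illustrative arithmetic only): im2col turns the
convolution into a GEMM whose reduction dimension is filter_x * filter_y *
n_channels.

# Inner-product (reduction) dimension of the im2col GEMM:
k_small = 3 * 3 * 1    # = 9: too small to amortize the im2col copy
k_large = 3 * 3 * 256  # = 2304: large enough for GEMM to win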
On Monday, October 17, 2016 at
The performance difference between nnet.conv2d() and nnet.conv.conv2d()
seems to stay about the same for 100x100 matrices.
The profile is as follows:
Apply
-----
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
  88.0%   88.0%      18.591s       1.84e-01s     101    5  CorrMM{valid, (1, 1), (1, 1)}(InplaceDimShuffle
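For reference, a profile like the excerpt above can be produced like this
(a minimal sketch, not the script that generated those numbers):

import theano
import theano.tensor as T

x = T.matrix('x')
f = theano.function([x], (x * 2).sum(), profile=True)
f([[1.0, 2.0], [3.0, 4.0]])
f.profile.summary()  # prints the per-Apply <% time>/<#call> table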
That's intriguing; I'd be curious to see a profile. Maybe for large
images it is actually worse.
One thing that might help would be to make `u` a shared variable and to
update it in `lap_and_err`; you may save a memory copy, but that may not
be a big deal.
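Something along these lines (a rough sketch; the update expression for `u`
is a placeholder, since I don't have the full script):

import numpy as np
import theano

# Keep `u` in a shared variable so it stays on Theano's side between calls.
u = theano.shared(np.zeros((100, 100), dtype=theano.config.floatX), name='u')

# Placeholder standing in for the actual Laplacian/error expression on `u`.
new_u = u * 0.5
err = abs(new_u - u).max()

# The update happens inside the compiled function; no input/output copy of u.
lap_and_err = theano.function([], err, updates=[(u, new_u)])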
On Mon, Oct 17, 2016, Bogdan Opanchuk wrote:
Hi Pascal,
Thanks for the suggestion. Paradoxically though, I get 4 times worse
performance with nnet.conv2d: 21.3s vs 5.7s with the old nnet.conv.conv2d.
The new function constructor that I have:
from theano.tensor.nnet import conv2d
...
def prepare_function_conv(dxd, dyd):
flt = theano
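(The snippet is cut off above. For the record, it is along these lines; the
5-point stencil and the shapes here are illustrative assumptions, not the
exact code:)

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet import conv2d

def prepare_function_conv(dxd, dyd):
    # Hypothetical 5-point Laplacian stencil; dxd/dyd taken as grid steps.
    stencil = np.array(
        [[0.0, 1.0 / dyd**2, 0.0],
         [1.0 / dxd**2, -2.0 / dxd**2 - 2.0 / dyd**2, 1.0 / dxd**2],
         [0.0, 1.0 / dyd**2, 0.0]], dtype=theano.config.floatX)
    flt = theano.shared(stencil.reshape(1, 1, 3, 3), name='flt')
    u = T.tensor4('u')  # conv2d expects (batch, channels, rows, cols)
    lap = conv2d(u, flt, border_mode='valid')  # GEMM-based CorrMM on CPU
    return theano.function([u], lap)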
Hi,
signal.conv2d uses a legacy implementation of convolution that is
significantly slower than some alternatives.
If you can call theano.tensor.nnet.conv2d instead, you could benefit
from a better implementation (based on GEMM).
On Sun, Oct 16, 2016, Bogdan Opanchuk wrote:
> Hello,
>
> I have