I noticed the commented-out BLAS.gemm! and BLAS.axpy! lines. Did they help?

-Simon

On Thursday, 21 January 2016 11:12:53 UTC, Viral Shah wrote:
>
> The matrix-vector multiply in there loses the benefit of BLAS once it is
> devectorized. This is one area where we ought to do better, since this
> code is best left vectorized (from a user's perspective).
>
> On my Mac, Python takes 0.27 seconds and Julia 0.4 takes 0.47 seconds.
> Python is perhaps not using a fast BLAS, since it is whatever came with pip.
>
> -viral
>
> On Thursday, January 21, 2016 at 4:22:52 PM UTC+5:30, Kristoffer Carlsson 
> wrote:
>>
>> There is no need to annotate your function argument types so tightly, 
>> unless you have a good reason for it.
>>
>> You will generate a lot of temporaries in your V = ...
>>
>> Rewrite it as a loop and it will be a lot faster. You could also take a 
>> look at the Devectorize.jl package.
>>
>>
