Re: [julia-users] Re: Why is BLAS.dot slower than BLAS.axpy!?

2016-09-11 Thread Andreas Noack
I get similar results for OpenBLAS. I expect that axpy gains more from vectorization than dot.
On Fri, Sep 9, 2016 at 5:31 PM, Sheehan Olver wrote:
> I did blas_set_num_threads(1) with the same profile numbers. This is using Apple’s BLAS.
>
> Maybe I’ll try 0.5 and

Re: [julia-users] Re: Why is BLAS.dot slower than BLAS.axpy!?

2016-09-09 Thread Sheehan Olver
I did blas_set_num_threads(1) with the same profile numbers. This is using Apple’s BLAS. Maybe I’ll try 0.5 and OpenBLAS for comparison.
> On 10 Sep 2016, at 2:34 AM, Andreas Noack wrote:
>
> Try to time it again with threading disabled. Sometimes the

[julia-users] Re: Why is BLAS.dot slower than BLAS.axpy!?

2016-09-09 Thread Andreas Noack
Try to time it again with threading disabled. Sometimes the threading heuristics can cause unintuitive performance.
On Friday, September 9, 2016 at 6:39:13 AM UTC-4, Sheehan Olver wrote:
> I have the following code that is part of a Householder routine, where
> j::Int64,
> N::Int64,