As an example, the data looks like this:

using TimeIt
v = rand(3)
r = rand(6000,3)
x = linspace(1.0, 2.0, 300) * (v./sqrt(sumabs2(v)))'

*# Julia 0.4 function*

function s04(xl, rl)
    result = zeros(size(xl, 1))
    for i = 1:size(xl, 1)
        dotprods = rl * xl[i, :]'                # 10000 loops, best of 3: 17.66 µs per loop
        imexp = exp(im .* dotprods)              # 1000 loops, best of 3: 172.33 µs per loop
        sumprod = sum(imexp) * sum(conj(imexp))  # 10000 loops, best of 3: 21.04 µs per loop
        result[i] = sumprod
    end
    return result
end
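As a side note (a possible simplification, not the point of the regression): since sum(conj(imexp)) equals conj(sum(imexp)), the product is just the squared magnitude abs2(sum(imexp)), which also keeps the result real. A hedged sketch in 0.4-era syntax; the name s04b is mine, not from the original code, and vec() is used so the slice works as a plain vector:

```julia
function s04b(xl, rl)
    result = zeros(size(xl, 1))
    for i = 1:size(xl, 1)
        dotprods = rl * vec(xl[i, :])   # same dot products as in s04
        s = sum(exp(im .* dotprods))
        result[i] = abs2(s)             # == sum(imexp) * sum(conj(imexp)), but real-valued
    end
    return result
end
```

In the same spirit, cis(x) computes exp(im*x) directly, which may shave a little off the exp() cost.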

and using @timeit s04(x,r) gives 
10 loops, best of 3: *67.52 ms* per loop
where most of the time is spent in the exp() calls. Now on 0.5dev, the individual
parts have similar or even better timings, e.g. the dot product:

*# Julia 0.5 function*

function s05(xl, rl)
    result = zeros(size(xl, 1))
    for i = 1:size(xl, 1)
        dotprods = rl * xl[i, :]                 # 10000 loops, best of 3: 10.99 µs per loop
        imexp = exp(im .* dotprods)              # 1000 loops, best of 3: 158.50 µs per loop
        sumprod = sum(imexp) * sum(conj(imexp))  # 10000 loops, best of 3: 21.81 µs per loop
        result[i] = sumprod
    end
    return result
end

but @timeit s05(x,r) consistently gives a runtime that is ~70% worse:
10 loops, best of 3: *113.80 ms* per loop

It is always the same on my Fedora 23 workstation: the individual calls inside the
function have slightly better performance on 0.5dev, but the whole function
is slower. Oddly enough, this happens only on my Fedora workstation! On an OS X
laptop, those 0.5dev speedups of the parts inside the loop translate into
the expected speedup for the whole function.
So that puzzles me. Could someone perhaps reproduce this with the above
function and input on a Linux system, preferably also Fedora?
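For convenience, here is the whole thing as one self-contained script (assuming, as above, that the TimeIt package provides the @timeit macro; the timing in the comment is the one reported above for Fedora 23):

```julia
using TimeIt   # provides the @timeit macro

# Input data, exactly as in the example above
v = rand(3)
r = rand(6000, 3)
x = linspace(1.0, 2.0, 300) * (v ./ sqrt(sumabs2(v)))'   # 300×3 matrix

# 0.5dev version (on 0.5, xl[i, :] is a plain vector)
function s05(xl, rl)
    result = zeros(size(xl, 1))
    for i = 1:size(xl, 1)
        dotprods = rl * xl[i, :]
        imexp = exp(im .* dotprods)
        result[i] = sum(imexp) * sum(conj(imexp))
    end
    return result
end

@timeit s05(x, r)   # reported above on Fedora 23: ~113.80 ms per loop
```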

cheers, Johannes

On Friday, February 26, 2016 at 4:28:05 PM UTC+1, Kristoffer Carlsson wrote:
>
> What code and where is it spending time? You talk about openblas, does it 
> mean that blas got slower for you? How about peakflops() on the different 
> versions?
>
> On Friday, February 26, 2016 at 4:08:06 PM UTC+1, Johannes Wagner wrote:
>>
>> hey guys,
>> I just experienced something weird. I have some code that runs fine on 
>> 0.43, then I updated to 0.5dev to test the new Arrays, run same code and 
>> noticed it got about ~50% slower. Then I downgraded back to 0.43, ran the 
>> old code, but speed remained slow. I noticed while reinstalling 0.43, 
>> openblas-threads didn't get installed along with it. So I manually 
>> installed it, but no change. 
>> Does anyone have an idea what could be going on? LLVM on Fedora 23 is 3.7.
>>
>> Cheers, Johannes
>>
>
