As an example, the data looks like this:
using TimeIt
v = rand(3)
r = rand(6000,3)
x = linspace(1.0, 2.0, 300) * (v./sqrt(sumabs2(v)))'
# Julia 0.4 function
function s04(xl, rl)
    result = zeros(size(xl,1))
    for i = 1:size(xl,1)
        dotprods = rl * xl[i,:]'
        # 10000 loops, best of 3: 17.66 µs per loop
        imexp = exp(im .* dotprods)
        # 1000 loops, best of 3: 172.33 µs per loop
        sumprod = sum(imexp) * sum(conj(imexp))
        # 10000 loops, best of 3: 21.04 µs per loop
        result[i] = sumprod
    end
    return result
end
and using @timeit s04(x,r) gives
10 loops, best of 3: *67.52 ms* per loop
where most of the time is spent in the exp() calls. Now in 0.5dev, the individual
parts have similar or even better timings, for example the dot product:
# Julia 0.5 function
function s05(xl, rl)
    result = zeros(size(xl,1))
    for i = 1:size(xl,1)
        dotprods = rl * xl[i,:]
        # 10000 loops, best of 3: 10.99 µs per loop
        imexp = exp(im .* dotprods)
        # 1000 loops, best of 3: 158.50 µs per loop
        sumprod = sum(imexp) * sum(conj(imexp))
        # 10000 loops, best of 3: 21.81 µs per loop
        result[i] = sumprod
    end
    return result
end
but @timeit s05(x,r) consistently gives a runtime about 70% worse:
10 loops, best of 3: *113.80 ms* per loop
This is reproducible on my Fedora 23 workstation: the individual calls inside the
function are slightly faster in 0.5dev, but the whole function
is slower. Oddly enough, this happens only on the Fedora workstation. On an OS X
laptop, the 0.5dev speedups of the parts inside the loop translate into
the expected speedup for the whole function.
That puzzles me. Could someone try to reproduce this with the above
function and input on a Linux system, preferably also Fedora?
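One way to check whether the per-step numbers still hold inside the loop is to time the steps in place rather than separately. The following is only a minimal sketch, not the code I benchmarked above: the name s_steps is hypothetical, and for portability across Julia versions it assumes vec(xl[i,:]) for the row and a comprehension in place of exp(im .* dotprods), so the absolute numbers will differ from the @timeit figures.

```julia
# Hypothetical helper (not part of the benchmarks above): same loop as
# s04/s05, but accumulating the wall-clock time spent in each step.
function s_steps(xl, rl)
    result = zeros(size(xl, 1))
    t_dot = t_exp = t_sum = 0.0                  # accumulated nanoseconds per step
    for i = 1:size(xl, 1)
        t0 = time_ns()
        dotprods = rl * vec(xl[i, :])            # vec() gives a column vector for a row slice
        t1 = time_ns()
        imexp = [exp(im * d) for d in dotprods]  # assumption: comprehension instead of exp(im .* dotprods)
        t2 = time_ns()
        # real() only drops the zero imaginary part of s * conj(s)
        result[i] = real(sum(imexp) * sum(conj(imexp)))
        t3 = time_ns()
        t_dot += t1 - t0
        t_exp += t2 - t1
        t_sum += t3 - t2
    end
    return result, (t_dot, t_exp, t_sum)
end
```

Calling res, t = s_steps(x, r) on both versions and comparing the three entries of t should show which step, if any, accounts for the difference on the Fedora machine. If all three steps look fine, the overhead is more likely in allocation or garbage collection between them.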
cheers, Johannes
On Friday, February 26, 2016 at 4:28:05 PM UTC+1, Kristoffer Carlsson wrote:
>
> What code and where is it spending time? You talk about openblas, does it
> mean that blas got slower for you? How about peakflops() on the different
> versions?
>
> On Friday, February 26, 2016 at 4:08:06 PM UTC+1, Johannes Wagner wrote:
>>
>> hey guys,
>> I just experienced something weird. I have some code that runs fine on
>> 0.43, then I updated to 0.5dev to test the new Arrays, ran the same code and
>> noticed it got about 50% slower. Then I downgraded back to 0.43, ran the
>> old code, but the speed remained slow. I noticed while reinstalling 0.43 that
>> openblas-threads didn't get installed along with it. So I manually
>> installed it, but no change.
>> Does anyone have an idea what could be going on? LLVM on Fedora 23 is 3.7
>>
>> Cheers, Johannes
>>
>