Hi all,
I would like to ask whether anyone has experience with Threads as they are
currently implemented in the master branch.
After a successful compilation (with JULIA_THREADS=1 added to Make.user),
I have played with different levels of granularity, but the code was usually
slower than, or about the same speed as, the single-threaded version. I
have even tried a deliberately trivial example like this:

    using Base.Threads

    function one()
        x = randn(1000000)
        f = 0
        for i in x
            f += i
        end
    end

    function two()
        x = randn(1000000)
        f = zeros(nthreads())
        @inbounds @threads for i in 1:length(x)
            f[threadid()] += x[i]
        end
        sum(f)
    end

    one()
    @time one()
    two()
    @time two()

and the times on my 2013 MacBook Air were (one() first, then two()):
0.068617 seconds (2.00 M allocations: 38.157 MB, 9.72% gc time)
0.394164 seconds (5.72 M allocations: 99.015 MB, 5.00% gc time)
Wow, that is quite poor. I would expect some overhead, but not this much.
Can anyone suggest what is going wrong? I have tried the profiler, but
it does not help; it seems that it does not work with Threads at the
moment. Or is it because Threads are still not really supported?
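In case it helps narrow things down, here is a variant I could compare against. It is only a sketch under my own assumptions, not a confirmed fix: I am guessing that three things may distort the comparison above, namely that the accumulator in one() starts as the integer 0 and then becomes a Float64, that two() writes to the shared array f on every iteration, and that both functions also time the call to randn. The names sum_serial and sum_threaded are mine, and the chunking scheme assumes x is an ordinary 1-based Vector{Float64}.

```julia
using Base.Threads

# Serial baseline: the accumulator starts as 0.0, so it stays a Float64.
function sum_serial(x)
    s = 0.0
    for v in x
        s += v
    end
    return s
end

# Threaded variant: one contiguous chunk per thread, a local accumulator
# inside each chunk, and a single write to the shared array per chunk.
function sum_threaded(x)
    nchunks = nthreads()
    partial = zeros(nchunks)
    n = length(x)
    @threads for k in 1:nchunks
        lo = div((k - 1) * n, nchunks) + 1   # first index of chunk k
        hi = div(k * n, nchunks)             # last index of chunk k
        s = 0.0
        @inbounds for i in lo:hi
            s += x[i]
        end
        partial[k] = s
    end
    return sum(partial)
end

x = randn(1000000)                 # generated once, outside the timed calls
sum_serial(x); sum_threaded(x)     # warm-up so compilation is not timed
@time sum_serial(x)
@time sum_threaded(x)
```

With a single thread, sum_threaded does the same work as sum_serial plus the threading overhead, so any speed-up would only show when more than one thread is available.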
I would like to get the speed-up shown in this video:
https://www.youtube.com/watch?v=GvLhseZ4D8M
Any suggestions are welcome.
Tomas