Re: [julia-users] @threads not providing as big speedup as expected

Tim Holy Mon, 29 Aug 2016 15:51:26 -0700

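Just noticed that you're allocating memory on each iteration: every slice such
as inv_cl[:,:,l] makes a copy, and each matrix product creates a new temporary
array. If you have the patience to write out all those matrix operations
explicitly, it should help. Alternatively, perhaps try ParallelAccelerator.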
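
For illustration, a rough sketch of what "writing out the matrix operations
explicitly" might look like for the loop quoted below. The names trace4,
myloop_explicit! and test_explicit are hypothetical and not from the original
code. The sketch relies on the identity trace(A*B*A*C) = sum over a,c of
(A*B)[a,c] * (A*C)[c,a], so it only ever works with scalars and never builds
a slice or a temporary matrix. It has not been timed, so treat any speedup as
something to measure rather than assume.

```julia
# Hypothetical helper: trace(A*B*A*C) where A = inv_cl[:,:,l],
# B = di[:,:,l] and C = dj[:,:,l], computed with scalar loops only.
function trace4(inv_cl, di, dj, l)
    n = size(inv_cl, 1)
    s = 0.0
    for a in 1:n, c in 1:n
        m = 0.0   # accumulates (A*B)[a,c]
        p = 0.0   # accumulates (A*C)[c,a]
        for b in 1:n
            m += inv_cl[a,b,l] * di[b,c,l]
            p += inv_cl[c,b,l] * dj[b,a,l]
        end
        s += m * p
    end
    return s
end

# Hypothetical variant of myloop: same result, but each (i,j) entry is
# accumulated in a local scalar and written to fish once at the end.
function myloop_explicit!(fish, inv_cl, d_cl, ijs, nl)
    Threads.@threads for ij in ijs
        i, j = ij
        di = d_cl[i]   # look the arrays up once per (i,j), not once per l
        dj = d_cl[j]
        acc = 0.0
        for l in 1:nl
            acc += (2*l + 1) / 2 * trace4(inv_cl, di, dj, l)
        end
        fish[i,j] += acc
    end
    return fish
end

# Same setup as the quoted test function, calling the explicit loop.
function test_explicit(nl, np)
    inv_cl = ones(3, 3, nl)
    d_cl = Dict(i => ones(3, 3, nl) for i in 1:np)
    fish = zeros(np, np)
    ijs = [(i, j) for i in 1:np, j in 1:np]
    return myloop_explicit!(fish, inv_cl, d_cl, ijs, nl)
end
```

Calling test_explicit(3000, 40) with and without the Threads.@threads prefix
gives a like-for-like comparison against the timings quoted below.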


Best,
--Tim

On Monday, August 29, 2016 10:49:40 AM CDT Marius Millea wrote:
> Thanks, just tried wrapping the for loop inside a function and it seems to
> make the @threads version slightly slower and serial version slightly
> faster, so I'm even further from the speedup I was hoping for! Reading
> through that Issue and linked ones, I guess I may not be the only one
> seeing this.
> 
> For ref, what I did:
> 
> function myloop(inv_cl,d_cl,fish,ijs,nl)
>     @threads for ij in ijs
>         i,j = ij
>         for l in 1:nl
>             fish[i,j] += (2*l+1)/2*trace(inv_cl[:,:,l]*d_cl[i][:,:,l]*
>                                          inv_cl[:,:,l]*d_cl[j][:,:,l])
>         end
>     end
> end
> 
> function test(nl,np)
>     inv_cl = ones(3,3,nl)
>     d_cl = Dict(i => ones(3,3,nl) for i=1:np)
> 
>     fish = zeros(np,np)
>     ijs = [(i,j) for i=1:np, j=1:np]
> 
>     myloop(inv_cl,d_cl,fish,ijs,nl)
> end
> 
> # with @threads
> @timeit test(3000,40)
> 1 loops, best of 3: 3.84 s per loop
> 
> # without @threads
> @timeit test(3000,40)
> 1 loops, best of 3: 2.33 s per loop
> 
> On Monday, August 29, 2016 at 6:50:15 PM UTC+2, Tim Holy wrote:
> > Very quickly (train to catch!): try this
> > https://github.com/JuliaLang/julia/issues/17395#issuecomment-241911387
> > and see if it helps.
> > 
> > --Tim
> > 
> > On Monday, August 29, 2016 9:22:09 AM CDT Marius Millea wrote:
> > > I've parallelized some code with @threads, but instead of a factor NCPUs
> > > speed improvement (for me, 8), I'm seeing rather a bit under a factor 2. I
> > > suppose the answer may be that my bottleneck isn't computation, rather
> > > memory access. But during running the code, I see my CPU usage go to 100%
> > > on all 8 CPUs, if it were memory access would I still see this? Maybe the
> > > answer is yes, in which case memory access is likely the culprit; is there
> > > some way to confirm this though? If no, how do I figure out what *is* the
> > > culprit?
> > > 
> > > Here's a stripped down version of my code,
> > > 
> > > 
> > > function test(nl,np)
> > >     inv_cl = ones(3,3,nl)
> > >     d_cl = Dict(i => ones(3,3,nl) for i=1:np)
> > > 
> > >     fish = zeros(np,np)
> > >     ijs = [(i,j) for i=1:np, j=1:np]
> > > 
> > >     Threads.@threads for ij in ijs
> > >         i,j = ij
> > >         for l in 1:nl
> > >             fish[i,j] += (2*l+1)/2*trace(inv_cl[:,:,l]*d_cl[i][:,:,l]*
> > >                                          inv_cl[:,:,l]*d_cl[j][:,:,l])
> > >         end
> > >     end
> > > end
> > > 
> > > 
> > > # with the @threads
> > > @timeit test(3000,40)
> > > 1 loops, best of 3: 3.17 s per loop
> > > 
> > > # now remove the @threads from above
> > > @timeit test(3000,40)
> > > 1 loops, best of 3: 4.42 s per loop
> > > 
> > > 
> > > 
> > > Thanks.
