Hey everyone,

I've been looking at parallel programming in Julia and have been getting some 
very unexpected results, with rather bad performance as a consequence. 
Sadly, I've run out of ideas about what could be going on: I've disproved 
every hypothesis I had. Hence this post :)

I was able to construct a similar but simpler example that exhibits the same 
behavior (attached file).
The example is a very naive implementation and suboptimal in many ways (the 
actual code is much better optimized), but that's not the issue.

The issue I'm trying to investigate is the big difference in per-worker time 
between a run where a single worker is active and a run where multiple 
workers are active.
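
To make the comparison concrete, here is a minimal sketch of the kind of 
measurement meant above. It is not the attached file. It assumes Julia 1.x 
with the Distributed standard library and 4 local workers; the names 
busy_work and timed_work and the problem size are made up for illustration.

```julia
using Distributed

# Assumption: 4 local workers, matching the 4 processes visible in htop.
nprocs() == 1 && addprocs(4)

# Hypothetical CPU-bound kernel: no allocation, no communication.
@everywhere function busy_work(n)
    s = 0.0
    for i in 1:n
        s += sin(i) * cos(i)
    end
    return s
end

# Time the kernel on the process that runs it, so that transfer and
# scheduling overhead on the master are not part of the measurement.
@everywhere timed_work(n) = @elapsed busy_work(n)

n = 10^7
ws = workers()

# Warm-up call on every worker so compilation is excluded from the timings.
for w in ws
    remotecall_fetch(timed_work, w, 10)
end

# Case 1: a single worker is active.
t_single = remotecall_fetch(timed_work, first(ws), n)

# Case 2: all workers are active at the same time.
futures = [remotecall(timed_work, w, n) for w in ws]
t_multi = [fetch(f) for f in futures]

println("single worker: ", t_single, " s")
println("all workers:   ", t_multi, " s")
```

If the workers really are independent, each entry of t_multi should be 
roughly equal to t_single. Timing inside the worker rather than on the master 
is deliberate: it separates time spent computing from time spent waiting on 
the master process.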

Ideas I disproved:
  - the julia processes are pinned to a single core
  - each julia process uses multiple threads to do the work, so the processes 
are fighting for the cores
  - there are not enough cores on the machine (there are plenty)
  - the processes end up on the same core (htop nicely shows 4 julia 
processes working on different cores)
  - communication at the application level is stalling the workers (there 
is none)

The only explanation I'm left with is that Julia is doing some hidden 
synchronization somewhere.
Any input is appreciated. Thanks in advance.

Kind regards,
Tom

Attachment: parallel-test.jl