I still had a shared-memory model in mind, which is of course totally wrong 
for Julia. There will be nworkers() copies of A in cache instead of one, so 
I believe the problem is indeed memory bandwidth. A quick test with a 
CPU-intensive example does give the expected scaling.
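
For anyone who wants to repeat that kind of check, here is a rough sketch of the comparison. It is not the exact test I ran: the kernel names, the problem sizes, and the choice of 2 workers are all made up for illustration, and it assumes Julia 1.x, where the parallel functions live in the Distributed standard library.

```julia
using Distributed

# Assumption: 2 local worker processes; adjust to your core count.
nworkers() == 1 && addprocs(2)

# Hypothetical kernels, defined on every process.
@everywhere begin
    # CPU-bound: lots of arithmetic on scalars, almost no memory traffic.
    function cpu_kernel(n)
        s = 0.0
        for i in 1:n
            s += sin(i) * cos(i)
        end
        return s
    end

    # Memory-bound: each call allocates and streams through its own array,
    # so every worker process touches a separate copy of the data.
    function mem_kernel(n)
        A = ones(n)
        return sum(A)
    end
end

# Time `ntasks` calls of f(n), first serially, then spread over the workers,
# and return the observed speedup.
function compare(f, n, ntasks)
    # Run once locally and via pmap so compilation is not part of the timing.
    f(n)
    pmap(f, fill(n, nworkers()))
    t_serial   = @elapsed map(f, fill(n, ntasks))
    t_parallel = @elapsed pmap(f, fill(n, ntasks))
    return t_serial / t_parallel
end

println("cpu-bound speedup:    ", compare(cpu_kernel, 10^7, 8))
println("memory-bound speedup: ", compare(mem_kernel, 10^7, 8))
```

If the diagnosis is right, the first number should come out close to nworkers() and the second noticeably lower, but the actual values depend on the machine.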

Thanks,
Tom
