You'll probably need to run blas_set_num_threads(1) on each worker. By default, linear algebra operations in OpenBLAS are multithreaded, and that is likely what is causing the contention.
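Here is a minimal sketch of how that could look, assuming the 0.3-series API mentioned in the question. The test matrix `A` and its size are made up for illustration (picked to match the largest case described), and the timing line is only there so the effect of the thread limit can be compared. It is not meant as a proper benchmark.

```julia
# Julia 0.3-era API, matching the function named above.
# In Julia 0.7 and later the equivalent call is:
#   using LinearAlgebra; BLAS.set_num_threads(1)

# Limit OpenBLAS to a single thread in the current process.
blas_set_num_threads(1)

# If worker processes are in use (started with `julia -p N` or addprocs),
# apply the same setting on every process, including the master.
@everywhere blas_set_num_threads(1)

# Hypothetical test matrix, sized like the largest case in the question.
A = rand(800, 100)

pinv(A)        # warm-up call so compilation is not included in the timing
@time pinv(A)  # compare against the same call without the thread limit
```

Running the same timing with and without the blas_set_num_threads(1) call on the VM should show whether OpenBLAS threading is really what slows things down there.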
On Friday, May 29, 2015 at 5:56:00 AM UTC-7, jojo lalpin wrote:
>
> Hi All,
>
> I have a project doing mainly matrix multiplication and inversion (pinv
> and inv).
> No parallel features are used; julia is launched without additional workers.
> Inverted matrix sizes vary from [50 to 800, 10 to 100].
>
> I develop with the latest stable release, 0.3.8, on a standard desktop
> computer running Ubuntu 14.04.
> The processor is an Intel i5 quad-core, with 8 GB RAM.
> I want to use the code in production on a "server" with 2 octo-core
> processors, using Hyper-V for a virtual machine running the same OS and julia.
> The VM configuration is 16 virtual cores (32 maximum), 64 GB RAM.
>
> On the desktop, the main function call uses 2 of 4 cores at 100% (3.4 GHz),
> with a lot of free memory.
> On the server, the same call fully uses all 16 virtual cores but takes
> 5 times longer to run (on average over thousands of runs).
> The server processor speed is 2.6 GHz, with a lot of free memory.
>
> I would like to understand why it takes all the VM resources only to end up
> slower.
>
> Does anyone have any hints about how julia manages machine resources and how
> to maximise performance for linear algebra?
> Maybe it comes from the virtualization parameters; any hints/links?
>
> Thanks in advance,
>
> Jojo
