Yes, you're right. I forgot to mention that the multi-threaded system was also using a static OpenMP schedule (#pragma omp parallel for schedule(static)), so its row partitioning was just as fixed as the MPI one.
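For concreteness, here is a rough sketch of the two schedules being discussed, applied to a row-parallel CSR sparse matrix-vector product. This is not the actual taco or PETSc code; the function names (spmv_static, spmv_guided) and the CSR argument layout are made up for illustration. Compile with OpenMP enabled (e.g. -fopenmp on GCC/Clang); without it the pragmas are ignored and the loops run serially.

```c
#include <assert.h>

/* y = A*x for an n-row CSR matrix (hypothetical layout: rowptr/col/val).
 * schedule(static): rows are cut into equal-sized contiguous chunks up
 * front, one per thread, regardless of how many nonzeros each row holds. */
void spmv_static(int n, const int *rowptr, const int *col,
                 const double *val, const double *x, double *y)
{
    assert(n >= 0);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            sum += val[k] * x[col[k]];
        y[i] = sum;
    }
}

/* Same kernel, but schedule(guided): threads grab chunks of rows at run
 * time, with chunk sizes shrinking as work runs out, so a thread that
 * lands on heavy rows simply takes fewer chunks overall. */
void spmv_guided(int n, const int *rowptr, const int *col,
                 const double *val, const double *x, double *y)
{
    assert(n >= 0);
    #pragma omp parallel for schedule(guided)
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
            sum += val[k] * x[col[k]];
        y[i] = sum;
    }
}
```

Both versions compute the same result; they differ only in how rows are handed out to threads, which matters when the number of nonzeros per row varies a lot.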
Rohan

On Tue, Dec 14, 2021 at 3:26 PM Victor Eijkhout <eijkh...@tacc.utexas.edu> wrote:

> On 2021 Dec 11, at 17:56, Rohan Yadav <roh...@alumni.cmu.edu> wrote:
>
> > 40 mpi ranks on a single node should be similar performance as 40 threads.
> > Both petsc and taco are doing a row-based parallelism strategy so it should
> > line up.
>
> An MPI division of rows is static. Petsc divides strictly by numbers of
> rows.
>
> A thread based system can do things like “schedule(guided)” (OpenMP) and
> get better load balancing if the rows have widely differing numbers of
> nonzero.
>
> Victor.