1000 by 1000 (sparse presumably) is way too small for scaling studies. With 
sparse matrices you want at least 10,000-20,000 unknowns per MPI process. 
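
To put numbers on that rule of thumb, here is a minimal sketch of the arithmetic in plain Python. The helper names are made up for illustration and are not part of PETSc or petsc4py; the 10,000 default is simply the lower end of the range quoted above.

```python
def unknowns_per_rank(n_unknowns, n_ranks):
    """Average number of unknowns (matrix rows) each MPI process owns."""
    return n_unknowns / n_ranks


def max_useful_ranks(n_unknowns, min_per_rank=10_000):
    """Largest rank count that still leaves min_per_rank unknowns per process."""
    return max(1, n_unknowns // min_per_rank)


if __name__ == "__main__":
    # The case from the quoted message: 1000 unknowns on 64 MPI processes.
    print(unknowns_per_rank(1000, 64))   # 15.625 unknowns per process
    print(max_useful_ranks(1000))        # 1, i.e. a single process is enough
```

With about 16 unknowns per process, communication and startup overhead will dominate anything the solver actually computes.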

     What is the "Number of steps" in the table? Is it the number of Krylov 
iterations? If so, that is also problematic because you are comparing two things 
at once: the work is done in parallel, but there is also much more work because 
there are many more iterations.
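
One way to separate the two effects is to compare time per iteration rather than total time. The sketch below is plain Python; the run labels and numbers are made-up placeholders, not values taken from the attached logs.

```python
def time_per_iteration(total_seconds, iterations):
    """Normalize a solve time by the number of Krylov iterations it took."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    return total_seconds / iterations


if __name__ == "__main__":
    # Hypothetical (total seconds, iteration count) pairs for two configurations.
    runs = {
        "config A": (12.0, 100),
        "config B": (30.0, 400),
    }
    for label, (seconds, its) in runs.items():
        per_it = time_per_iteration(seconds, its)
        print(f"{label}: {per_it:.3f} s per iteration")
```

In this made-up case config B takes longer overall but is faster per iteration, so the extra time would come from the iteration count and not from the parallel configuration.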

   Barry

> On Jun 16, 2017, at 7:57 AM, Damian Kaliszan <[email protected]> wrote:
> 
> Hi,
> 
> For  several  days  I've been trying to figure out what is going wrong
> with my python app timings solving Ax=b with KSP (GMRES) solver when trying 
> to run on Intel's KNL 7210/7230.
> 
> I  downsized  the  problem  to  1000x1000 A matrix and a single node and
> observed the following:
> 
> <int_1.jpg>
> I'm attaching 2 extreme timings where configurations differ only by 1 OMP 
> thread (64MPI/1 OMP vs 64/2 OMPs),
> 23321 vs 23325 slurm task ids.
> 
> Any help will be appreciated....
> 
> Best,
> Damian
> <slurm-23321.out><slurm-23325.out>
