It would be very helpful if you could run on 1 and 2 ranks with -log_view and 
send all the output.
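For example (assuming the executable built from your main.cpp is named ./main; adjust the name and any arguments to match your setup):

```shell
# Run the same solve on 1 and 2 MPI ranks with PETSc performance logging.
# -log_view prints a per-event breakdown (time, flops, messages, reductions)
# at PetscFinalize(), which will show where the extra time is going.
mpiexec -n 1 ./main -log_view > log.1rank.txt 2>&1
mpiexec -n 2 ./main -log_view > log.2rank.txt 2>&1
```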


> On Sep 8, 2023, at 4:52 PM, Chris Hewson <[email protected]> wrote:
> 
> Hi There,
> 
> I am trying to solve a linear problem and am seeing an issue where KSPSolve 
> slows down considerably the more MPI processes I add.
> 
> The matrix is 620100 x 620100 with ~5 million non-zero entries. I am using 
> PETSc version 3.19.5 and have tried a couple of different versions of MPICH, 
> getting the same behavior (v4.1.2 w/ device ch4:ofi and v3.3.2 w/ 
> ch3:sock).
> 
> In testing, I've noticed the following trend for speed for the KSPSolve 
> function call:
> 1 core: 4042 ms
> 2 cores: 7085 ms
> 4 cores: 26573 ms
> 8 cores: 65745 ms
> 16 cores: 149283 ms
> 
> This was all done on a single-node machine w/ 16 non-hyperthreaded cores. We 
> solve quite a few different matrices with PETSc using MPI and haven't noticed 
> an impact like this on performance before.
> 
> I am a little stumped as to why this is happening. I have been using the 
> KSPBCGS solver and have tried multiple different solvers and preconditioners 
> (we usually don't use a preconditioner for this part of our code).
> 
> Using the pipelined BCGS solver did improve the parallel speed slightly 
> (maybe 15%), but it still doesn't come close to the single-process speed.
> 
> I'll attach a link to a folder that contains the specific A matrix and the x 
> and b vectors for this problem, as well as the main.cpp file I was using for 
> testing.
> 
> https://drive.google.com/drive/folders/1CEDinKxu8ZbKpLtwmqKqP1ZIDG7JvDI1?usp=sharing
> 
> I originally observed this in our main code base (not included here), with 
> very similar timings to the ones above. There we use METIS to do the graph 
> partitioning ourselves, and I checked that the vector and matrix 
> partitioning all made sense. I may be doing the partitioning incorrectly in 
> the example, though (I'm not 100% sure how it works with the viewer/load 
> functions).
> 
> Any insight or thoughts on this would be greatly appreciated.
> 
> Thanks,
> 
> Chris Hewson
> Senior Reservoir Simulation Engineer
> ResFrac
> +1.587.575.9792
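Regarding the viewer/load question: MatLoad()/VecLoad() choose the parallel layout themselves (contiguous blocks of rows per rank) unless you set the local sizes before loading, so the standalone example does not need METIS. A minimal sketch of the load-and-solve path, assuming the binary files are named A.bin and b.bin (placeholder names) and were written with PETSc's binary viewer:

```c
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat         A;
  Vec         b, x;
  KSP         ksp;
  PC          pc;
  PetscViewer viewer;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* MatLoad()/VecLoad() distribute rows in contiguous blocks across ranks
     unless MatSetSizes()/VecSetSizes() is called before the load. */
  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "A.bin", FILE_MODE_READ, &viewer));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatLoad(A, viewer));
  PetscCall(PetscViewerDestroy(&viewer));

  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "b.bin", FILE_MODE_READ, &viewer));
  PetscCall(VecCreate(PETSC_COMM_WORLD, &b));
  PetscCall(VecSetFromOptions(b));
  PetscCall(VecLoad(b, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscCall(VecDuplicate(b, &x));

  /* BCGS with no preconditioner, overridable from the command line
     (e.g. -ksp_type pipebcgs -log_view). */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPBCGS));
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCNONE));
  PetscCall(KSPSetFromOptions(ksp));
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(MatDestroy(&A));
  PetscCall(VecDestroy(&b));
  PetscCall(VecDestroy(&x));
  PetscCall(PetscFinalize());
  return 0;
}
```

With this structure the -log_view output will attribute time to MatMult, VecDot, and the KSPSolve event separately, which is what we need to see.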
