Take a look at the discussion at https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html. I also suggest you run the streams benchmark from the branch barry/2023-09-15/fix-log-pcmpi on your machine to get a baseline for the kind of speedup you can expect.
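If it helps, running it should look roughly like this (assuming PETSC_DIR points at your PETSc clone and that you rebuild after switching branches; NPMAX caps the number of MPI ranks tried):

    cd $PETSC_DIR
    git checkout barry/2023-09-15/fix-log-pcmpi
    # rebuild PETSc for this branch, then run the bandwidth benchmark on 1..4 ranks
    make streams NPMAX=4

The reported memory bandwidth as a function of the number of ranks is the ceiling on the mat-vec speedup you can hope for.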
Then let us know your thoughts.

Barry

> On Jan 11, 2024, at 11:37 AM, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>
> You are creating the matrix on the wrong communicator if you want it parallel. You are using PETSc.COMM_SELF.
>
> On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | Universitaet Bremen <swilk...@itp.uni-bremen.de <mailto:swilk...@itp.uni-bremen.de>> wrote:
>> Hi all,
>>
>> I'm trying to do repeated matrix-vector multiplication of large sparse matrices in Python using petsc4py. Even the simplest method of parallelization, dividing the calculation across multiple independent processes, does not seem to give a significant speedup for large matrices. I constructed a minimal working example, which I run using
>>
>> mpiexec -n N python parallel_example.py,
>>
>> where N is the number of processes. Instead of taking approximately the same time irrespective of the number of processes used, the calculation is much slower when starting more MPI processes. This translates to little or no speedup when splitting a fixed number of calculations over N processes. As an example, running with N=1 takes 9 s, while running with N=4 takes 34 s. With smaller matrices the problem is not as severe (only slower by a factor of 1.5 when setting MATSIZE=1e+5 instead of MATSIZE=1e+6). I get the same problem when simply starting the script four times manually without using MPI.
>> I attached both the script and the log file from running the script with N=4. Any help would be greatly appreciated. Calculations were done on my laptop, Arch Linux kernel 6.6.8 and PETSc version 3.20.2.
>>
>> Kind Regards
>> Steffen
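For concreteness, here is a minimal sketch of the change Stefano is pointing at: create the matrix on PETSc.COMM_WORLD so its rows are distributed across the ranks, rather than on PETSc.COMM_SELF, which gives every rank its own full copy. The size and the tridiagonal stencil below are placeholders for illustration, not the attached script:

    from petsc4py import PETSc

    comm = PETSc.COMM_WORLD   # distributed; with PETSc.COMM_SELF each rank owns a full copy
    n = 100000                # global matrix size (placeholder for MATSIZE)

    # Parallel sparse AIJ matrix; its rows are split across the ranks of comm.
    A = PETSc.Mat().createAIJ([n, n], comm=comm)
    A.setUp()

    # Each rank fills only the rows it owns (a 1D Laplacian stencil, purely illustrative).
    rstart, rend = A.getOwnershipRange()
    for i in range(rstart, rend):
        A.setValue(i, i, 2.0)
        if i > 0:
            A.setValue(i, i - 1, -1.0)
        if i < n - 1:
            A.setValue(i, i + 1, -1.0)
    A.assemble()

    x, y = A.createVecs()
    x.set(1.0)
    for _ in range(100):      # repeated mat-vec, y = A*x, done collectively by all ranks
        A.mult(x, y)

Run it the same way, mpiexec -n N python parallel_example.py; each rank then owns roughly n/N rows and the multiply is collective. Note that even then, sparse mat-vec is memory-bandwidth bound, which is exactly what the streams numbers above quantify, so the speedup will plateau once the available bandwidth is saturated.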