Take a look at the discussion in
https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html
and I suggest you run the streams benchmark from the branch
barry/2023-09-15/fix-log-pcmpi on your machine to get a baseline for what kind
of speedup you can expect.
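For reference, a rough sketch of the steps (assuming a PETSc source checkout; the exact
configure options and process count depend on your machine):

    cd $PETSC_DIR
    git fetch
    git checkout barry/2023-09-15/fix-log-pcmpi
    ./configure               # with your usual options
    make all                  # or the make command that configure prints
    make streams NPMAX=4      # NPMAX = largest number of MPI ranks to test

The streams output shows how the achievable memory bandwidth scales with the number of
MPI ranks; sparse matrix-vector products are memory-bandwidth limited, so that scaling
bounds the speedup you can expect.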
Then let us know your thoughts.
Barry
> On Jan 11, 2024, at 11:37 AM, Stefano Zampini <[email protected]>
> wrote:
>
> You are creating the matrix on the wrong communicator if you want it
> parallel. You are using PETSc.COMM_SELF.
>
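> A minimal sketch of the change (example sizes, not the attached script): build the
> matrix and vectors on PETSc.COMM_WORLD so the rows are split across the MPI ranks, e.g.
>
>     from petsc4py import PETSc
>
>     comm = PETSc.COMM_WORLD                 # parallel communicator, not COMM_SELF
>     n = 100000                              # global matrix size (example value)
>
>     # AIJ matrix with rows distributed across the MPI ranks
>     A = PETSc.Mat().createAIJ([n, n], nnz=1, comm=comm)
>     rstart, rend = A.getOwnershipRange()
>     for i in range(rstart, rend):           # fill only the locally owned rows
>         A.setValue(i, i, 1.0)
>     A.assemble()
>
>     x, y = A.createVecs()                   # vectors on the same communicator
>     x.set(1.0)
>     A.mult(x, y)                            # parallel matrix-vector product
>
> With mpiexec -n 4, each rank then owns roughly n/4 consecutive rows and only
> multiplies its own part.
>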
> On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | Universitaet Bremen
> <[email protected]> wrote:
>> Hi all,
>>
>> I'm trying to do repeated matrix-vector multiplications of large sparse
>> matrices in Python using petsc4py. Even the simplest method of
>> parallelization, dividing up the calculation to run on multiple processes
>> independently, does not seem to give a significant speedup for large
>> matrices. I constructed a minimal working example, which I run using
>>
>> mpiexec -n N python parallel_example.py,
>>
>> where N is the number of processes. Instead of taking approximately the same
>> time irrespective of the number of processes used, the calculation is much
>> slower when starting more MPI processes. This translates to little to no
>> speedup when splitting a fixed number of calculations over N processes.
>> As an example, running with N=1 takes 9s, while running with N=4 takes 34s.
>> When running with smaller matrices, the problem is not as severe (only
>> slower by a factor of 1.5 when setting MATSIZE=1e+5 instead of
>> MATSIZE=1e+6). I get the same problem when just starting the script four
>> times manually without using MPI.
>> I attached both the script and the log file for running the script with N=4.
>> Any help would be greatly appreciated. Calculations are done on my laptop
>> running Arch Linux (kernel 6.6.8) with PETSc version 3.20.2.
>>
>> Kind Regards
>> Steffen
>>