"MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 73239"
The preallocation is VERY wrong. This is why the computation is so slow; this
number should be zero.
> On Dec 12, 2022, at 10:20 PM, 김성익 wrote:
>
> Following your comments,
> I checked by using '-info'.
>
> As
On Mon, Dec 12, 2022 at 11:54 PM 김성익 wrote:
>
> With the following example
> https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c
>
> There are some questions about MatPreallocator.
>
> 1. In a parallel run, should all the MPI ranks perform the same
> preallocator procedure?
>
In parallel,
With the following example
https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c
There are some questions about MatPreallocator.
1. In a parallel run, should all the MPI ranks perform the same preallocator
procedure?
2. In ex230.c, the difference between ex1 of ex230.c and ex2 of ex230.c is
Following your comments,
I checked by using '-info'.
As you suspected, most elements are being computed on the wrong MPI rank.
Also, there are a lot of stashed entries.
Should I partition the domain at the problem-definition stage?
Or is a proper preallocation sufficient?
[0] PetscCommDuplicate():
Since you run with multiple ranks, you should use matrix type mpiaij and
MatMPIAIJSetPreallocation. If preallocation is difficult to estimate, you
can use MatPreallocator, see an example at
https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c
--Junchao Zhang
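A minimal sketch of the two-pass MatPreallocator workflow referenced above
(the matrix size and the diagonal-only insertion pattern are illustrative
placeholders, not taken from ex230.c): each rank first inserts its entries
into a MATPREALLOCATOR matrix, which records only the nonzero pattern, and
then MatPreallocatorPreallocate() sizes the real MPIAIJ matrix exactly.

```c
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      preall, A;
  PetscInt n = 100, rstart, rend, i;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));

  /* Pass 1: a MATPREALLOCATOR matrix that only records the pattern. */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &preall));
  PetscCall(MatSetType(preall, MATPREALLOCATOR));
  PetscCall(MatSetSizes(preall, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetUp(preall));
  PetscCall(MatGetOwnershipRange(preall, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(MatSetValue(preall, i, i, 1.0, INSERT_VALUES));
  PetscCall(MatAssemblyBegin(preall, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(preall, MAT_FINAL_ASSEMBLY));

  /* Pass 2: preallocate the real MPIAIJ matrix from the recorded pattern. */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetType(A, MATMPIAIJ));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatPreallocatorPreallocate(preall, PETSC_TRUE, A));
  PetscCall(MatDestroy(&preall));

  /* Insert the actual values; no mallocs should occur during assembly now. */
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(MatSetValue(A, i, i, 1.0, INSERT_VALUES));
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```

The cost of the first pass is small because the preallocator stores only the
pattern, not the values.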
On Mon, Dec 12, 2022 at
The problem is possibly due to most elements being computed on the "wrong" MPI
rank, thus requiring almost all the matrix entries to be "stashed" when
computed and then sent off to the owning MPI rank. Please send ALL the output
of a parallel run with -info so we can see how much
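The stashing Barry describes can be avoided if each rank inserts only into
rows it owns, which MatGetOwnershipRange() reports. A hedged sketch, with an
illustrative 1D tridiagonal stencil and guessed per-row preallocation bounds
standing in for the real FEM connectivity:

```c
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat      A;
  PetscInt n = 1000, rstart, rend, i;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetType(A, MATMPIAIJ));
  /* Upper bounds on nonzeros per row: 3 in the diagonal block, 2 in the
   * off-diagonal block (placeholders for a 1D tridiagonal stencil). */
  PetscCall(MatMPIAIJSetPreallocation(A, 3, NULL, 2, NULL));

  /* Insert only into locally owned rows, so nothing needs to be stashed
   * and sent to another rank during assembly. */
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) {
    if (i > 0) PetscCall(MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES));
    PetscCall(MatSetValue(A, i, i, 2.0, INSERT_VALUES));
    if (i < n - 1) PetscCall(MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```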
Hello Mark,
Following your comments,
I did run with '-info' and the outputs are as below
[image: image.png]
The global matrix seems to have been preallocated well enough.
And, as I said in the former email, if I run this code with MPI,
it takes about 70,000 secs.
In this case, what is the problem?
Hi Hyung,
First, verify that you are preallocating correctly.
Run with '-info' and grep on "alloc" in the large output that you get.
You will see lines like "number of mallocs in assembly: 0". You want 0.
Do this with one processor and then with 8.
I don't understand your loop. You are iterating over
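The check Mark describes might look like the following (the executable name
./my_fem is a placeholder, not from the thread):

```shell
# Run once on 1 rank and once on 8 ranks, then grep the -info output
# for preallocation-related lines.
./my_fem -info 2>&1 | grep -i alloc
mpiexec -n 8 ./my_fem -info 2>&1 | grep -i alloc
# You want the reported number of mallocs during assembly to be 0;
# anything nonzero means the preallocation is being exceeded.
```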
Hello,
I need some keywords or examples for parallelizing the matrix assembly
process.
My current state is as below.
- Finite element analysis code for Structural mechanics.
- Problem size: 3D solid hexahedral elements (number of elements: 125,000),
number of degrees of freedom: 397,953
- Matrix