Re: [petsc-users] parallelize matrix assembly process

2022-12-13 Thread Barry Smith
"MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 73239" The preallocation is VERY wrong. This is why the computation is so slow; this number should be zero. > On Dec 12, 2022, at 10:20 PM, 김성익 wrote: > > Following your comments, > I checked by using '-info'. > > As

Re: [petsc-users] parallelize matrix assembly process

2022-12-12 Thread Matthew Knepley
On Mon, Dec 12, 2022 at 11:54 PM 김성익 wrote: > > With the following example > https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c > > There are some questions about MatPreallocator. > > 1. In parallel run, all the MPI ranks should do the same preallocator > procedure? > In parallel,
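A rough sketch of the two-pass MATPREALLOCATOR workflow discussed here, loosely following the linked ex230.c. The name nlocal and the element loops are placeholders; each rank inserts only its own local contributions, and does so identically in both passes.

  #include <petscmat.h>

  Mat preall, A;
  /* Pass 1: record the nonzero pattern with a MATPREALLOCATOR matrix. */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &preall));
  PetscCall(MatSetType(preall, MATPREALLOCATOR));
  PetscCall(MatSetSizes(preall, nlocal, nlocal, PETSC_DECIDE, PETSC_DECIDE));
  PetscCall(MatSetUp(preall));
  /* ... element loop calling MatSetValues(preall, ..., ADD_VALUES) ... */
  PetscCall(MatAssemblyBegin(preall, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(preall, MAT_FINAL_ASSEMBLY));

  /* Pass 2: preallocate the real matrix from the recorded pattern, insert again. */
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetType(A, MATAIJ));
  PetscCall(MatSetSizes(A, nlocal, nlocal, PETSC_DECIDE, PETSC_DECIDE));
  PetscCall(MatPreallocatorPreallocate(preall, PETSC_TRUE, A));
  PetscCall(MatDestroy(&preall));
  /* ... same element loop calling MatSetValues(A, ..., ADD_VALUES) ... */
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));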

Re: [petsc-users] parallelize matrix assembly process

2022-12-12 Thread 김성익
With the following example, https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c, there are some questions about MatPreallocator. 1. In a parallel run, should all MPI ranks do the same preallocator procedure? 2. In ex230.c, the difference between ex1 of ex230.c and ex2 of ex230.c is

Re: [petsc-users] parallelize matrix assembly process

2022-12-12 Thread 김성익
Following your comments, I checked by using '-info'. As you suspected, most elements are being computed on the wrong MPI rank. Also, there are a lot of stashed entries. Should I divide the domain at the problem-definition stage, or is a proper preallocation sufficient? [0] PetscCommDuplicate():

Re: [petsc-users] parallelize matrix assembly process

2022-12-12 Thread Junchao Zhang
Since you run with multiple ranks, you should use the matrix type mpiaij and MatMPIAIJSetPreallocation. If the preallocation is difficult to estimate, you can use MatPreallocator; see an example at https://gitlab.com/petsc/petsc/-/blob/main/src/mat/tests/ex230.c --Junchao Zhang
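A minimal sketch of the direct route (placeholder names: nlocal for the locally owned rows/columns, d_nnz/o_nnz for per-row nonzero counts in the diagonal and off-diagonal blocks):

  #include <petscmat.h>

  Mat A;
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetType(A, MATMPIAIJ));
  PetscCall(MatSetSizes(A, nlocal, nlocal, PETSC_DECIDE, PETSC_DECIDE));
  /* d_nnz[i] / o_nnz[i]: nonzeros of locally owned row i whose columns are
     owned by this rank / by other ranks, respectively. */
  PetscCall(MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz));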

Re: [petsc-users] parallelize matrix assembly process

2022-12-12 Thread Barry Smith
The problem is possibly due to most elements being computed on the "wrong" MPI rank, thus requiring almost all the matrix entries to be "stashed" when computed and then sent off to the owning MPI rank. Please send ALL the output of a parallel run with -info so we can see how much
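A small sketch of how one could keep insertions on the owning rank; how the elements themselves are partitioned is left to the application.

  #include <petscmat.h>

  PetscInt rstart, rend;
  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  /* This rank owns rows [rstart, rend). If the element loop is restricted to
     elements whose DOFs fall (mostly) in this range, few entries are stashed;
     anything outside the range is communicated during MatAssemblyBegin/End. */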

Re: [petsc-users] parallelize matrix assembly process

2022-12-12 Thread 김성익
Hello Mark, Following your comments, I ran with '-info' and the output is as below. [image: image.png] The global matrix seems to have been preallocated well enough. And, as I said in my earlier email, if I run this code with MPI, it takes 70,000 secs. In this case, what is the problem?

Re: [petsc-users] parallelize matrix assembly process

2022-12-12 Thread Mark Adams
Hi Hyung, First, verify that you are preallocating correctly. Run with '-info' and grep for "alloc" in the large output that you get. You will see lines like "number of mallocs in assembly: 0". You want 0. Do this with one processor and with 8. I don't understand your loop. You are iterating over
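For example, a run along these lines (the executable name is a placeholder) shows the relevant lines; repeat with -n 1 and -n 8:

  mpiexec -n 8 ./my_fem_app -info | grep -i alloc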

[petsc-users] parallelize matrix assembly process

2022-12-12 Thread 김성익
Hello, I need some keywords or examples for parallelizing the matrix assembly process. My current state is as below. - Finite element analysis code for structural mechanics. - Problem size: 3D solid hexahedral elements (number of elements: 125,000), number of degrees of freedom: 397,953 - Matrix
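For reference, a generic sketch of the element-loop assembly pattern being asked about (placeholder names: nelem_local, ndof_e for DOFs per element, dofs[] for the element's global DOF indices, Ke[] for the element stiffness matrix, all assumed to be computed elsewhere):

  #include <petscmat.h>

  /* Add each element stiffness matrix into the global matrix A. With
     ADD_VALUES, entries belonging to rows owned by other ranks are stashed
     and exchanged during MatAssemblyBegin/MatAssemblyEnd. */
  for (PetscInt e = 0; e < nelem_local; e++) {
    /* ... compute dofs[ndof_e] and Ke[ndof_e*ndof_e] for element e ... */
    PetscCall(MatSetValues(A, ndof_e, dofs, ndof_e, dofs, Ke, ADD_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));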