Re: Nonlinear term MPI running does not end

Konstantinos Poulios Sun, 23 May 2021 05:47:10 -0700

Dear Tetsuo,

You can now test the code in the mpi-fixes branch. Would you like to help
with a bit more extensive testing of MPI in GetFEM and maybe also with
making some new unit tests for MPI?


Best regards
Kostas

On Sun, May 23, 2021 at 7:29 AM Tetsuo Koyama <[email protected]> wrote:

> Dear Kostas
>
> Thanks a lot. I will check the code.
>
> BR
> Tetsuo
>
> On Sun, May 23, 2021 at 10:34 AM Konstantinos Poulios <[email protected]> wrote:
>
>> I think I have fixed it but need to test it a bit more and tidy it up.
>> BR
>> Kostas
>>
>> On Fri, May 21, 2021 at 8:42 AM Konstantinos Poulios <
>> [email protected]> wrote:
>>
>>> Oh sorry, my fault, I can reproduce the error now. I had forgotten that
>>> I had to replace the linear term with a nonlinear one.
>>>
>>> BR
>>> Kostas
>>>
>>> On Thu, May 20, 2021 at 7:42 AM Tetsuo Koyama <[email protected]>
>>> wrote:
>>>
>>>> Sorry for the lack of explanation.
>>>>
>>>> I build GetFEM on ubuntu:20.04 using the configuration options
>>>> "--with-pic --enable-paralevel=2"
>>>>
>>>> I am using...
>>>> - automake
>>>> - libtool
>>>> - make
>>>> - g++
>>>> - libqd-dev
>>>> - libqhull-dev
>>>> - libmumps-dev
>>>> - liblapack-dev
>>>> - libopenblas-dev
>>>> - libpython3-dev
>>>> - gfortran
>>>> - libmetis-dev
>>>>
>>>> I attach a Dockerfile and a Python file to reproduce the problem.
>>>> You can reproduce it with the following command.
>>>>
>>>> $ sudo docker build -t demo_parallel_laplacian_nonlinear_term.py .
>>>>
>>>> Best Regards
>>>> Tetsuo
>>>>
>>>> On Wed, May 19, 2021 at 11:12 PM Konstantinos Poulios <[email protected]> wrote:
>>>>
>>>>> I think the instructions page is correct. What distribution do you
>>>>> build getfem on and what is your configuration command?
>>>>> Best regards
>>>>> Kostas
>>>>>
>>>>> On Wed, May 19, 2021 at 10:44 AM Tetsuo Koyama <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Dear Kostas
>>>>>>
>>>>>> This page (http://getfem.org/tutorial/install.html) says
>>>>>> - Parallel MUMPS, METIS and MPI4PY packages if you want to use the
>>>>>> MPI parallelized version of GetFEM.
>>>>>>
>>>>>> Is there a recommended way to install Parallel MUMPS, METIS
>>>>>> and MPI4PY?
>>>>>> I could not find this information on the page.
>>>>>>
>>>>>> If you could give me any information I will add it to the following
>>>>>> page.
>>>>>> http://getfem.org/install/install_linux.html
>>>>>>
>>>>>> BR
>>>>>> Tetsuo
>>>>>>
>>>>>> On Wed, May 19, 2021 at 10:45 AM Tetsuo Koyama <[email protected]> wrote:
>>>>>>
>>>>>>> Dear Kostas
>>>>>>>
>>>>>>> No, I haven't. I am using libmumps-seq-dev from the Ubuntu repository.
>>>>>>> I will try again with the parallel version of MUMPS.
>>>>>>>
>>>>>>> BR
>>>>>>> Tetsuo
>>>>>>>
>>>>>>> On Wed, May 19, 2021 at 4:50 AM Konstantinos Poulios <[email protected]> wrote:
>>>>>>>
>>>>>>>> Dear Tetsuo,
>>>>>>>>
>>>>>>>> Have you compiled GetFEM with the parallel version of mumps? In
>>>>>>>> Ubuntu/Debian you must link to dmumps instead of dmumps_seq for 
>>>>>>>> example.
>>>>>>>>
>>>>>>>> BR
>>>>>>>> Kostas
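
One way to check the linkage described in this message is to inspect the `ldd` output of the compiled GetFEM shared library. The sketch below only classifies such output as text. The sample lines, the library file names and the version suffixes are made up for illustration and will differ between systems.

```python
import re

# Matches names such as libdmumps-5.2.so or libdmumps_seq-5.2.so.
# The optional "_seq" group marks the sequential MUMPS build.
MUMPS_PATTERN = re.compile(r"lib[sdcz]mumps(_seq)?[-.\w]*\.so")


def linked_mumps_variants(ldd_output):
    """Return the set of MUMPS variants ("parallel", "sequential") found in ldd text."""
    variants = set()
    for line in ldd_output.splitlines():
        match = MUMPS_PATTERN.search(line)
        if match:
            variants.add("sequential" if match.group(1) else "parallel")
    return variants


# Hypothetical ldd lines, for illustration only.
SAMPLE_SEQUENTIAL = "\tlibdmumps_seq-5.2.so => /usr/lib/x86_64-linux-gnu/libdmumps_seq-5.2.so (0x00007f0000000000)\n"
SAMPLE_PARALLEL = "\tlibdmumps-5.2.so => /usr/lib/x86_64-linux-gnu/libdmumps-5.2.so (0x00007f0000000000)\n"

print(linked_mumps_variants(SAMPLE_SEQUENTIAL))
print(linked_mumps_variants(SAMPLE_PARALLEL))
```

On a real system the text could come from `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`, where `path` points to wherever the GetFEM shared library was installed. A result containing "sequential" would suggest that the build is still linked against dmumps_seq.
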
>>>>>>>>
>>>>>>>> On Tue, May 18, 2021 at 2:09 PM Tetsuo Koyama <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Dear Kostas
>>>>>>>>>
>>>>>>>>> Thank you for your report.
>>>>>>>>> I am happy that it runs well on your system.
>>>>>>>>> I will put together a procedure that reproduces this error.
>>>>>>>>> Please wait.
>>>>>>>>>
>>>>>>>>> Best Regards Tetsuo
>>>>>>>>>
>>>>>>>>> On Tue, May 18, 2021 at 6:10 PM Konstantinos Poulios <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Dear Tetsuo,
>>>>>>>>>> I could not confirm this issue. On my system the example runs
>>>>>>>>>> well both on 1 and 2 processes (it doesn't scale well though)
>>>>>>>>>> BR
>>>>>>>>>> Kostas
>>>>>>>>>>
>>>>>>>>>> [image: image.png]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, May 16, 2021 at 10:07 AM Tetsuo Koyama <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear Kostas
>>>>>>>>>>>
>>>>>>>>>>> I am looking inside the source code.
>>>>>>>>>>> > if (generic_expressions.size()) {...}
>>>>>>>>>>> Sorry, it looks too complex for me.
>>>>>>>>>>>
>>>>>>>>>>> FYI, I found that the runs with 1 and 2 MPI processes differ at
>>>>>>>>>>> the following line.
>>>>>>>>>>> >    if (iter.finished(crit)) {
>>>>>>>>>>> This is in the "Newton_with_step_control" function in
>>>>>>>>>>> getfem_model_solvers.h.
>>>>>>>>>>>
>>>>>>>>>>> "crit" is calculated by crit = res / approx_eln, and res and
>>>>>>>>>>> approx_eln are ...
>>>>>>>>>>>
>>>>>>>>>>> $ mpirun -n 1 python demo_parallel_laplacian.py
>>>>>>>>>>> res=1.31449e-11
>>>>>>>>>>> approx_eln=6.10757
>>>>>>>>>>> crit=2.15222e-12
>>>>>>>>>>>
>>>>>>>>>>> $ mpirun -n 2 python demo_parallel_laplacian.py
>>>>>>>>>>> res=6.02926
>>>>>>>>>>> approx_eln=12.2151
>>>>>>>>>>> crit=0.493588
>>>>>>>>>>>
>>>>>>>>>>> res=0.135744
>>>>>>>>>>> approx_eln=12.2151
>>>>>>>>>>> crit=0.0111128
>>>>>>>>>>>
>>>>>>>>>>> I am now trying to understand what the correct residual value
>>>>>>>>>>> of the Newton(-Raphson) algorithm is.
>>>>>>>>>>> I would be glad to hear your opinion.
>>>>>>>>>>>
>>>>>>>>>>> Best Regards Tetsuo
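
The criterion quoted in this message, crit = res / approx_eln, can be recomputed from the printed values as a quick sanity check. This is only a sketch: the tolerance below is a hypothetical value chosen for illustration, since the one used by the demo is not stated in this thread, and the two sets of values printed for the 2-process run are simply taken in the order they were reported.

```python
import math

# (res, approx_eln) pairs copied from the output reported in the message above.
RUNS = {
    "mpirun -n 1": [(1.31449e-11, 6.10757)],
    "mpirun -n 2": [(6.02926, 12.2151), (0.135744, 12.2151)],
}

# Hypothetical tolerance, for illustration only.
TOL = 1e-6


def criterion(res, approx_eln):
    """Relative residual as described in the message: crit = res / approx_eln."""
    return res / approx_eln


def evaluate(runs, tol):
    """Return {label: [(crit, crit < tol), ...]} for every reported pair."""
    results = {}
    for label, pairs in runs.items():
        results[label] = [
            (criterion(res, eln), criterion(res, eln) < tol) for res, eln in pairs
        ]
    return results


RESULTS = evaluate(RUNS, TOL)
for label, entries in RESULTS.items():
    for crit, below_tol in entries:
        print(f"{label}: crit={crit:.6g} below_tol={below_tol}")

# Ratio of approx_eln between the 2-process and the 1-process run.
ELN_RATIO = RUNS["mpirun -n 2"][0][1] / RUNS["mpirun -n 1"][0][1]
print(f"approx_eln ratio (n=2 / n=1): {ELN_RATIO:.4f}")
```

The recomputed values agree with the printed crit values to the reported precision. The script also prints the ratio between the approx_eln values of the two runs, which may be a useful thing to compare when looking at how the norms are accumulated across processes.
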
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 11, 2021 at 7:28 PM Tetsuo Koyama <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Dear Kostas
>>>>>>>>>>>>
>>>>>>>>>>>> > The relevant code is in the void model::assembly function in
>>>>>>>>>>>> > getfem_models.cc. The relevant code assembling the term you add
>>>>>>>>>>>> > with md.add_nonlinear_term(..) must be executed inside the if
>>>>>>>>>>>> > condition
>>>>>>>>>>>> >
>>>>>>>>>>>> > if (generic_expressions.size()) {...}
>>>>>>>>>>>> >
>>>>>>>>>>>> > You can have a look there and ask for further help if it looks
>>>>>>>>>>>> > too complex. You should also check if the test works when you run
>>>>>>>>>>>> > it with md.add_nonlinear_term but setting the number of MPI
>>>>>>>>>>>> > processes to one.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks. I will check it. The following command completed
>>>>>>>>>>>> successfully.
>>>>>>>>>>>>
>>>>>>>>>>>> $ mpirun -n 1 python demo_parallel_laplacian.py
>>>>>>>>>>>>
>>>>>>>>>>>> So all we have to do is compare -n 1 with -n 2.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards Tetsuo
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, May 11, 2021 at 6:44 PM Konstantinos Poulios <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Dear Tetsuo,
>>>>>>>>>>>>>
>>>>>>>>>>>>> The relevant code is in the void model::assembly function in
>>>>>>>>>>>>> getfem_models.cc. The relevant code assembling the term you add
>>>>>>>>>>>>> with md.add_nonlinear_term(..) must be executed inside the if
>>>>>>>>>>>>> condition
>>>>>>>>>>>>>
>>>>>>>>>>>>> if (generic_expressions.size()) {...}
>>>>>>>>>>>>>
>>>>>>>>>>>>> You can have a look there and ask for further help if it looks
>>>>>>>>>>>>> too complex. You should also check if the test works when you run
>>>>>>>>>>>>> it with md.add_nonlinear_term but setting the number of MPI
>>>>>>>>>>>>> processes to one.
>>>>>>>>>>>>>
>>>>>>>>>>>>> BR
>>>>>>>>>>>>> Kostas
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, May 11, 2021 at 10:44 AM Tetsuo Koyama <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dear Kostas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for your reply.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> > Interesting. In order to isolate the issue, can you also
>>>>>>>>>>>>>> check with
>>>>>>>>>>>>>> > md.add_linear_term(..)
>>>>>>>>>>>>>> > ?
>>>>>>>>>>>>>> The run ends when using md.add_linear_term(..).
>>>>>>>>>>>>>> It seems to be a problem with md.add_nonlinear_term(..).
>>>>>>>>>>>>>> Is there anything I can check?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards Tetsuo.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, May 11, 2021 at 5:19 PM Konstantinos Poulios <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear Tetsuo,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Interesting. In order to isolate the issue, can you also
>>>>>>>>>>>>>>> check with
>>>>>>>>>>>>>>> md.add_linear_term(..)
>>>>>>>>>>>>>>> ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>> Kostas
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, May 11, 2021 at 12:22 AM Tetsuo Koyama <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Dear GetFEM community
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am running the MPI parallelization of GetFEM. The commands
>>>>>>>>>>>>>>>> I run are
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> $ git clone https://git.savannah.nongnu.org/git/getfem.git
>>>>>>>>>>>>>>>> $ cd getfem
>>>>>>>>>>>>>>>> $ bash autogen.sh
>>>>>>>>>>>>>>>> $ ./configure --with-pic --enable-paralevel=2
>>>>>>>>>>>>>>>> $ make
>>>>>>>>>>>>>>>> $ make install
>>>>>>>>>>>>>>>> $ mpirun -n 2 python demo_parallel_laplacian.py
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The Python script ends correctly. But when I changed the
>>>>>>>>>>>>>>>> following linear term to a nonlinear term, the script did not end.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -md.add_Laplacian_brick(mim, 'u')
>>>>>>>>>>>>>>>> +md.add_nonlinear_term(mim, "Grad_u.Grad_Test_u")
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Do you know the reason?
>>>>>>>>>>>>>>>> Best regards Tetsuo
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
