Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-05-29 Thread Myriam Peyrounette via petsc-users
Oh sorry, I missed that. That's great!

Thanks,

Myriam


On 05/29/19 at 16:55, Zhang, Hong wrote:
> Myriam:
> This branch is merged to master.
> Thanks for your work and patience. It helps us a lot. The graphs are
> very nice :-)
>
> We plan to re-organise the APIs of the mat-mat operations to make them
> easier for users.
> Hong
>
> Hi,
>
> Do you have any idea when Barry's fix
> 
> (https://bitbucket.org/petsc/petsc/pull-requests/1606/change-handling-of-matptap_mpiaij_mpimaij/diff)
> will be released? I can see it has been merged to the "next"
> branch. Does it mean it will be soon available on master?
>
> For your information, I also plotted a summary of the scalings of
> interest (memory and time):
> - using petsc-3.10.2 (ref "bad" scaling)
> - using petsc-3.6.4 (ref "good" scaling)
> - using commit d330a26 + Barry's fix and different algorithms
> (none, scalable, allatonce, allatonce_merged)
>
> Best regards,
>
> Myriam
>
>
> On 05/13/19 at 17:20, Fande Kong wrote:
>> Hi Myriam,
>>
>> Thanks for your report back.
>>
>> On Mon, May 13, 2019 at 2:01 AM Myriam Peyrounette wrote:
>>
>> Hi all,
>>
>> I tried with 3.11.1 version and Barry's fix. The good scaling
>> is back!
>> See the green curve in the plot attached. It is even better
>> than PETSc
>> 3.6! And it runs faster (10-15s instead of 200-300s with 3.6).
>>
>>
>> We are glad your issue was resolved here. 
>>  
>>
>>
>> So you were right. It seems that not all the PtAPs used the
>> scalable
>> version.
>>
>> I was a bit confused about the options to set... I used the
>> options:
>> -matptap_via scalable and -mat_freeintermediatedatastructures
>> 1. Do you
>> think it would be even better with allatonce?
>>
>>
>> "scalable" and "allatonce" correspond to different algorithms
>> respectively. ``allatonce" should be using less memory than
>> "scalable". The "allatonce" algorithm  would be a good
>> alternative if your application is memory sensitive and the
>> problem size is large. 
>> We are definitely curious about the memory usage of ``allatonce"
>> in your test cases but don't feel obligated to do these tests
>> since your concern were resolved now. In case you are also
>> interested in how our new algorithms perform, I post petsc
>> options here that are used to 
>> choose these algorithms:
>>
>> algorithm 1: "allatonce"
>>
>> -matptap_via allatonce
>> -mat_freeintermediatedatastructures 1
>>
>> algorithm 2: "allatonce_merged"
>>
>> -matptap_via allatonce_merged
>> -mat_freeintermediatedatastructures 1
>>
>>
>> Again, thanks for your report, which helps us improve PETSc.
>>
>> Fande,
>>  
>>
>>
>> It is unfortunate that this fix can't be merged with the
>> master branch.
>> But the patch works well and I can consider the issue as
>> solved now.
>>
>> Thanks a lot for your time!
>>
>> Myriam
>>
>>
>> On 05/04/19 at 06:54, Smith, Barry F. wrote:
>> >    Hmm, I had already fixed this, I think,
>> >
>> >   
>> 
>> https://bitbucket.org/petsc/petsc/pull-requests/1606/change-handling-of-matptap_mpiaij_mpimaij/diff
>> >
>> >    but unfortunately our backlog of pull requests kept it
>> out of master. We (well, Satish and Jed) are working on a new
>> CI infrastructure that will hopefully be more stable than the
>> current one.
>> >
>> >    Fande,
>> >       Sorry you had to spend time on this.
>> >
>> >
>> >    Barry
>> >
>> >
>> >
>> >> On May 3, 2019, at 11:20 PM, Fande Kong via petsc-users <petsc-users@mcs.anl.gov> wrote:
>> >>
>> >> Hi Myriam,
>> >>
>> >> I ran the example you attached earlier with "-mx 48 -my 48
>> -mz 48 -levels 3 -ksp_view -matptap_via allatonce -log_view".
>> >>
>> >> There are six PtAPs. Two of them are still using the
>> nonscalable version of the algorithm (which might explain why
>> the memory still grows exponentially) even though we have
>> asked PETSc to use the "allatonce" algorithm. This happens
>> because MATMAIJ does not honor the PETSc option; instead, it
>> uses the default setting of MPIAIJ. I have a fix at
>> https://bitbucket.org/petsc/petsc/pull-requests/1623/choose-algorithms-in/diff.
>> The PR should fix the issue.
>> >>
>> >> Thanks again for your report,
>> >>
>> >> Fande,
>> >>
>> >> 
>>
>> -- 
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
>

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-05-29 Thread Zhang, Hong via petsc-users
Myriam:
This branch is merged to master.
Thanks for your work and patience. It helps us a lot. The graphs are very nice 
:-)

We plan to re-organise the APIs of the mat-mat operations to make them easier for users.
Hong

Hi,

Do you have any idea when Barry's fix 
(https://bitbucket.org/petsc/petsc/pull-requests/1606/change-handling-of-matptap_mpiaij_mpimaij/diff)
 will be released? I can see it has been merged to the "next" branch. Does it 
mean it will be soon available on master?

For your information, I also plotted a summary of the scalings of interest (memory
and time):
- using petsc-3.10.2 (ref "bad" scaling)
- using petsc-3.6.4 (ref "good" scaling)
- using commit d330a26 + Barry's fix and different algorithms (none, scalable, 
allatonce, allatonce_merged)

Best regards,

Myriam

On 05/13/19 at 17:20, Fande Kong wrote:
Hi Myriam,

Thanks for your report back.

On Mon, May 13, 2019 at 2:01 AM Myriam Peyrounette <myriam.peyroune...@idris.fr> wrote:
Hi all,

I tried with 3.11.1 version and Barry's fix. The good scaling is back!
See the green curve in the plot attached. It is even better than PETSc
3.6! And it runs faster (10-15s instead of 200-300s with 3.6).

We are glad your issue was resolved here.


So you were right. It seems that not all the PtAPs used the scalable
version.

I was a bit confused about the options to set... I used the options:
-matptap_via scalable and -mat_freeintermediatedatastructures 1. Do you
think it would be even better with allatonce?

"scalable" and "allatonce" correspond to different algorithms respectively. 
``allatonce" should be using less memory than "scalable". The "allatonce" 
algorithm  would be a good alternative if your application is memory sensitive 
and the problem size is large.
We are definitely curious about the memory usage of ``allatonce" in your test 
cases but don't feel obligated to do these tests since your concern were 
resolved now. In case you are also interested in how our new algorithms 
perform, I post petsc options here that are used to
choose these algorithms:

algorithm 1: "allatonce"

-matptap_via allatonce
-mat_freeintermediatedatastructures 1

algorithm 2: "allatonce_merged"

-matptap_via allatonce_merged
-mat_freeintermediatedatastructures 1
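
For reference, a minimal sketch of selecting one of these algorithms from C code
rather than the command line (assuming a PETSc 3.11-era build; PetscOptionsSetValue,
KSPSetFromOptions and the other calls are standard PETSc routines, while the solver
setup shown is only a placeholder, not code from this thread):

  #include <petscksp.h>

  int main(int argc, char **argv)
  {
    KSP ksp;

    PetscInitialize(&argc, &argv, NULL, NULL);
    /* Equivalent to passing the options on the command line */
    PetscOptionsSetValue(NULL, "-matptap_via", "allatonce");
    PetscOptionsSetValue(NULL, "-mat_freeintermediatedatastructures", "1");

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    /* ... KSPSetOperators(ksp, A, A); with an assembled MPIAIJ matrix A ... */
    KSPSetFromOptions(ksp);   /* the options above are read here */
    /* ... KSPSolve(ksp, b, x); ... */
    KSPDestroy(&ksp);
    PetscFinalize();
    return 0;
  }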


Again, thanks for your report, which helps us improve PETSc.

Fande,


It is unfortunate that this fix can't be merged with the master branch.
But the patch works well and I can consider the issue as solved now.

Thanks a lot for your time!

Myriam


On 05/04/19 at 06:54, Smith, Barry F. wrote:
>Hmm, I had already fixed this, I think,
>
>
> https://bitbucket.org/petsc/petsc/pull-requests/1606/change-handling-of-matptap_mpiaij_mpimaij/diff
>
>    but unfortunately our backlog of pull requests kept it out of master. We
> (well, Satish and Jed) are working on a new CI infrastructure that will
> hopefully be more stable than the current one.
>
>Fande,
>   Sorry you had to spend time on this.
>
>
>Barry
>
>
>
>> On May 3, 2019, at 11:20 PM, Fande Kong via petsc-users <petsc-users@mcs.anl.gov> wrote:
>>
>> Hi Myriam,
>>
>> I ran the example you attached earlier with "-mx 48 -my 48 -mz 48 -levels 3
>> -ksp_view -matptap_via allatonce -log_view".
>>
>> There are six PtAPs. Two of them are still using the nonscalable version of
>> the algorithm (which might explain why the memory still grows exponentially)
>> even though we have asked PETSc to use the "allatonce" algorithm. This happens
>> because MATMAIJ does not honor the PETSc option; instead, it uses the default
>> setting of MPIAIJ. I have a fix at
>> https://bitbucket.org/petsc/petsc/pull-requests/1623/choose-algorithms-in/diff.
>> The PR should fix the issue.
>>
>> Thanks again for your report,
>>
>> Fande,
>>
>>

--
Myriam Peyrounette
CNRS/IDRIS - HLST
--




--
Myriam Peyrounette
CNRS/IDRIS - HLST
--



Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-05-03 Thread Smith, Barry F. via petsc-users


   Hmm, I had already fixed this, I think, 

   
https://bitbucket.org/petsc/petsc/pull-requests/1606/change-handling-of-matptap_mpiaij_mpimaij/diff

   but unfortunately our backlog of pull requests kept it out of master. We
(well, Satish and Jed) are working on a new CI infrastructure that will
hopefully be more stable than the current one.

   Fande,
  Sorry you had to spend time on this. 


   Barry



> On May 3, 2019, at 11:20 PM, Fande Kong via petsc-users wrote:
> 
> Hi Myriam,
> 
> I ran the example you attached earlier with "-mx 48 -my 48 -mz 48 -levels 3
> -ksp_view -matptap_via allatonce -log_view".
> 
> There are six PtAPs. Two of them are still using the nonscalable version of
> the algorithm (which might explain why the memory still grows exponentially)
> even though we have asked PETSc to use the "allatonce" algorithm. This happens
> because MATMAIJ does not honor the PETSc option; instead, it uses the default
> setting of MPIAIJ. I have a fix at
> https://bitbucket.org/petsc/petsc/pull-requests/1623/choose-algorithms-in/diff.
> The PR should fix the issue.
> 
> Thanks again for your report,
> 
> Fande,
> 
>  



Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-05-03 Thread Zhang, Hong via petsc-users
Myriam:
Very interesting results. Do you have timings for petsc-3.10 (blue) and 3.6
(green)?
I do not understand why all algorithms give non-scalable memory performance
except petsc-3.6. We can easily bring back petsc-3.6's MatPtAP though.
Hong

And the attached files... Sorry

On 05/03/19 at 16:11, Myriam Peyrounette wrote:

Hi,

I plotted new scalings (memory and time) using the new algorithms. I used the
option -options_left true to make sure that the options are actually used.
They are.

I don't have access to the platform I used to run my computations on, so I ran 
them on a different one. In particular, I can't reach problem size = 1e8 and 
the values might be different from the previous scalings I sent you. But the 
comparison of the PETSc versions and options is still relevant.

I plotted the scalings of reference: the "good" one (PETSc 3.6.4) in green, the 
"bad" one (PETSc 3.10.2) in blue.

I used the commit d330a26 (3.11.1) for all the other scalings, adding different 
sets of options:

Light blue -> -matptap_via allatonce  -mat_freeintermediatedatastructures 1
Orange -> -matptap_via allatonce_merged -mat_freeintermediatedatastructures 1
Purple -> -matptap_via allatonce  -mat_freeintermediatedatastructures 1 
-inner_diag_matmatmult_via scalable -inner_offdiag_matmatmult_via scalable
Yellow: -matptap_via allatonce_merged -mat_freeintermediatedatastructures 1 
-inner_diag_matmatmult_via scalable -inner_offdiag_matmatmult_via scalable

Conclusion: with regard to memory, the two algorithms imply a similarly good 
improvement of the scaling. The use of the -inner_(off)diag_matmatmult_via 
options is also very interesting. The scaling is still not as good as 3.6.4 
though.
With regard to time, I noted a real improvement in execution time! I used to
spend 200-300s on these executions. Now they take 10-15s. Besides that, the
"_merged" versions are more efficient. And the -inner_(off)diag_matmatmult_via
options are slightly more expensive, but not critically so.

What do you think? Is it possible to match the scaling of PETSc 3.6.4 again? Is
it worth investigating further?

Myriam


On 04/30/19 at 17:00, Fande Kong wrote:
Hi Myriam,

We are interested in how the new algorithms perform. There are two new
algorithms you could try.

Algorithm 1:

-matptap_via allatonce  -mat_freeintermediatedatastructures 1

Algorithm 2:

-matptap_via allatonce_merged -mat_freeintermediatedatastructures 1


Note that you need to use the current petsc-master, and also please put
"-snes_view" in your script so that we can confirm these options actually
get set.

Thanks,

Fande,


On Tue, Apr 30, 2019 at 2:26 AM Myriam Peyrounette via petsc-users <petsc-users@mcs.anl.gov> wrote:

Hi,

that's really good news for us, thanks! I will plot again the memory scaling 
using these new options and let you know. Next week I hope.

Before that, I just need to clarify the situation. Throughout our discussions,
we mentioned a number of options concerning the scalability:

-matptap_via scalable
-inner_diag_matmatmult_via scalable
-inner_offdiag_matmatmult_via scalable
-mat_freeintermediatedatastructures
-matptap_via allatonce
-matptap_via allatonce_merged

Which of them are compatible? Should I use all of them at the same time?
Is there redundancy?

Thanks,

Myriam

On 04/25/19 at 21:47, Zhang, Hong wrote:
Myriam:
Checking MatPtAP() in petsc-3.6.4, I realized that it uses a different algorithm
than petsc-3.10 and later versions. petsc-3.6 uses an outer product for
C = P^T * A * P, while petsc-3.10 uses a local transpose of P. petsc-3.10
accelerates data access, but doubles the memory of P.

Fande added two new implementations of MatPtAP() to petsc-master which use much
less memory and scale better, at slightly higher computing time (still faster
than hypre). You may use these new implementations if you are concerned about
memory scalability. The options for these new implementations are:
-matptap_via allatonce
-matptap_via allatonce_merged

Hong
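
As an aside on the operation being discussed here: the product in question is the
triple matrix product C = P^T A P. A minimal sketch of the call whose internal
algorithm the -matptap_via option selects (MatPtAP is a standard PETSc routine;
the matrices A and P are placeholders assumed to be assembled MPIAIJ matrices):

  Mat A, P, C;   /* assembled elsewhere: A is the level operator, P the prolongator */

  /* First pass: create C = P^T * A * P. The algorithm used internally is the
     one chosen by -matptap_via {scalable, allatonce, allatonce_merged, ...}. */
  MatPtAP(A, P, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);

  /* Later passes (e.g. after A changes) can reuse C's structure: */
  MatPtAP(A, P, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);

  MatDestroy(&C);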

On Mon, Apr 15, 2019 at 12:10 PM hzh...@mcs.anl.gov wrote:
Myriam:
Thank you very much for providing these results!
I have put effort into accelerating execution time and avoiding global sizes in
PtAP, for which the algorithm that transposes P_local and P_other likely
doubles the memory usage. I'll try to investigate why it becomes unscalable.
Hong

Hi,

you'll find the new scaling attached (green line). I used version 3.11 and
the four scalability options:
-matptap_via scalable
-inner_diag_matmatmult_via scalable
-inner_offdiag_matmatmult_via scalable
-mat_freeintermediatedatastructures

The scaling is much better! The code even uses less memory for the smallest 
cases. There is still an increase for the larger one.

With regard to the time scaling, I used KSPView and LogView on the two previous 
scalings (blue and yellow lines) but not on the last one (green line). So we 
can't really compare them, am I right? 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-05-03 Thread Myriam Peyrounette via petsc-users
Hi,

I plotted new scalings (memory and time) using the new algorithms. I
used the option -options_left true to make sure that the options are
actually used. They are.

I don't have access to the platform I used to run my computations on, so
I ran them on a different one. In particular, I can't reach problem size
= 1e8 and the values might be different from the previous scalings I
sent you. But the comparison of the PETSc versions and options is still
relevant.

I plotted the scalings of reference: the "good" one (PETSc 3.6.4) in
green, the "bad" one (PETSc 3.10.2) in blue.

I used the commit d330a26 (3.11.1) for all the other scalings, adding
different sets of options:

Light blue -> -matptap_via allatonce -mat_freeintermediatedatastructures 1
Orange -> -matptap_via allatonce_merged -mat_freeintermediatedatastructures 1
Purple -> -matptap_via allatonce -mat_freeintermediatedatastructures 1
          -inner_diag_matmatmult_via scalable -inner_offdiag_matmatmult_via scalable
Yellow -> -matptap_via allatonce_merged -mat_freeintermediatedatastructures 1
          -inner_diag_matmatmult_via scalable -inner_offdiag_matmatmult_via scalable

Conclusion: with regard to memory, the two algorithms imply a similarly
good improvement of the scaling. The use of the
-inner_(off)diag_matmatmult_via options is also very interesting. The
scaling is still not as good as 3.6.4 though.
With regard to time, I noted a real improvement in execution time! I
used to spend 200-300s on these executions. Now they take 10-15s. Besides
that, the "_merged" versions are more efficient. And the
-inner_(off)diag_matmatmult_via options are slightly more expensive, but
not critically so.

What do you think? Is it possible to match the scaling of PETSc
3.6.4 again? Is it worth investigating further?

Myriam


On 04/30/19 at 17:00, Fande Kong wrote:
> Hi Myriam,
>
> We are interested in how the new algorithms perform. There are two
> new algorithms you could try.
>
> Algorithm 1:
>
> -matptap_via allatonce  -mat_freeintermediatedatastructures 1
>
> Algorithm 2:
>
> -matptap_via allatonce_merged -mat_freeintermediatedatastructures 1
>
>
> Note that you need to use the current petsc-master, and also please
> put "-snes_view" in your script so that we can confirm these options
> actually get set.
>
> Thanks,
>
> Fande,
>
>
> On Tue, Apr 30, 2019 at 2:26 AM Myriam Peyrounette via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> Hi,
>
> that's really good news for us, thanks! I will plot again the
> memory scaling using these new options and let you know. Next week
> I hope.
>
> Before that, I just need to clarify the situation. Throughout our
> discussions, we mentioned a number of options concerning the
> scalability:
>
> -matptap_via scalable
> -inner_diag_matmatmult_via scalable
> -inner_offdiag_matmatmult_via scalable
> -mat_freeintermediatedatastructures
> -matptap_via allatonce
> -matptap_via allatonce_merged
>
> Which of them are compatible? Should I use all of them at the
> same time? Is there redundancy?
>
> Thanks,
>
> Myriam
>
>
> On 04/25/19 at 21:47, Zhang, Hong wrote:
>> Myriam:
>> Checking MatPtAP() in petsc-3.6.4, I realized that it uses a
>> different algorithm than petsc-3.10 and later versions. petsc-3.6
>> uses an outer product for C = P^T * A * P, while petsc-3.10 uses a
>> local transpose of P. petsc-3.10 accelerates data access, but
>> doubles the memory of P.
>>
>> Fande added two new implementations of MatPtAP() to petsc-master
>> which use much less memory and scale better, at slightly higher
>> computing time (still faster than hypre). You may use these new
>> implementations if you are concerned about memory scalability. The
>> options for these new implementations are:
>> -matptap_via allatonce
>> -matptap_via allatonce_merged
>>
>> Hong
>>
>> On Mon, Apr 15, 2019 at 12:10 PM hzh...@mcs.anl.gov wrote:
>>
>> Myriam:
>> Thank you very much for providing these results!
>> I have put effort into accelerating execution time and avoiding
>> global sizes in PtAP, for which the algorithm that
>> transposes P_local and P_other likely doubles the memory
>> usage. I'll try to investigate why it becomes unscalable.
>> Hong
>>
>> Hi,
>>
>> you'll find the new scaling attached (green line). I used
>> the version 3.11 and the four scalability options :
>> -matptap_via scalable
>> -inner_diag_matmatmult_via scalable
>> -inner_offdiag_matmatmult_via scalable
>> -mat_freeintermediatedatastructures
>>
>> The scaling is much better! The code even uses less
>> memory for the smallest cases. There is still an increase
>> for the larger one.
>>
>> With 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-30 Thread Mark Adams via petsc-users
On Tue, Mar 5, 2019 at 8:06 AM Matthew Knepley wrote:

> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette <
> myriam.peyroune...@idris.fr> wrote:
>
>> Hi Matt,
>>
>> I plotted the memory scalings using different threshold values. The two
>> scalings are slightly shifted (by -22 to -88 MB) but this gain is
>> negligible. The 3.6 scaling remains robust while the 3.10 scaling
>> deteriorates.
>>
>> Do you have any other suggestion?
>>
>> Mark, what is the option she can give to output all the GAMG data?
>

I think we did this and it was fine, or am I getting threads mixed up?

Use -info and grep on GAMG. This will print out the average nnz/row on each
level, which is a way of seeing if the coarse grids are getting out of
control. But coarse grids are smaller, so they should not be a big deal.


>
> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so
> it should be easy to see
> if the coarse grid sizes are increasing, and also what the effect of the
> threshold value is.
>

The coarse grid "sizes" will go way down, that is what MG does unless
something is very wrong. The nnz/row will go up in many cases. If ksp_view
prints out the nnz for the operator on each level then you can compute the
average nnz/row (-info just does that for you).
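
If you want to compute it yourself, a minimal sketch (MatGetInfo and MatGetSize
are standard PETSc calls; A here stands for the operator on one level and is not
taken from this thread):

  Mat      A;      /* operator on the level of interest, assembled elsewhere */
  MatInfo  info;
  PetscInt m, n;

  MatGetSize(A, &m, &n);
  MatGetInfo(A, MAT_GLOBAL_SUM, &info);        /* sums over all processes */
  PetscPrintf(PETSC_COMM_WORLD, "nnz %g, average nnz/row %g\n",
              info.nz_used, info.nz_used / (PetscReal)m);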

The only change that I can think of in GAMG wrt coarsening was the
treatment of the square graph parameters. It used to be a bool (square
first level or not). Now it is an integer q (square the first q levels).
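
For reference (from memory of this PETSc era, so worth checking the PCGAMG manual
page for the exact version you run): the corresponding runtime option is
-pc_gamg_square_graph <q>, which replaced the old boolean, e.g.

  -pc_gamg_square_graph 1

to square the graph on the first level only.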

If you suspect GAMG, can you test with Hypre?


>
>   Thanks,
>
>  Matt
>
>> Thanks
>> Myriam
>>
>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>
>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Hi,
>>>
>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version
>>> to 3.10, this code has a bad memory scaling.
>>>
>>> To report this issue, I took the PETSc script ex42.c and slightly
>>> modified it so that the KSP and PC configurations are the same as in my
>>> code. In particular, I use a "personalised" multi-grid method. The
>>> modifications are indicated by the keyword "TopBridge" in the attached
>>> scripts.
>>>
>>> To plot the memory (weak) scaling, I ran four calculations for each
>>> script with increasing problem sizes and numbers of cores:
>>>
>>> 1. 100,000 elts on 4 cores
>>> 2. 1 million elts on 40 cores
>>> 3. 10 million elts on 400 cores
>>> 4. 100 million elts on 4,000 cores
>>>
>>> The resulting graph is also attached. The scaling using PETSc 3.10
>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is
>>> robust.
>>>
>>> After a few tests, I found that the scaling is mostly sensitive to the
>>> use of the AMG method for the coarse grid (line 1780 in
>>> main_ex42_petsc36.cc). In particular, the performance strongly
>>> deteriorates when commenting lines 1777 to 1790 (in
>>> main_ex42_petsc36.cc).
>>>
>>> Do you have any idea of what changed between version 3.6 and version
>>> 3.10 that may imply such degradation?
>>>
>>
>> I believe the default values for PCGAMG changed between versions. It
>> sounds like the coarsening rate
>> is not great enough, so that these grids are too large. This can be set
>> using:
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>
>> There is some explanation of this effect on that page. Let us know if
>> setting this does not correct the situation.
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Let me know if you need further information.
>>>
>>> Best,
>>>
>>> Myriam Peyrounette
>>>
>>>
>>> --
>>> Myriam Peyrounette
>>> CNRS/IDRIS - HLST
>>> --
>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-30 Thread Myriam Peyrounette via petsc-users
Hi,

that's really good news for us, thanks! I will plot again the memory
scaling using these new options and let you know. Next week I hope.

Before that, I just need to clarify the situation. Throughout our
discussions, we mentioned a number of options concerning the scalability:

-matptap_via scalable
-inner_diag_matmatmult_via scalable
-inner_offdiag_matmatmult_via scalable
-mat_freeintermediatedatastructures
-matptap_via allatonce
-matptap_via allatonce_merged

Which of them are compatible? Should I use all of them at the same
time? Is there redundancy?

Thanks,

Myriam


On 04/25/19 at 21:47, Zhang, Hong wrote:
> Myriam:
> Checking MatPtAP() in petsc-3.6.4, I realized that it uses a different
> algorithm than petsc-3.10 and later versions. petsc-3.6 uses an outer product
> for C = P^T * A * P, while petsc-3.10 uses a local transpose of P. petsc-3.10
> accelerates data access, but doubles the memory of P.
>
> Fande added two new implementations of MatPtAP() to petsc-master
> which use much less memory and scale better, at slightly higher
> computing time (still faster than hypre). You may use these new
> implementations if you are concerned about memory scalability. The options
> for these new implementations are:
> -matptap_via allatonce
> -matptap_via allatonce_merged
>
> Hong
>
> On Mon, Apr 15, 2019 at 12:10 PM hzh...@mcs.anl.gov wrote:
>
> Myriam:
> Thank you very much for providing these results!
> I have put effort into accelerating execution time and avoiding
> global sizes in PtAP, for which the algorithm that transposes
> P_local and P_other likely doubles the memory usage. I'll try to
> investigate why it becomes unscalable.
> Hong
>
> Hi,
>
> you'll find the new scaling attached (green line). I used the
> version 3.11 and the four scalability options :
> -matptap_via scalable
> -inner_diag_matmatmult_via scalable
> -inner_offdiag_matmatmult_via scalable
> -mat_freeintermediatedatastructures
>
> The scaling is much better! The code even uses less memory for
> the smallest cases. There is still an increase for the larger
> one.
>
> With regard to the time scaling, I used KSPView and LogView on
> the two previous scalings (blue and yellow lines) but not on
> the last one (green line). So we can't really compare them, am
> I right? However, we can see that the new time scaling looks
> quite good. It slightly increases from ~8s to ~27s.
>
> Unfortunately, the computations are expensive so I would like
> to avoid re-running them if possible. How relevant would a
> proper time scaling be for you?
>
> Myriam
>
>
> On 04/12/19 at 18:18, Zhang, Hong wrote:
>> Myriam :
>> Thanks for your effort. It will help us improve PETSc.
>> Hong
>>
>> Hi all,
>>
>> I used the wrong script, that's why it diverged... Sorry
>> about that. 
>> I tried again with the right script applied on a tiny
>> problem (~200
>> elements). I can see a small difference in memory usage
>> (gain ~1 MB)
>> when adding the -mat_freeintermediatestructures option. I
>> still have to
>> execute larger cases to plot the scaling. The
>> supercomputer I am used to
>> run my jobs on is really busy at the moment so it takes a
>> while. I hope
>> I'll send you the results on Monday.
>>
>> Thanks everyone,
>>
>> Myriam
>>
>>
>> On 04/11/19 at 06:01, Jed Brown wrote:
>> > "Zhang, Hong" writes:
>> >
>> >> Jed:
>>  Myriam,
>>  Thanks for the plot.
>> '-mat_freeintermediatedatastructures' should not affect
>> solution. It releases almost half of memory in C=PtAP if
>> C is not reused.
>> >>> And yet if turning it on causes divergence, that
>> would imply a bug.
>> >>> Hong, are you able to reproduce the experiment to see
>> the memory
>> >>> scaling?
>> >> I'd like to test her code using an ALCF machine, but my
>> hands are full now. I'll try it as soon as I find time,
>> hopefully next week.
>> > I have now compiled and run her code locally.
>> >
>> > Myriam, thanks for your last mail adding configuration
>> and removing the
>> > MemManager.h dependency.  I ran with and without
>> > -mat_freeintermediatedatastructures and don't see a
>> difference in
>> > convergence.  What commands did you run to observe that
>> difference?
>>
>> 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-26 Thread Mark Adams via petsc-users
On Mon, Mar 11, 2019 at 6:32 AM Myriam Peyrounette <
myriam.peyroune...@idris.fr> wrote:

> Hi,
>
> good point, I changed the 3.10 version so that it is configured with
> --with-debugging=0. You'll find attached the output of the new LogView. The
> execution time is reduced (although still not as good as 3.6) but I can't
> see any improvement with regard to memory.
>
> You'll also find attached the grep GAMG on -info outputs for both
> versions. There are slight differences in grid dimensions or nnz values,
> but are they significant?
>

No. And the GAMG runs seem to be on a 13x13x13 grid.

The nnz goes down a lot here, which is odd, but the problem is so small
that we are probably just seeing boundary effects.


> Thanks,
>
> Myriam
>
>
>
> On 03/08/19 at 23:23, Mark Adams wrote:
>
> Just seeing this now. It is hard to imagine how bad GAMG could be on a
> coarse grid, but you can run with -info and grep on GAMG and send that. You
> will see a listing of levels, number of equations and number of non-zeros
> (nnz). You can send that and I can get some sense of whether GAMG is going nuts.
>
> Mark
>
> On Fri, Mar 8, 2019 at 11:56 AM Jed Brown via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> It may not address the memory issue, but can you build 3.10 with the
>> same options you used for 3.6?  It is currently a debugging build:
>>
>>   ##
>>   ##
>>   #   WARNING!!!   #
>>   ##
>>   #   This code was compiled with a debugging option.  #
>>   #   To get timing results run ./configure#
>>   #   using --with-debugging=no, the performance will  #
>>   #   be generally two or three times faster.  #
>>   ##
>>   ##
>>
>>
> --
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-26 Thread Mark Adams via petsc-users
Increasing the threshold should increase the size of the coarse grids, but
yours are decreasing. I'm puzzled by that.
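
For reference, the knob being discussed is the GAMG drop threshold; a minimal
sketch of changing it (the option name and PCGAMGSetThreshold are standard PETSc,
but the signature has changed across versions, and 0.05 is only an illustrative
value, not a recommendation from this thread):

  On the command line:   -pc_gamg_threshold 0.05

  Or from code, before KSPSetUp()/PCSetUp(), with the 3.10-era signature:

  PC        pc;
  PetscReal th[1] = {0.05};         /* array of per-level values; one value here */
  KSPGetPC(ksp, &pc);               /* ksp is a solver already configured with PCGAMG */
  PCGAMGSetThreshold(pc, th, 1);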

On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette <
myriam.peyroune...@idris.fr> wrote:

> I used PCView to display the size of the linear system in each level of
> the MG. You'll find the outputs attached to this mail (zip file) for both
> the default threshold value and a value of 0.1, and for both 3.6 and 3.10
> PETSc versions.
>
> For convenience, I summarized the information in a graph, also attached
> (png file).
>
> As you can see, there are slight differences between the two versions but
> none is critical, in my opinion. Do you see anything suspicious in the
> outputs?
>
> + I can't find the default threshold value. Do you know where I can find
> it?
>
> Thanks for the follow-up
>
> Myriam
>
> On 03/05/19 at 14:06, Matthew Knepley wrote:
>
> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette <
> myriam.peyroune...@idris.fr> wrote:
>
>> Hi Matt,
>>
>> I plotted the memory scalings using different threshold values. The two
>> scalings are slightly shifted (by -22 to -88 MB) but this gain is
>> negligible. The 3.6 scaling remains robust while the 3.10 scaling
>> deteriorates.
>>
>> Do you have any other suggestion?
>>
> Mark, what is the option she can give to output all the GAMG data?
>
> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so
> it should be easy to see
> if the coarse grid sizes are increasing, and also what the effect of the
> threshold value is.
>
>   Thanks,
>
>  Matt
>
>> Thanks
>> Myriam
>>
>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>
>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Hi,
>>>
>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version
>>> to 3.10, this code has a bad memory scaling.
>>>
>>> To report this issue, I took the PETSc script ex42.c and slightly
>>> modified it so that the KSP and PC configurations are the same as in my
>>> code. In particular, I use a "personalised" multi-grid method. The
>>> modifications are indicated by the keyword "TopBridge" in the attached
>>> scripts.
>>>
>>> To plot the memory (weak) scaling, I ran four calculations for each
>>> script with increasing problem sizes and numbers of cores:
>>>
>>> 1. 100,000 elts on 4 cores
>>> 2. 1 million elts on 40 cores
>>> 3. 10 million elts on 400 cores
>>> 4. 100 million elts on 4,000 cores
>>>
>>> The resulting graph is also attached. The scaling using PETSc 3.10
>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is
>>> robust.
>>>
>>> After a few tests, I found that the scaling is mostly sensitive to the
>>> use of the AMG method for the coarse grid (line 1780 in
>>> main_ex42_petsc36.cc). In particular, the performance strongly
>>> deteriorates when commenting lines 1777 to 1790 (in
>>> main_ex42_petsc36.cc).
>>>
>>> Do you have any idea of what changed between version 3.6 and version
>>> 3.10 that may imply such degradation?
>>>
>>
>> I believe the default values for PCGAMG changed between versions. It
>> sounds like the coarsening rate
>> is not great enough, so that these grids are too large. This can be set
>> using:
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>
>> There is some explanation of this effect on that page. Let us know if
>> setting this does not correct the situation.
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Let me know if you need further information.
>>>
>>> Best,
>>>
>>> Myriam Peyrounette
>>>
>>>
>>> --
>>> Myriam Peyrounette
>>> CNRS/IDRIS - HLST
>>> --
>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>
>
> --
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-26 Thread Mark Adams via petsc-users
> Mark, what is the option she can give to output all the GAMG data?
>
>

-info and then grep on GAMG.

This will print the number of non-zeros per row, which is useful. The
memory size of the matrices will also give you data on this.
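
A minimal sketch of pulling the same per-level numbers out programmatically
(KSPGetPC, PCMGGetLevels, PCMGGetSmoother, KSPGetOperators and MatGetInfo are
standard PETSc calls, and they work for GAMG because it is built on PCMG;
ksp is assumed to be an already set-up solver, not code from this thread):

  PC       pc;
  PetscInt nlevels, l;

  KSPGetPC(ksp, &pc);
  PCMGGetLevels(pc, &nlevels);
  for (l = 0; l < nlevels; l++) {
    KSP      smoother;
    Mat      Al;
    MatInfo  info;
    PetscInt m, n;

    PCMGGetSmoother(pc, l, &smoother);
    KSPGetOperators(smoother, &Al, NULL);      /* operator on level l */
    MatGetSize(Al, &m, &n);
    MatGetInfo(Al, MAT_GLOBAL_SUM, &info);
    PetscPrintf(PETSC_COMM_WORLD, "level %D: %D x %D, nnz %g, avg nnz/row %g\n",
                l, m, n, info.nz_used, info.nz_used / (PetscReal)m);
  }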


> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so
> it should be easy to see
> if the coarse grid sizes are increasing, and also what the effect of the
> threshold value is.
>
>   Thanks,
>
>  Matt
>
>> Thanks
>> Myriam
>>
>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>
>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Hi,
>>>
>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version
>>> to 3.10, this code has a bad memory scaling.
>>>
>>> To report this issue, I took the PETSc script ex42.c and slightly
>>> modified it so that the KSP and PC configurations are the same as in my
>>> code. In particular, I use a "personalised" multi-grid method. The
>>> modifications are indicated by the keyword "TopBridge" in the attached
>>> scripts.
>>>
>>> To plot the memory (weak) scaling, I ran four calculations for each
>>> script with increasing problem sizes and numbers of cores:
>>>
>>> 1. 100,000 elts on 4 cores
>>> 2. 1 million elts on 40 cores
>>> 3. 10 million elts on 400 cores
>>> 4. 100 million elts on 4,000 cores
>>>
>>> The resulting graph is also attached. The scaling using PETSc 3.10
>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is
>>> robust.
>>>
>>> After a few tests, I found that the scaling is mostly sensitive to the
>>> use of the AMG method for the coarse grid (line 1780 in
>>> main_ex42_petsc36.cc). In particular, the performance strongly
>>> deteriorates when commenting lines 1777 to 1790 (in
>>> main_ex42_petsc36.cc).
>>>
>>> Do you have any idea of what changed between version 3.6 and version
>>> 3.10 that may imply such degradation?
>>>
>>
>> I believe the default values for PCGAMG changed between versions. It
>> sounds like the coarsening rate
>> is not great enough, so that these grids are too large. This can be set
>> using:
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>
>> There is some explanation of this effect on that page. Let us know if
>> setting this does not correct the situation.
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Let me know if you need further information.
>>>
>>> Best,
>>>
>>> Myriam Peyrounette
>>>
>>>
>>> --
>>> Myriam Peyrounette
>>> CNRS/IDRIS - HLST
>>> --
>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-25 Thread Zhang, Hong via petsc-users
Myriam:
Checking MatPtAP() in petsc-3.6.4, I realized that it uses a different algorithm
than petsc-3.10 and later versions. petsc-3.6 uses an outer product for
C = P^T * A * P, while petsc-3.10 uses a local transpose of P. petsc-3.10
accelerates data access, but doubles the memory of P.

Fande added two new implementations of MatPtAP() to petsc-master which use much
less memory and scale better, at slightly higher computing time (still faster
than hypre). You may use these new implementations if you are concerned about
memory scalability. The options for these new implementations are:
-matptap_via allatonce
-matptap_via allatonce_merged

Hong

On Mon, Apr 15, 2019 at 12:10 PM hzh...@mcs.anl.gov wrote:
Myriam:
Thank you very much for providing these results!
I have put effort into accelerating execution time and avoiding global sizes in
PtAP, for which the algorithm that transposes P_local and P_other likely
doubles the memory usage. I'll try to investigate why it becomes unscalable.
Hong

Hi,

you'll find the new scaling attached (green line). I used version 3.11 and
the four scalability options:
-matptap_via scalable
-inner_diag_matmatmult_via scalable
-inner_offdiag_matmatmult_via scalable
-mat_freeintermediatedatastructures

The scaling is much better! The code even uses less memory for the smallest 
cases. There is still an increase for the larger one.

With regard to the time scaling, I used KSPView and LogView on the two previous 
scalings (blue and yellow lines) but not on the last one (green line). So we 
can't really compare them, am I right? However, we can see that the new time 
scaling looks quite good. It slightly increases from ~8s to ~27s.

Unfortunately, the computations are expensive so I would like to avoid re-running
them if possible. How relevant would a proper time scaling be for you?

Myriam

On 04/12/19 at 18:18, Zhang, Hong wrote:
Myriam :
Thanks for your effort. It will help us improve PETSc.
Hong

Hi all,

I used the wrong script, that's why it diverged... Sorry about that.
I tried again with the right script applied on a tiny problem (~200
elements). I can see a small difference in memory usage (gain ~1 MB)
when adding the -mat_freeintermediatestructures option. I still have to
execute larger cases to plot the scaling. The supercomputer I am used to
run my jobs on is really busy at the moment so it takes a while. I hope
I'll send you the results on Monday.

Thanks everyone,

Myriam


On 04/11/19 at 06:01, Jed Brown wrote:
> "Zhang, Hong" <hzh...@mcs.anl.gov> writes:
>
>> Jed:
 Myriam,
 Thanks for the plot. '-mat_freeintermediatedatastructures' should not 
 affect solution. It releases almost half of memory in C=PtAP if C is not 
 reused.
>>> And yet if turning it on causes divergence, that would imply a bug.
>>> Hong, are you able to reproduce the experiment to see the memory
>>> scaling?
>> I'd like to test her code using an ALCF machine, but my hands are full now.
>> I'll try it as soon as I find time, hopefully next week.
> I have now compiled and run her code locally.
>
> Myriam, thanks for your last mail adding configuration and removing the
> MemManager.h dependency.  I ran with and without
> -mat_freeintermediatedatastructures and don't see a difference in
> convergence.  What commands did you run to observe that difference?

--
Myriam Peyrounette
CNRS/IDRIS - HLST
--




--
Myriam Peyrounette
CNRS/IDRIS - HLST
--



Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-15 Thread Zhang, Hong via petsc-users
Myriam:
Thank you very much for providing these results!
I have put effort into accelerating execution time and avoiding global sizes in
PtAP, for which the algorithm that transposes P_local and P_other likely
doubles the memory usage. I'll try to investigate why it becomes unscalable.
Hong

Hi,

you'll find the new scaling attached (green line). I used version 3.11 and
the four scalability options:
-matptap_via scalable
-inner_diag_matmatmult_via scalable
-inner_offdiag_matmatmult_via scalable
-mat_freeintermediatedatastructures

The scaling is much better! The code even uses less memory for the smallest 
cases. There is still an increase for the larger one.

With regard to the time scaling, I used KSPView and LogView on the two previous 
scalings (blue and yellow lines) but not on the last one (green line). So we 
can't really compare them, am I right? However, we can see that the new time 
scaling looks quite good. It slightly increases from ~8s to ~27s.

Unfortunately, the computations are expensive so I would like to avoid re-running
them if possible. How relevant would a proper time scaling be for you?

Myriam

On 04/12/19 at 18:18, Zhang, Hong wrote:
Myriam :
Thanks for your effort. It will help us improve PETSc.
Hong

Hi all,

I used the wrong script, that's why it diverged... Sorry about that.
I tried again with the right script applied on a tiny problem (~200
elements). I can see a small difference in memory usage (gain ~1 MB)
when adding the -mat_freeintermediatestructures option. I still have to
execute larger cases to plot the scaling. The supercomputer I am used to
run my jobs on is really busy at the moment so it takes a while. I hope
I'll send you the results on Monday.

Thanks everyone,

Myriam


On 04/11/19 at 06:01, Jed Brown wrote:
> "Zhang, Hong" <hzh...@mcs.anl.gov> writes:
>
>> Jed:
 Myriam,
 Thanks for the plot. '-mat_freeintermediatedatastructures' should not 
 affect solution. It releases almost half of memory in C=PtAP if C is not 
 reused.
>>> And yet if turning it on causes divergence, that would imply a bug.
>>> Hong, are you able to reproduce the experiment to see the memory
>>> scaling?
>> I'd like to test her code using an ALCF machine, but my hands are full now.
>> I'll try it as soon as I find time, hopefully next week.
> I have now compiled and run her code locally.
>
> Myriam, thanks for your last mail adding configuration and removing the
> MemManager.h dependency.  I ran with and without
> -mat_freeintermediatedatastructures and don't see a difference in
> convergence.  What commands did you run to observe that difference?

--
Myriam Peyrounette
CNRS/IDRIS - HLST
--




--
Myriam Peyrounette
CNRS/IDRIS - HLST
--



Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-15 Thread Myriam Peyrounette via petsc-users
Hi,

you'll find the new scaling attached (green line). I used version
3.11 and the four scalability options:
-matptap_via scalable
-inner_diag_matmatmult_via scalable
-inner_offdiag_matmatmult_via scalable
-mat_freeintermediatedatastructures

The scaling is much better! The code even uses less memory for the
smallest cases. There is still an increase for the larger one.

With regard to the time scaling, I used KSPView and LogView on the two
previous scalings (blue and yellow lines) but not on the last one (green
line). So we can't really compare them, am I right? However, we can see
that the new time scaling looks quite good. It slightly increases from
~8s to ~27s.

Unfortunately, the computations are expensive so I would like to avoid
re-running them if possible. How relevant would a proper time scaling be for
you?

Myriam


On 04/12/19 at 18:18, Zhang, Hong wrote:
> Myriam :
> Thanks for your effort. It will help us improve PETSc.
> Hong
>
> Hi all,
>
> I used the wrong script, that's why it diverged... Sorry about that. 
> I tried again with the right script applied on a tiny problem (~200
> elements). I can see a small difference in memory usage (gain ~1 MB)
> when adding the -mat_freeintermediatestructures option. I still
> have to
> execute larger cases to plot the scaling. The supercomputer I am
> used to
> run my jobs on is really busy at the moment so it takes a while. I
> hope
> I'll send you the results on Monday.
>
> Thanks everyone,
>
> Myriam
>
>
> On 04/11/19 at 06:01, Jed Brown wrote:
> > "Zhang, Hong" <hzh...@mcs.anl.gov> writes:
> >
> >> Jed:
>  Myriam,
>  Thanks for the plot. '-mat_freeintermediatedatastructures'
> should not affect solution. It releases almost half of memory in
> C=PtAP if C is not reused.
> >>> And yet if turning it on causes divergence, that would imply a
> bug.
> >>> Hong, are you able to reproduce the experiment to see the memory
> >>> scaling?
> >> I'd like to test her code using an ALCF machine, but my hands are
> full now. I'll try it as soon as I find time, hopefully next week.
> > I have now compiled and run her code locally.
> >
> > Myriam, thanks for your last mail adding configuration and
> removing the
> > MemManager.h dependency.  I ran with and without
> > -mat_freeintermediatedatastructures and don't see a difference in
> > convergence.  What commands did you run to observe that difference?
>
> -- 
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>

-- 
Myriam Peyrounette
CNRS/IDRIS - HLST
--





Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-12 Thread Zhang, Hong via petsc-users
Myriam :
Thanks for your effort. It will help us improve PETSc.
Hong

Hi all,

I used the wrong script, that's why it diverged... Sorry about that.
I tried again with the right script applied on a tiny problem (~200
elements). I can see a small difference in memory usage (gain ~1 MB)
when adding the -mat_freeintermediatestructures option. I still have to
execute larger cases to plot the scaling. The supercomputer I am used to
run my jobs on is really busy at the moment so it takes a while. I hope
I'll send you the results on Monday.

Thanks everyone,

Myriam


On 04/11/19 at 06:01, Jed Brown wrote:
> "Zhang, Hong" <hzh...@mcs.anl.gov> writes:
>
>> Jed:
 Myriam,
 Thanks for the plot. '-mat_freeintermediatedatastructures' should not 
 affect solution. It releases almost half of memory in C=PtAP if C is not 
 reused.
>>> And yet if turning it on causes divergence, that would imply a bug.
>>> Hong, are you able to reproduce the experiment to see the memory
>>> scaling?
>> I'd like to test her code using an ALCF machine, but my hands are full now.
>> I'll try it as soon as I find time, hopefully next week.
> I have now compiled and run her code locally.
>
> Myriam, thanks for your last mail adding configuration and removing the
> MemManager.h dependency.  I ran with and without
> -mat_freeintermediatedatastructures and don't see a difference in
> convergence.  What commands did you run to observe that difference?

--
Myriam Peyrounette
CNRS/IDRIS - HLST
--




Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-12 Thread Myriam Peyrounette via petsc-users
Hi all,

I used the wrong script, that's why it diverged... Sorry about that. 
I tried again with the right script applied on a tiny problem (~200
elements). I can see a small difference in memory usage (gain ~1 MB)
when adding the -mat_freeintermediatestructures option. I still have to
execute larger cases to plot the scaling. The supercomputer I am used to
run my jobs on is really busy at the moment so it takes a while. I hope
I'll send you the results on Monday.

Thanks everyone,

Myriam


On 04/11/19 at 06:01, Jed Brown wrote:
> "Zhang, Hong"  writes:
>
>> Jed:
 Myriam,
 Thanks for the plot. '-mat_freeintermediatedatastructures' should not 
 affect solution. It releases almost half of memory in C=PtAP if C is not 
 reused.
>>> And yet if turning it on causes divergence, that would imply a bug.
>>> Hong, are you able to reproduce the experiment to see the memory
>>> scaling?
>> I'd like to test her code using an ALCF machine, but my hands are full now.
>> I'll try it as soon as I find time, hopefully next week.
> I have now compiled and run her code locally.
>
> Myriam, thanks for your last mail adding configuration and removing the
> MemManager.h dependency.  I ran with and without
> -mat_freeintermediatedatastructures and don't see a difference in
> convergence.  What commands did you run to observe that difference?

-- 
Myriam Peyrounette
CNRS/IDRIS - HLST
--






Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-10 Thread Jed Brown via petsc-users
"Zhang, Hong"  writes:

> Jed:
>>> Myriam,
>>> Thanks for the plot. '-mat_freeintermediatedatastructures' should not 
>>> affect solution. It releases almost half of memory in C=PtAP if C is not 
>>> reused.
>
>> And yet if turning it on causes divergence, that would imply a bug.
>> Hong, are you able to reproduce the experiment to see the memory
>> scaling?
>
> I'd like to test her code using an ALCF machine, but my hands are full now.
> I'll try it as soon as I find time, hopefully next week.

I have now compiled and run her code locally.

Myriam, thanks for your last mail adding configuration and removing the
MemManager.h dependency.  I ran with and without
-mat_freeintermediatedatastructures and don't see a difference in
convergence.  What commands did you run to observe that difference?


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-10 Thread Zhang, Hong via petsc-users
Jed:
> Myriam,
> Thanks for the plot. '-mat_freeintermediatedatastructures' should not affect 
> solution. It releases almost half of memory in C=PtAP if C is not reused.

And yet if turning it on causes divergence, that would imply a bug.
Hong, are you able to reproduce the experiment to see the memory
scaling?
I'd like to test her code using an ALCF machine, but my hands are full now. I'll
try it as soon as I find time, hopefully next week.
Hong


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-10 Thread Jed Brown via petsc-users
"Zhang, Hong via petsc-users"  writes:

> Myriam,
> Thanks for the plot. '-mat_freeintermediatedatastructures' should not affect 
> solution. It releases almost half of memory in C=PtAP if C is not reused.

And yet if turning it on causes divergence, that would imply a bug.
Hong, are you able to reproduce the experiment to see the memory
scaling?


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-10 Thread Zhang, Hong via petsc-users
Myriam,
Thanks for the plot. '-mat_freeintermediatedatastructures' should not affect 
solution. It releases almost half of memory in C=PtAP if C is not reused.
Hong
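
A quick way to see this effect in a given run is to watch the process memory
around the setup phase (a minimal sketch; PetscMemoryGetCurrentUsage and
PetscPrintf are standard PETSc calls, the placement around KSPSetUp is only an
illustration, and -memory_view at exit gives similar information):

  PetscLogDouble before, after;

  PetscMemoryGetCurrentUsage(&before);
  KSPSetUp(ksp);                       /* the PtAP products are formed in here */
  PetscMemoryGetCurrentUsage(&after);
  PetscPrintf(PETSC_COMM_WORLD, "resident memory grew by %g MB during setup\n",
              (after - before) / 1048576.0);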

On Wed, Apr 10, 2019 at 7:21 AM Mark Adams <mfad...@lbl.gov> wrote:
This looks like it might be noisy data. I'd make sure you run each size on the 
same set of nodes and you might run each job twice (A,B,A,B) in a job script.

On Wed, Apr 10, 2019 at 8:12 AM Myriam Peyrounette via petsc-users <petsc-users@mcs.anl.gov> wrote:

Here is the time weak scaling from the same study. The 3.10.2 version seems to 
be much more stable with regard to the execution time. But not necessarily 
faster for "large scale" simulations (problem size = 1e8).

I didn't use -mat_freeintermediatedatastructures. I tested it this morning and 
the solver diverges when using this option (KSPReason -3).

Myriam

On 04/09/19 at 17:23, Zhang, Hong wrote:
Myriam,
Do you have an 'execution time scalability' plot? Did you use
'-mat_freeintermediatedatastructures' for PETSc 3.10.2?
We made several computing optimizations on MatPtAP(), which might trade memory 
for speed. It would be helpful to see a complete comparison.
Hong

On Tue, Apr 9, 2019 at 7:43 AM Myriam Peyrounette via petsc-users <petsc-users@mcs.anl.gov> wrote:
Hi,

in my first mail, I provided a memory scaling concerning the PETSc
example #42. You'll find attached the main files used (one for PETSc
3.6.4, one for PETSc 3.10.2), and the corresponding memory scaling.

In the main files, I modified the solver/preconditioner, so that it
corresponds to my problem. You'll find the modifications by searching
the keyword "TopBridge". In particular, I use GAMG.

Note that the example is about solving the Stokes equation, so using GAMG
may not be well suited. However, the memory gap appears, and that's the
point, regardless of whether the results are correct.

Are these scripts useful for you? Let me know.

Thanks,

Myriam


On 04/04/19 at 00:09, Jed Brown wrote:
> Myriam Peyrounette via petsc-users <petsc-users@mcs.anl.gov> writes:
>
>> Hi all,
>>
>> for your information, you'll find attached the comparison of the weak
>> memory scalings when using:
>>
>> - PETSc 3.6.4 (reference)
>> - PETSc 3.10.4 without specific options
>> - PETSc 3.10.4 with the three scalability options you mentioned
>>
>> Using the scalability options does improve the memory scaling. However,
>> the 3.6 version still has a better one...
> Yes, this still looks significant.  Is this an effect we can still
> reproduce with a PETSc example and/or using a memory profiler (such as
> massif or gperftools)?  I think it's important for us to narrow down
> what causes this difference (looks like almost 2x on your 1e8 problem
> size) so we can fix.

--
Myriam Peyrounette
CNRS/IDRIS - HLST
--



--
Myriam Peyrounette
CNRS/IDRIS - HLST
--



Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-10 Thread Mark Adams via petsc-users
This looks like it might be noisy data. I'd make sure you run each size on
the same set of nodes and you might run each job twice (A,B,A,B) in a job
script.

On Wed, Apr 10, 2019 at 8:12 AM Myriam Peyrounette via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> Here is the time weak scaling from the same study. The 3.10.2 version
> seems to be much more stable with regard to the execution time. But not
> necessarily faster for "large scale" simulations (problem size = 1e8).
>
> I didn't use -mat_freeintermediatedatastructures. I tested it this morning
> and the solver diverges when using this option (KSPReason -3).
>
> Myriam
>
> On 04/09/19 at 17:23, Zhang, Hong wrote:
>
> Myriam,
> Do you have an 'execution time scalability' plot? Did you use
> '-mat_freeintermediatedatastructures' for PETSc 3.10.2?
> We made several computing optimizations on MatPtAP(), which might trade
> memory for speed. It would be helpful to see a complete comparison.
> Hong
>
> On Tue, Apr 9, 2019 at 7:43 AM Myriam Peyrounette via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> Hi,
>>
>> in my first mail, I provided a memory scaling concerning the PETSc
>> example #42. You'll find attached the main files used (one for PETSc
>> 3.6.4, one for PETSc 3.10.2), and the corresponding memory scaling.
>>
>> In the main files, I modified the solver/preconditioner, so that it
>> corresponds to my problem. You'll find the modifications by searching
>> the keyword "TopBridge". In particular, I use GAMG.
>>
>> Note that the example is about solving the Stokes equation, so using GAMG
>> may not be well suited. However, the memory gap appears, and that's the
>> point, regardless of whether the results are correct.
>>
>> Are these scripts useful for you? Let me know.
>>
>> Thanks,
>>
>> Myriam
>>
>>
>> On 04/04/19 at 00:09, Jed Brown wrote:
>> > Myriam Peyrounette via petsc-users  writes:
>> >
>> >> Hi all,
>> >>
>> >> for your information, you'll find attached the comparison of the weak
>> >> memory scalings when using:
>> >>
>> >> - PETSc 3.6.4 (reference)
>> >> - PETSc 3.10.4 without specific options
>> >> - PETSc 3.10.4 with the three scalability options you mentioned
>> >>
>> >> Using the scalability options does improve the memory scaling. However,
>> >> the 3.6 version still has a better one...
>> > Yes, this still looks significant.  Is this an effect we can still
>> > reproduce with a PETSc example and/or using a memory profiler (such as
>> > massif or gperftools)?  I think it's important for us to narrow down
>> > what causes this difference (looks like almost 2x on your 1e8 problem
>> > size) so we can fix.
>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
> --
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-10 Thread Myriam Peyrounette via petsc-users
Here is the time weak scaling from the same study. The 3.10.2 version
seems to be much more stable with regard to the execution time. But not
necessarily faster for "large scale" simulations (problem size = 1e8).

I didn't use -mat_freeintermediatedatastructures. I tested it this
morning and the solver diverges when using this option (KSPReason -3).

Myriam


Le 04/09/19 à 17:23, Zhang, Hong a écrit :
> Myriam,
> Do you have 'execution time scalability' plot? Did you use
> '-mat_freeintermediatedatastructures' for PETSc 3.10.2?
> We made several computing optimizations on MatPtAP(), which might
> trade memory for speed. It would be helpful to see a complete comparison.
> Hong
>
> On Tue, Apr 9, 2019 at 7:43 AM Myriam Peyrounette via petsc-users
> mailto:petsc-users@mcs.anl.gov>> wrote:
>
> Hi,
>
> in my first mail, I provided a memory scaling concerning the PETSc
> example #42. You'll find attached the main files used (one for PETSc
> 3.6.4, one for PETSc 3.10.2), and the corresponding memory scaling.
>
> In the main files, I modified the solver/preconditioner, so that it
> corresponds to my problem. You'll find the modifications by searching
> the keyword "TopBridge". In particular, I use GAMG.
>
> Note that the example is about solving the Stokes equation, so using GAMG
> may not be well suited. However, the memory gap still appears, and that is the
> point, regardless of whether the results are correct.
>
> Are these scripts useful for you? Let me know.
>
> Thanks,
>
> Myriam
>
>
> Le 04/04/19 à 00:09, Jed Brown a écrit :
> > Myriam Peyrounette via petsc-users  > writes:
> >
> >> Hi all,
> >>
> >> for your information, you'll find attached the comparison of
> the weak
> >> memory scalings when using :
> >>
> >> - PETSc 3.6.4 (reference)
> >> - PETSc 3.10.4 without specific options
> >> - PETSc 3.10.4 with the three scalability options you mentioned
> >>
> >> Using the scalability options does improve the memory scaling.
> However,
> >> the 3.6 version still has a better one...
> > Yes, this still looks significant.  Is this an effect we can still
> > reproduce with a PETSc example and/or using a memory profiler
> (such as
> > massif or gperftools)?  I think it's important for us to narrow down
> > what causes this difference (looks like almost 2x on your 1e8
> problem
> > size) so we can fix.
>
> -- 
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>

-- 
Myriam Peyrounette
CNRS/IDRIS - HLST
--





Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-09 Thread Zhang, Hong via petsc-users
Myriam,
Do you have 'execution time scalability' plot? Did you use 
'-mat_freeintermediatedatastructures' for PETSc 3.10.2?
We made several computing optimizations on MatPtAP(), which might trade memory 
for speed. It would be helpful to see a complete comparison.
Hong

On Tue, Apr 9, 2019 at 7:43 AM Myriam Peyrounette via petsc-users 
mailto:petsc-users@mcs.anl.gov>> wrote:
Hi,

in my first mail, I provided a memory scaling concerning the PETSc
example #42. You'll find attached the main files used (one for PETSc
3.6.4, one for PETSc 3.10.2), and the corresponding memory scaling.

In the main files, I modified the solver/preconditioner, so that it
corresponds to my problem. You'll find the modifications by searching
the keyword "TopBridge". In particular, I use GAMG.

Note that the example is about solving the Stokes equation, so using GAMG
may not be well suited. However, the memory gap still appears, and that is the
point, regardless of whether the results are correct.

Are these scripts useful for you? Let me know.

Thanks,

Myriam


Le 04/04/19 à 00:09, Jed Brown a écrit :
> Myriam Peyrounette via petsc-users 
> mailto:petsc-users@mcs.anl.gov>> writes:
>
>> Hi all,
>>
>> for your information, you'll find attached the comparison of the weak
>> memory scalings when using :
>>
>> - PETSc 3.6.4 (reference)
>> - PETSc 3.10.4 without specific options
>> - PETSc 3.10.4 with the three scalability options you mentioned
>>
>> Using the scalability options does improve the memory scaling. However,
>> the 3.6 version still has a better one...
> Yes, this still looks significant.  Is this an effect we can still
> reproduce with a PETSc example and/or using a memory profiler (such as
> massif or gperftools)?  I think it's important for us to narrow down
> what causes this difference (looks like almost 2x on your 1e8 problem
> size) so we can fix.

--
Myriam Peyrounette
CNRS/IDRIS - HLST
--



Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-04-03 Thread Jed Brown via petsc-users
Myriam Peyrounette via petsc-users  writes:

> Hi all,
>
> for your information, you'll find attached the comparison of the weak
> memory scalings when using :
>
> - PETSc 3.6.4 (reference)
> - PETSc 3.10.4 without specific options
> - PETSc 3.10.4 with the three scalability options you mentioned
>
> Using the scalability options does improve the memory scaling. However,
> the 3.6 version still has a better one...

Yes, this still looks significant.  Is this an effect we can still
reproduce with a PETSc example and/or using a memory profiler (such as
massif or gperftools)?  I think it's important for us to narrow down
what causes this difference (looks like almost 2x on your 1e8 problem
size) so we can fix.
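
For reference, a minimal way to collect such a profile with Massif, assuming the reproducer is the modified ex42 (the executable name, rank count, and options below are placeholders): run a small case under "mpiexec -n 4 valgrind --tool=massif ./ex42 <problem-size and solver options>", then inspect the peak snapshot with "ms_print massif.out.<pid>". The gperftools heap profiler (linking against tcmalloc and setting the HEAPPROFILE environment variable) is a lower-overhead alternative.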


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-26 Thread Myriam Peyrounette via petsc-users
*SetFromOptions() was not called indeed... Thanks! The code performance
is better now with regard to memory usage!

I still have to plot the memory scaling on bigger cases to see if it has
the same good behaviour as when using the 3.6 version.

I'll let you know as soon as I have plotted it.

Thanks again

Myriam
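
For reference, a minimal sketch (not taken from the actual application) of where the missing call typically sits; in this thread, adding the *SetFromOptions() call is what made the command-line options take effect:

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  KSP      ksp;
  PetscInt n = 100;                     /* placeholder problem size */

  PetscInitialize(&argc, &argv, NULL, NULL);

  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
  MatSetFromOptions(A);                 /* the Mat reads its command-line options */
  /* ... preallocate and assemble A, as in ex42 (omitted) ... */

  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPSetFromOptions(ksp);               /* applies the -ksp_* and -pc_* options */
  /* ... KSPSolve and cleanup (omitted) ... */

  KSPDestroy(&ksp);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}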


Le 03/26/19 à 14:30, Matthew Knepley a écrit :
> On Tue, Mar 26, 2019 at 9:27 AM Myriam Peyrounette
> mailto:myriam.peyroune...@idris.fr>> wrote:
>
> I checked with -ksp_view (attached) but no prefix is associated
> with the matrix. Some are associated to the KSP and PC, but none
> to the Mat
>
> Another thing that could prevent options being used is that
> *SetFromOptions() is not called for the object.
>
>   Thanks,
>
>      Matt
>  
>
> Le 03/26/19 à 11:55, Dave May a écrit :
>>
>>
>> On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette
>> > > wrote:
>>
>> Oh you were right, the three options are unused
>> (-matptap_via scalable, -inner_offdiag_matmatmult_via
>> scalable and -inner_diag_matmatmult_via scalable). Does this
>> mean I am not using the associated PtAP functions?
>>
>>
>> No - not necessarily. All it means is the options were not parsed. 
>>
>> If your matrices have an option prefix associated with them (e.g.
>> abc) , then you need to provide the option as
>>   -abc_matptap_via scalable
>>
>> If you are not sure whether your matrices have a prefix, look at the
>> result of -ksp_view (see below for an example)
>>
>>   Mat Object: 2 MPI processes
>>
>>     type: mpiaij
>>
>>     rows=363, cols=363, bs=3
>>
>>     total: nonzeros=8649, allocated nonzeros=8649
>>
>>     total number of mallocs used during MatSetValues calls =0
>>
>>   Mat Object: (B_) 2 MPI processes
>>
>>     type: mpiaij
>>
>>     rows=363, cols=363, bs=3
>>
>>     total: nonzeros=8649, allocated nonzeros=8649
>>
>>     total number of mallocs used during MatSetValues calls =0
>>
>>
>> The first matrix has no options prefix, but the second does and
>> it's called "B_".
>>
>>
>>
>>  
>>
>> Myriam
>>
>>
>> Le 03/26/19 à 11:10, Dave May a écrit :
>>>
>>> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via
>>> petsc-users >> > wrote:
>>>
>>> How can I be sure they are indeed used? Can I print this
>>> information in some log file?
>>>
>>> Yes. Re-run the job with the command line option
>>>
>>> -options_left true
>>>
>>> This will report all options parsed, and importantly, will
>>> also indicate if any options were unused.
>>>  
>>>
>>> Thanks
>>> Dave
>>>
>>> Thanks in advance
>>>
>>> Myriam
>>>
>>>
>>> Le 03/25/19 à 18:24, Matthew Knepley a écrit :
 On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via
 petsc-users >>> > wrote:

 Hi,

 thanks for the explanations. I tried the last PETSc
 version (commit
 fbc5705bc518d02a4999f188aad4ccff5f754cbf), which
 includes the patch you talked about. But the memory
 scaling shows no improvement (see scaling
 attached), even when using the "scalable" options :(

 I had a look at the PETSc functions
 MatPtAPNumeric_MPIAIJ_MPIAIJ and
 MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the
 differences before and after the first "bad"
 commit), but I can't find what induced this memory
 issue.

 Are you sure that the option was used? It just looks
 suspicious to me that they use exactly the same amount
 of memory. It should be different, even if it does not
 solve the problem.

    Thanks,

      Matt 

 Myriam




 Le 03/20/19 à 17:38, Fande Kong a écrit :
> Hi Myriam,
>
> There are three algorithms in PETSc to do PtAP
> ( const char          *algTypes[3] =
> {"scalable","nonscalable","hypre"};), and can be
> specified using the petsc options: -matptap_via .
>
> (1) -matptap_via hypre: This calls the hypre
> package to do the PtAP through an all-at-once
> triple product. In our experiences, it is the most
> memory efficient, but could be slow.
>
> (2)  -matptap_via scalable: This involves a
> row-wise 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-26 Thread Matthew Knepley via petsc-users
On Tue, Mar 26, 2019 at 9:27 AM Myriam Peyrounette <
myriam.peyroune...@idris.fr> wrote:

> I checked with -ksp_view (attached) but no prefix is associated with the
> matrix. Some are associated to the KSP and PC, but none to the Mat
>
Another thing that could prevent options being used is that
*SetFromOptions() is not called for the object.

  Thanks,

 Matt


> Le 03/26/19 à 11:55, Dave May a écrit :
>
>
>
> On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette <
> myriam.peyroune...@idris.fr> wrote:
>
>> Oh you were right, the three options are unused (-matptap_via scalable,
>> -inner_offdiag_matmatmult_via scalable and -inner_diag_matmatmult_via
>> scalable). Does this mean I am not using the associated PtAP functions?
>>
>
> No - not necessarily. All it means is the options were not parsed.
>
> If your matrices have an option prefix associated with them (e.g. abc) ,
> then you need to provide the option as
>   -abc_matptap_via scalable
>
> If you are not sure whether your matrices have a prefix, look at the result of
> -ksp_view (see below for an example)
>
>   Mat Object: 2 MPI processes
>
> type: mpiaij
>
> rows=363, cols=363, bs=3
>
> total: nonzeros=8649, allocated nonzeros=8649
>
> total number of mallocs used during MatSetValues calls =0
>
>   Mat Object: (B_) 2 MPI processes
>
> type: mpiaij
>
> rows=363, cols=363, bs=3
>
> total: nonzeros=8649, allocated nonzeros=8649
>
> total number of mallocs used during MatSetValues calls =0
>
> The first matrix has no options prefix, but the second does and it's
> called "B_".
>
>
>
>
>
>> Myriam
>>
>> Le 03/26/19 à 11:10, Dave May a écrit :
>>
>>
>> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> How can I be sure they are indeed used? Can I print this information in
>>> some log file?
>>>
>> Yes. Re-run the job with the command line option
>>
>> -options_left true
>>
>> This will report all options parsed, and importantly, will also indicate
>> if any options were unused.
>>
>>
>> Thanks
>> Dave
>>
>> Thanks in advance
>>>
>>> Myriam
>>>
>>> Le 03/25/19 à 18:24, Matthew Knepley a écrit :
>>>
>>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users <
>>> petsc-users@mcs.anl.gov> wrote:
>>>
 Hi,

 thanks for the explanations. I tried the last PETSc version (commit
 fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you
 talked about. But the memory scaling shows no improvement (see scaling
 attached), even when using the "scalable" options :(

 I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and
 MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and
 after the first "bad" commit), but I can't find what induced this memory
 issue.

>>> Are you sure that the option was used? It just looks suspicious to me
>>> that they use exactly the same amount of memory. It should be different,
>>> even if it does not solve the problem.
>>>
>>>Thanks,
>>>
>>>  Matt
>>>
 Myriam




 Le 03/20/19 à 17:38, Fande Kong a écrit :

 Hi Myriam,

 There are three algorithms in PETSc to do PtAP ( const char
 *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be specified
 using the petsc options: -matptap_via .

 (1) -matptap_via hypre: This calls the hypre package to do the PtAP
 through an all-at-once triple product. In our experiences, it is the most
 memory efficient, but could be slow.

 (2)  -matptap_via scalable: This involves a row-wise algorithm plus an
 outer product.  This will use more memory than hypre, but way faster. This
 used to have a bug that could take all your memory, and I have a fix at
 https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
 When using this option, we may want to have extra options such as
  -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via
 scalable  to select inner scalable algorithms.

 (3)  -matptap_via nonscalable:  Supposed to be even faster, but uses more
 memory. It does dense matrix operations.


 Thanks,

 Fande Kong




 On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users <
 petsc-users@mcs.anl.gov> wrote:

> More precisely: something happens when upgrading the functions
> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>
> Unfortunately, there are a lot of differences between the old and new
> versions of these functions. I keep investigating but if you have any 
> idea,
> please let me know.
>
> Best,
>
> Myriam
>
> Le 03/20/19 à 13:48, Myriam Peyrounette a écrit :
>
> Hi all,
>
> I used git bisect to determine when the memory need increased. I found
> that the first "bad" commit is   

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-26 Thread Myriam Peyrounette via petsc-users
I checked with -ksp_view (attached) but no prefix is associated with the
matrix. Some are associated to the KSP and PC, but none to the Mat.
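
For reference, a minimal sketch of how such a prefix gets attached to a matrix in the application code (the "B_" prefix is only an example, mirroring the -ksp_view output quoted below); once a prefix is set, the option must be spelled with it:

  Mat B;
  MatCreate(PETSC_COMM_WORLD, &B);
  MatSetOptionsPrefix(B, "B_");   /* options for B are now read as -B_..., e.g. -B_matptap_via scalable */
  MatSetFromOptions(B);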


Le 03/26/19 à 11:55, Dave May a écrit :
>
>
> On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette
> mailto:myriam.peyroune...@idris.fr>> wrote:
>
> Oh you were right, the three options are unused (-matptap_via
> scalable, -inner_offdiag_matmatmult_via scalable and
> -inner_diag_matmatmult_via scalable). Does this mean I am not
> using the associated PtAP functions?
>
>
> No - not necessarily. All it means is the options were not parsed. 
>
> If your matrices have an option prefix associated with them (e.g. abc)
> , then you need to provide the option as
>   -abc_matptap_via scalable
>
> If you are not sure whether your matrices have a prefix, look at the result
> of -ksp_view (see below for an example)
>
>   Mat Object: 2 MPI processes
>
>     type: mpiaij
>
>     rows=363, cols=363, bs=3
>
>     total: nonzeros=8649, allocated nonzeros=8649
>
>     total number of mallocs used during MatSetValues calls =0
>
>   Mat Object: (B_) 2 MPI processes
>
>     type: mpiaij
>
>     rows=363, cols=363, bs=3
>
>     total: nonzeros=8649, allocated nonzeros=8649
>
>     total number of mallocs used during MatSetValues calls =0
>
>
> The first matrix has no options prefix, but the second does and it's
> called "B_".
>
>
>
>  
>
> Myriam
>
>
> Le 03/26/19 à 11:10, Dave May a écrit :
>>
>> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users
>> mailto:petsc-users@mcs.anl.gov>> wrote:
>>
>> How can I be sure they are indeed used? Can I print this
>> information in some log file?
>>
>> Yes. Re-run the job with the command line option
>>
>> -options_left true
>>
>> This will report all options parsed, and importantly, will also
>> indicate if any options were unused.
>>  
>>
>> Thanks
>> Dave
>>
>> Thanks in advance
>>
>> Myriam
>>
>>
>> Le 03/25/19 à 18:24, Matthew Knepley a écrit :
>>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via
>>> petsc-users >> > wrote:
>>>
>>> Hi,
>>>
>>> thanks for the explanations. I tried the last PETSc
>>> version (commit
>>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which
>>> includes the patch you talked about. But the memory
>>> scaling shows no improvement (see scaling attached),
>>> even when using the "scalable" options :(
>>>
>>> I had a look at the PETSc functions
>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and
>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the
>>> differences before and after the first "bad" commit),
>>> but I can't find what induced this memory issue.
>>>
>>> Are you sure that the option was used? It just looks
>>> suspicious to me that they use exactly the same amount of
>>> memory. It should be different, even if it does not solve
>>> the problem.
>>>
>>>    Thanks,
>>>
>>>      Matt 
>>>
>>> Myriam
>>>
>>>
>>>
>>>
>>> Le 03/20/19 à 17:38, Fande Kong a écrit :
 Hi Myriam,

 There are three algorithms in PETSc to do PtAP ( const
 char          *algTypes[3] =
 {"scalable","nonscalable","hypre"};), and can be
 specified using the petsc options: -matptap_via .

 (1) -matptap_via hypre: This calls the hypre package to
 do the PtAP through an all-at-once triple product. In
 our experiences, it is the most memory efficient, but
 could be slow.

 (2)  -matptap_via scalable: This involves a row-wise
 algorithm plus an outer product.  This will use more
 memory than hypre, but way faster. This used to have a
 bug that could take all your memory, and I have a fix
 at 
 https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
  
 When using this option, we may want to have extra
 options such as   -inner_offdiag_matmatmult_via
 scalable -inner_diag_matmatmult_via scalable  to select
 inner scalable algorithms.

 (3)  -matptap_via nonscalable:  Supposed to be even
 faster, but uses more memory. It does dense matrix
 operations.


 Thanks,

 Fande Kong




 On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via
 petsc-users >>> > wrote:

 More precisely: something happens when upgrading
 the 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-26 Thread Myriam Peyrounette via petsc-users
Oh you were right, the three options are unused (-matptap_via scalable,
-inner_offdiag_matmatmult_via scalable and -inner_diag_matmatmult_via
scalable). Does this mean I am not using the associated PtAP functions?

Myriam
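
For reference, a sketch of the -options_left check suggested in the reply quoted below (the executable name and option values are placeholders): append it to the normal run, e.g. "mpiexec -n 4 ./ex42 -matptap_via scalable -options_left"; at the end of the run PETSc then lists every database option that was set but never queried, which is how unused options such as -matptap_via or the -inner_*_matmatmult_via ones show up.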


Le 03/26/19 à 11:10, Dave May a écrit :
>
> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users
> mailto:petsc-users@mcs.anl.gov>> wrote:
>
> How can I be sure they are indeed used? Can I print this
> information in some log file?
>
> Yes. Re-run the job with the command line option
>
> -options_left true
>
> This will report all options parsed, and importantly, will also
> indicate if any options were unused.
>  
>
> Thanks
> Dave
>
> Thanks in advance
>
> Myriam
>
>
> Le 03/25/19 à 18:24, Matthew Knepley a écrit :
>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via
>> petsc-users > > wrote:
>>
>> Hi,
>>
>> thanks for the explanations. I tried the last PETSc version
>> (commit fbc5705bc518d02a4999f188aad4ccff5f754cbf), which
>> includes the patch you talked about. But the memory scaling
>> shows no improvement (see scaling attached), even when using
>> the "scalable" options :(
>>
>> I had a look at the PETSc functions
>> MatPtAPNumeric_MPIAIJ_MPIAIJ and
>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences
>> before and after the first "bad" commit), but I can't find
>> what induced this memory issue.
>>
>> Are you sure that the option was used? It just looks suspicious
>> to me that they use exactly the same amount of memory. It should
>> be different, even if it does not solve the problem.
>>
>>    Thanks,
>>
>>      Matt 
>>
>> Myriam
>>
>>
>>
>>
>> Le 03/20/19 à 17:38, Fande Kong a écrit :
>>> Hi Myriam,
>>>
>>> There are three algorithms in PETSc to do PtAP ( const char 
>>>         *algTypes[3] = {"scalable","nonscalable","hypre"};),
>>> and can be specified using the petsc options: -matptap_via .
>>>
>>> (1) -matptap_via hypre: This calls the hypre package to do
>>> the PtAP through an all-at-once triple product. In our
>>> experiences, it is the most memory efficient, but could be slow.
>>>
>>> (2)  -matptap_via scalable: This involves a row-wise
>>> algorithm plus an outer product.  This will use more memory
>>> than hypre, but way faster. This used to have a bug that
>>> could take all your memory, and I have a fix
>>> at 
>>> https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
>>>  
>>> When using this option, we may want to have extra options
>>> such as   -inner_offdiag_matmatmult_via scalable
>>> -inner_diag_matmatmult_via scalable  to select inner
>>> scalable algorithms.
>>>
>>> (3)  -matptap_via nonscalable:  Supposed to be even faster,
>>> but uses more memory. It does dense matrix operations.
>>>
>>>
>>> Thanks,
>>>
>>> Fande Kong
>>>
>>>
>>>
>>>
>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via
>>> petsc-users >> > wrote:
>>>
>>> More precisely: something happens when upgrading the
>>> functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or
>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>>>
>>> Unfortunately, there are a lot of differences between
>>> the old and new versions of these functions. I keep
>>> investigating but if you have any idea, please let me know.
>>>
>>> Best,
>>>
>>> Myriam
>>>
>>>
>>> Le 03/20/19 à 13:48, Myriam Peyrounette a écrit :

 Hi all,

 I used git bisect to determine when the memory need
 increased. I found that the first "bad" commit is  
 aa690a28a7284adb519c28cb44eae20a2c131c85.

 Barry was right, this commit seems to be about an
 evolution of MatPtAPSymbolic_MPIAIJ_MPIAIJ. You
 mentioned the option "-matptap_via scalable" but I
 can't find any information about it. Can you tell me more?

 Thanks

 Myriam


 Le 03/11/19 à 14:40, Mark Adams a écrit :
> Is there a difference in memory usage on your tiny
> problem? I assume no.
>
> I don't see anything that could come from GAMG other
> than the RAP stuff that you have discussed already.
>
> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette
>  > wrote:
>
> The code I am using here is the example 42 of
> PETSc
> 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-26 Thread Myriam Peyrounette via petsc-users
How can I be sure they are indeed used? Can I print this information in
some log file?

Thanks in advance

Myriam


Le 03/25/19 à 18:24, Matthew Knepley a écrit :
> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users
> mailto:petsc-users@mcs.anl.gov>> wrote:
>
> Hi,
>
> thanks for the explanations. I tried the last PETSc version
> (commit fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes
> the patch you talked about. But the memory scaling shows no
> improvement (see scaling attached), even when using the "scalable"
> options :(
>
> I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ
> and MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences
> before and after the first "bad" commit), but I can't find what
> induced this memory issue.
>
> Are you sure that the option was used? It just looks suspicious to me
> that they use exactly the same amount of memory. It should be
> different, even if it does not solve the problem.
>
>    Thanks,
>
>      Matt 
>
> Myriam
>
>
>
>
> Le 03/20/19 à 17:38, Fande Kong a écrit :
>> Hi Myriam,
>>
>> There are three algorithms in PETSc to do PtAP ( const char     
>>     *algTypes[3] = {"scalable","nonscalable","hypre"};), and can
>> be specified using the petsc options: -matptap_via .
>>
>> (1) -matptap_via hypre: This calls the hypre package to do the
>> PtAP through an all-at-once triple product. In our experiences, it
>> is the most memory efficient, but could be slow.
>>
>> (2)  -matptap_via scalable: This involves a row-wise algorithm
>> plus an outer product.  This will use more memory than hypre, but
>> way faster. This used to have a bug that could take all your
>> memory, and I have a fix
>> at 
>> https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
>>  
>> When using this option, we may want to have extra options such
>> as   -inner_offdiag_matmatmult_via scalable
>> -inner_diag_matmatmult_via scalable  to select inner scalable
>> algorithms.
>>
>> (3)  -matptap_via nonscalable:  Supposed to be even faster, but
>> uses more memory. It does dense matrix operations.
>>
>>
>> Thanks,
>>
>> Fande Kong
>>
>>
>>
>>
>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via
>> petsc-users > > wrote:
>>
>> More precisely: something happens when upgrading the
>> functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or
>> MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>>
>> Unfortunately, there are a lot of differences between the old
>> and new versions of these functions. I keep investigating but
>> if you have any idea, please let me know.
>>
>> Best,
>>
>> Myriam
>>
>>
>> Le 03/20/19 à 13:48, Myriam Peyrounette a écrit :
>>>
>>> Hi all,
>>>
>>> I used git bisect to determine when the memory need
>>> increased. I found that the first "bad" commit is  
>>> aa690a28a7284adb519c28cb44eae20a2c131c85.
>>>
>>> Barry was right, this commit seems to be about an evolution
>>> of MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option
>>> "-matptap_via scalable" but I can't find any information
>>> about it. Can you tell me more?
>>>
>>> Thanks
>>>
>>> Myriam
>>>
>>>
>>> Le 03/11/19 à 14:40, Mark Adams a écrit :
 Is there a difference in memory usage on your tiny problem?
 I assume no.

 I don't see anything that could come from GAMG other than
 the RAP stuff that you have discussed already.

 On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette
 >>> > wrote:

 The code I am using here is the example 42 of PETSc
 
 (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
 Indeed it solves the Stokes equation. I thought it was
 a good idea to use an example you might know (and
 didn't find any that uses GAMG functions). I just
 changed the PCMG setup so that the memory problem
 appears. And it appears when adding PCGAMG.

 I don't care about the performance or even the result
 rightness here, but only about the difference in memory
 use between 3.6 and 3.10. Do you think finding a more
 adapted script would help?

 I used the threshold of 0.1 only once, at the
 beginning, to test its influence. I used the default
 threshold (of 0, I guess) for all the other runs.

 Myriam


 Le 03/11/19 à 13:52, Mark Adams a écrit :
> In looking at this larger scale 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-25 Thread Myriam Peyrounette via petsc-users
Hi,

thanks for the explanations. I tried the last PETSc version (commit
fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you
talked about. But the memory scaling shows no improvement (see scaling
attached), even when using the "scalable" options :(

I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and
MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and
after the first "bad" commit), but I can't find what induced this memory
issue.

Myriam
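
For reference, the options discussed in this thread for the scalable PtAP path can be combined in a single run (values exactly as they appear in the thread; -mat_freeintermediatedatastructures is only appropriate if the solver still converges with it, as reported elsewhere in the thread):

-matptap_via scalable
-inner_diag_matmatmult_via scalable
-inner_offdiag_matmatmult_via scalable
-mat_freeintermediatedatastructures 1
-options_left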




Le 03/20/19 à 17:38, Fande Kong a écrit :
> Hi Myriam,
>
> There are three algorithms in PETSc to do PtAP ( const char         
> *algTypes[3] = {"scalable","nonscalable","hypre"};), and can be
> specified using the petsc options: -matptap_via .
>
> (1) -matptap_via hypre: This calls the hypre package to do the PtAP
> through an all-at-once triple product. In our experiences, it is the
> most memory efficient, but could be slow.
>
> (2)  -matptap_via scalable: This involves a row-wise algorithm plus an
> outer product.  This will use more memory than hypre, but way faster.
> This used to have a bug that could take all your memory, and I have a
> fix
> at 
> https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
>  
> When using this option, we may want to have extra options such as 
>  -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via
> scalable  to select inner scalable algorithms.
>
> (3)  -matptap_via nonscalable:  Supposed to be even faster, but uses
> more memory. It does dense matrix operations.
>
>
> Thanks,
>
> Fande Kong
>
>
>
>
> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users
> mailto:petsc-users@mcs.anl.gov>> wrote:
>
> More precisely: something happens when upgrading the functions
> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>
> Unfortunately, there are a lot of differences between the old and
> new versions of these functions. I keep investigating but if you
> have any idea, please let me know.
>
> Best,
>
> Myriam
>
>
> Le 03/20/19 à 13:48, Myriam Peyrounette a écrit :
>>
>> Hi all,
>>
>> I used git bisect to determine when the memory need increased. I
>> found that the first "bad" commit is  
>> aa690a28a7284adb519c28cb44eae20a2c131c85.
>>
>> Barry was right, this commit seems to be about an evolution of
>> MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option
>> "-matptap_via scalable" but I can't find any information about
>> it. Can you tell me more?
>>
>> Thanks
>>
>> Myriam
>>
>>
>> Le 03/11/19 à 14:40, Mark Adams a écrit :
>>> Is there a difference in memory usage on your tiny problem? I
>>> assume no.
>>>
>>> I don't see anything that could come from GAMG other than the
>>> RAP stuff that you have discussed already.
>>>
>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette
>>> >> > wrote:
>>>
>>> The code I am using here is the example 42 of PETSc
>>> 
>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
>>> Indeed it solves the Stokes equation. I thought it was a
>>> good idea to use an example you might know (and didn't find
>>> any that uses GAMG functions). I just changed the PCMG setup
>>> so that the memory problem appears. And it appears when
>>> adding PCGAMG.
>>>
>>> I don't care about the performance or even the result
>>> rightness here, but only about the difference in memory use
>>> between 3.6 and 3.10. Do you think finding a more adapted
>>> script would help?
>>>
>>> I used the threshold of 0.1 only once, at the beginning, to
>>> test its influence. I used the default threshold (of 0, I
>>> guess) for all the other runs.
>>>
>>> Myriam
>>>
>>>
>>> Le 03/11/19 à 13:52, Mark Adams a écrit :
 In looking at this larger scale run ...

 * Your eigen estimates are much lower than your tiny test
 problem.  But this is Stokes apparently and it should not
 work anyway. Maybe you have a small time step that adds a
 lot of mass that brings the eigen estimates down. And your
 min eigenvalue (not used) is positive. I would expect
 negative for Stokes ...

 * You seem to be setting a threshold value of 0.1 -- that
 is very high

 * v3.6 says "using nonzero initial guess" but this is not
 in v3.10. Maybe we just stopped printing that.

 * There were some changes to coarsening parameters in going
 from v3.6 but it does not look like your problem was
 affected. (The coarsening algo is non-deterministic by
 default and you can see small difference on different runs)

 * We may have 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-20 Thread Myriam Peyrounette via petsc-users
More precisely: something happens when upgrading the functions
MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ.

Unfortunately, there are a lot of differences between the old and new
versions of these functions. I keep investigating but if you have any
idea, please let me know.

Best,

Myriam


Le 03/20/19 à 13:48, Myriam Peyrounette a écrit :
>
> Hi all,
>
> I used git bisect to determine when the memory need increased. I found
> that the first "bad" commit is   aa690a28a7284adb519c28cb44eae20a2c131c85.
>
> Barry was right, this commit seems to be about an evolution of
> MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option "-matptap_via
> scalable" but I can't find any information about it. Can you tell me more?
>
> Thanks
>
> Myriam
>
>
> Le 03/11/19 à 14:40, Mark Adams a écrit :
>> Is there a difference in memory usage on your tiny problem? I assume no.
>>
>> I don't see anything that could come from GAMG other than the RAP
>> stuff that you have discussed already.
>>
>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette
>> mailto:myriam.peyroune...@idris.fr>> wrote:
>>
>> The code I am using here is the example 42 of PETSc
>> 
>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
>> Indeed it solves the Stokes equation. I thought it was a good
>> idea to use an example you might know (and didn't find any that
>> uses GAMG functions). I just changed the PCMG setup so that the
>> memory problem appears. And it appears when adding PCGAMG.
>>
>> I don't care about the performance or even the result rightness
>> here, but only about the difference in memory use between 3.6 and
>> 3.10. Do you think finding a more adapted script would help?
>>
>> I used the threshold of 0.1 only once, at the beginning, to test
>> its influence. I used the default threshold (of 0, I guess) for
>> all the other runs.
>>
>> Myriam
>>
>>
>> Le 03/11/19 à 13:52, Mark Adams a écrit :
>>> In looking at this larger scale run ...
>>>
>>> * Your eigen estimates are much lower than your tiny test
>>> problem.  But this is Stokes apparently and it should not work
>>> anyway. Maybe you have a small time step that adds a lot of mass
>>> that brings the eigen estimates down. And your min eigenvalue
>>> (not used) is positive. I would expect negative for Stokes ...
>>>
>>> * You seem to be setting a threshold value of 0.1 -- that is
>>> very high
>>>
>>> * v3.6 says "using nonzero initial guess" but this is not in
>>> v3.10. Maybe we just stopped printing that.
>>>
>>> * There were some changes to coarsening parameters in going from
>>> v3.6 but it does not look like your problem was affected. (The
>>> coarsening algo is non-deterministic by default and you can see
>>> small difference on different runs)
>>>
>>> * We may have also added a "noisy" RHS for eigen estimates by
>>> default from v3.6.
>>>
>>> * And for non-symmetric problems you can try
>>> -pc_gamg_agg_nsmooths 0, but again GAMG is not built for Stokes
>>> anyway.
>>>
>>>
>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette
>>> >> > wrote:
>>>
>>> I used PCView to display the size of the linear system in
>>> each level of the MG. You'll find the outputs attached to
>>> this mail (zip file) for both the default threshold value
>>> and a value of 0.1, and for both 3.6 and 3.10 PETSc versions.
>>>
>>> For convenience, I summarized the information in a graph,
>>> also attached (png file).
>>>
>>> As you can see, there are slight differences between the two
>>> versions but none is critical, in my opinion. Do you see
>>> anything suspicious in the outputs?
>>>
>>> + I can't find the default threshold value. Do you know
>>> where I can find it?
>>>
>>> Thanks for the follow-up
>>>
>>> Myriam
>>>
>>>
>>> Le 03/05/19 à 14:06, Matthew Knepley a écrit :
 On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette
 >>> > wrote:

 Hi Matt,

 I plotted the memory scalings using different threshold
 values. The two scalings are slightly translated (from
 -22 to -88 mB) but this gain is neglectable. The
 3.6-scaling keeps being robust while the 3.10-scaling
 deteriorates.

 Do you have any other suggestion?

 Mark, what is the option she can give to output all the
 GAMG data?

 Also, run using -ksp_view. GAMG will report all the sizes
 of its grids, so it should be easy to see
 if the coarse grid sizes are increasing, and also what the
 effect of the threshold value is.

   

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-20 Thread Myriam Peyrounette via petsc-users
Hi all,

I used git bisect to determine when the memory need increased. I found
that the first "bad" commit is   aa690a28a7284adb519c28cb44eae20a2c131c85.

Barry was right, this commit seems to be about an evolution of
MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option "-matptap_via
scalable" but I can't find any information about it. Can you tell me more?

Thanks

Myriam


Le 03/11/19 à 14:40, Mark Adams a écrit :
> Is there a difference in memory usage on your tiny problem? I assume no.
>
> I don't see anything that could come from GAMG other than the RAP
> stuff that you have discussed already.
>
> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette
> mailto:myriam.peyroune...@idris.fr>> wrote:
>
> The code I am using here is the example 42 of PETSc
> 
> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
> Indeed it solves the Stokes equation. I thought it was a good idea
> to use an example you might know (and didn't find any that uses
> GAMG functions). I just changed the PCMG setup so that the memory
> problem appears. And it appears when adding PCGAMG.
>
> I don't care about the performance or even the result rightness
> here, but only about the difference in memory use between 3.6 and
> 3.10. Do you think finding a more adapted script would help?
>
> I used the threshold of 0.1 only once, at the beginning, to test
> its influence. I used the default threshold (of 0, I guess) for
> all the other runs.
>
> Myriam
>
>
> Le 03/11/19 à 13:52, Mark Adams a écrit :
>> In looking at this larger scale run ...
>>
>> * Your eigen estimates are much lower than your tiny test
>> problem.  But this is Stokes apparently and it should not work
>> anyway. Maybe you have a small time step that adds a lot of mass
>> that brings the eigen estimates down. And your min eigenvalue
>> (not used) is positive. I would expect negative for Stokes ...
>>
>> * You seem to be setting a threshold value of 0.1 -- that is very
>> high
>>
>> * v3.6 says "using nonzero initial guess" but this is not in
>> v3.10. Maybe we just stopped printing that.
>>
>> * There were some changes to coarsening parameters in going from
>> v3.6 but it does not look like your problem was affected. (The
>> coarsening algo is non-deterministic by default and you can see
>> small difference on different runs)
>>
>> * We may have also added a "noisy" RHS for eigen estimates by
>> default from v3.6.
>>
>> * And for non-symmetric problems you can try -pc_gamg_agg_nsmooths
>> 0, but again GAMG is not built for Stokes anyway.
>>
>>
>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette
>> > > wrote:
>>
>> I used PCView to display the size of the linear system in
>> each level of the MG. You'll find the outputs attached to
>> this mail (zip file) for both the default threshold value and
>> a value of 0.1, and for both 3.6 and 3.10 PETSc versions.
>>
>> For convenience, I summarized the information in a graph,
>> also attached (png file).
>>
>> As you can see, there are slight differences between the two
>> versions but none is critical, in my opinion. Do you see
>> anything suspicious in the outputs?
>>
>> + I can't find the default threshold value. Do you know where
>> I can find it?
>>
>> Thanks for the follow-up
>>
>> Myriam
>>
>>
>> Le 03/05/19 à 14:06, Matthew Knepley a écrit :
>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette
>>> >> > wrote:
>>>
>>> Hi Matt,
>>>
>>> I plotted the memory scalings using different threshold
>>> values. The two scalings are slightly shifted (from
>>> -22 to -88 MB) but this gain is negligible. The
>>> 3.6-scaling keeps being robust while the 3.10-scaling
>>> deteriorates.
>>>
>>> Do you have any other suggestion?
>>>
>>> Mark, what is the option she can give to output all the GAMG
>>> data?
>>>
>>> Also, run using -ksp_view. GAMG will report all the sizes of
>>> its grids, so it should be easy to see
>>> if the coarse grid sizes are increasing, and also what the
>>> effect of the threshold value is.
>>>
>>>   Thanks,
>>>
>>>      Matt 
>>>
>>> Thanks
>>>
>>> Myriam
>>>
>>> Le 03/02/19 à 02:27, Matthew Knepley a écrit :
 On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via
 petsc-users >>> > wrote:

 Hi,

 I used to run my code with PETSc 3.6. Since I
 upgraded the PETSc version
 to 3.10, this code 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-11 Thread Myriam Peyrounette via petsc-users
There is a small difference in memory usage already (of 135mB). It is
not a big deal but it will be for larger problems (as shown by the
memory scaling). If we find the origin of this small gap for a small
case, we probably find the reason why the memory scaling is so bad with
3.10.

I am currently looking for the exact commit where the problem arises,
using git bisect. I'll let you know about the result.
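
For reference, a sketch of such a bisection between the two releases (the tag names and the test command are placeholders): "git bisect start", then "git bisect bad v3.10.2" and "git bisect good v3.6.4"; at each step git checks out an intermediate commit, PETSc is rebuilt, the small test case is rerun while recording its memory use, and the answer is given with "git bisect good" or "git bisect bad" until git prints the first bad commit; "git bisect reset" restores the original checkout afterwards.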


Le 03/11/19 à 14:40, Mark Adams a écrit :
> Is there a difference in memory usage on your tiny problem? I assume no.
>
> I don't see anything that could come from GAMG other than the RAP
> stuff that you have discussed already.
>
> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette
> mailto:myriam.peyroune...@idris.fr>> wrote:
>
> The code I am using here is the example 42 of PETSc
> 
> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
> Indeed it solves the Stokes equation. I thought it was a good idea
> to use an example you might know (and didn't find any that uses
> GAMG functions). I just changed the PCMG setup so that the memory
> problem appears. And it appears when adding PCGAMG.
>
> I don't care about the performance or even the result rightness
> here, but only about the difference in memory use between 3.6 and
> 3.10. Do you think finding a more adapted script would help?
>
> I used the threshold of 0.1 only once, at the beginning, to test
> its influence. I used the default threshold (of 0, I guess) for
> all the other runs.
>
> Myriam
>
>
> Le 03/11/19 à 13:52, Mark Adams a écrit :
>> In looking at this larger scale run ...
>>
>> * Your eigen estimates are much lower than your tiny test
>> problem.  But this is Stokes apparently and it should not work
>> anyway. Maybe you have a small time step that adds a lot of mass
>> that brings the eigen estimates down. And your min eigenvalue
>> (not used) is positive. I would expect negative for Stokes ...
>>
>> * You seem to be setting a threshold value of 0.1 -- that is very
>> high
>>
>> * v3.6 says "using nonzero initial guess" but this is not in
>> v3.10. Maybe we just stopped printing that.
>>
>> * There were some changes to coarsening parameters in going from
>> v3.6 but it does not look like your problem was affected. (The
>> coarsening algo is non-deterministic by default and you can see
>> small difference on different runs)
>>
>> * We may have also added a "noisy" RHS for eigen estimates by
>> default from v3.6.
>>
>> * And for non-symmetric problems you can try -pc_gamg_agg_nsmooths
>> 0, but again GAMG is not built for Stokes anyway.
>>
>>
>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette
>> > > wrote:
>>
>> I used PCView to display the size of the linear system in
>> each level of the MG. You'll find the outputs attached to
>> this mail (zip file) for both the default threshold value and
>> a value of 0.1, and for both 3.6 and 3.10 PETSc versions.
>>
>> For convenience, I summarized the information in a graph,
>> also attached (png file).
>>
>> As you can see, there are slight differences between the two
>> versions but none is critical, in my opinion. Do you see
>> anything suspicious in the outputs?
>>
>> + I can't find the default threshold value. Do you know where
>> I can find it?
>>
>> Thanks for the follow-up
>>
>> Myriam
>>
>>
>> Le 03/05/19 à 14:06, Matthew Knepley a écrit :
>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette
>>> >> > wrote:
>>>
>>> Hi Matt,
>>>
>>> I plotted the memory scalings using different threshold
>>> values. The two scalings are slightly shifted (from
>>> -22 to -88 MB) but this gain is negligible. The
>>> 3.6-scaling keeps being robust while the 3.10-scaling
>>> deteriorates.
>>>
>>> Do you have any other suggestion?
>>>
>>> Mark, what is the option she can give to output all the GAMG
>>> data?
>>>
>>> Also, run using -ksp_view. GAMG will report all the sizes of
>>> its grids, so it should be easy to see
>>> if the coarse grid sizes are increasing, and also what the
>>> effect of the threshold value is.
>>>
>>>   Thanks,
>>>
>>>      Matt 
>>>
>>> Thanks
>>>
>>> Myriam
>>>
>>> Le 03/02/19 à 02:27, Matthew Knepley a écrit :
 On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via
 petsc-users >>> > wrote:

 Hi,

 I used to run my code with PETSc 3.6. Since I
 upgraded the PETSc version

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-11 Thread Mark Adams via petsc-users
Is there a difference in memory usage on your tiny problem? I assume no.

I don't see anything that could come from GAMG other than the RAP stuff
that you have discussed already.

On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette <
myriam.peyroune...@idris.fr> wrote:

> The code I am using here is the example 42 of PETSc (
> https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
> Indeed it solves the Stokes equation. I thought it was a good idea to use
> an example you might know (and didn't find any that uses GAMG functions). I
> just changed the PCMG setup so that the memory problem appears. And it
> appears when adding PCGAMG.
>
> I don't care about the performance or even the result rightness here, but
> only about the difference in memory use between 3.6 and 3.10. Do you think
> finding a more adapted script would help?
>
> I used the threshold of 0.1 only once, at the beginning, to test its
> influence. I used the default threshold (of 0, I guess) for all the other
> runs.
>
> Myriam
>
> Le 03/11/19 à 13:52, Mark Adams a écrit :
>
> In looking at this larger scale run ...
>
> * Your eigen estimates are much lower than your tiny test problem.  But
> this is Stokes apparently and it should not work anyway. Maybe you have a
> small time step that adds a lot of mass that brings the eigen estimates
> down. And your min eigenvalue (not used) is positive. I would expect
> negative for Stokes ...
>
> * You seem to be setting a threshold value of 0.1 -- that is very high
>
> * v3.6 says "using nonzero initial guess" but this is not in v3.10. Maybe
> we just stopped printing that.
>
> * There were some changes to coarsening parameters in going from v3.6 but
> it does not look like your problem was affected. (The coarsening algo is
> non-deterministic by default and you can see small difference on different
> runs)
>
> * We may have also added a "noisy" RHS for eigen estimates by default from
> v3.6.
>
> * And for non-symmetric problems you can try -pc_gamg_agg_nsmooths 0, but
> again GAMG is not built for Stokes anyway.
>
>
> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette <
> myriam.peyroune...@idris.fr> wrote:
>
>> I used PCView to display the size of the linear system in each level of
>> the MG. You'll find the outputs attached to this mail (zip file) for both
>> the default threshold value and a value of 0.1, and for both 3.6 and 3.10
>> PETSc versions.
>>
>> For convenience, I summarized the information in a graph, also attached
>> (png file).
>>
>> As you can see, there are slight differences between the two versions but
>> none is critical, in my opinion. Do you see anything suspicious in the
>> outputs?
>>
>> + I can't find the default threshold value. Do you know where I can find
>> it?
>>
>> Thanks for the follow-up
>>
>> Myriam
>>
>> Le 03/05/19 à 14:06, Matthew Knepley a écrit :
>>
>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette <
>> myriam.peyroune...@idris.fr> wrote:
>>
>>> Hi Matt,
>>>
>>> I plotted the memory scalings using different threshold values. The two
>>> scalings are slightly shifted (from -22 to -88 MB) but this gain is
>>> negligible. The 3.6-scaling keeps being robust while the 3.10-scaling
>>> deteriorates.
>>>
>>> Do you have any other suggestion?
>>>
>> Mark, what is the option she can give to output all the GAMG data?
>>
>> Also, run using -ksp_view. GAMG will report all the sizes of its grids,
>> so it should be easy to see
>> if the coarse grid sizes are increasing, and also what the effect of the
>> threshold value is.
>>
>>   Thanks,
>>
>>  Matt
>>
>>> Thanks
>>> Myriam
>>>
>>> Le 03/02/19 à 02:27, Matthew Knepley a écrit :
>>>
>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users <
>>> petsc-users@mcs.anl.gov> wrote:
>>>
 Hi,

 I used to run my code with PETSc 3.6. Since I upgraded the PETSc version
 to 3.10, this code has a bad memory scaling.

 To report this issue, I took the PETSc script ex42.c and slightly
 modified it so that the KSP and PC configurations are the same as in my
 code. In particular, I use a "personalized" multi-grid method. The
 modifications are indicated by the keyword "TopBridge" in the attached
 scripts.

 To plot the memory (weak) scaling, I ran four calculations for each
 script with increasing problem sizes and computation cores:

 1. 100,000 elts on 4 cores
 2. 1 million elts on 40 cores
 3. 10 million elts on 400 cores
 4. 100 million elts on 4,000 cores

 The resulting graph is also attached. The scaling using PETSc 3.10
 clearly deteriorates for large cases, while the one using PETSc 3.6 is
 robust.

 After a few tests, I found that the scaling is mostly sensitive to the
 use of the AMG method for the coarse grid (line 1780 in
 main_ex42_petsc36.cc). In particular, the performance strongly
 deteriorates when commenting lines 1777 to 1790 (in
 main_ex42_petsc36.cc).

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-11 Thread Myriam Peyrounette via petsc-users
The code I am using here is the example 42 of PETSc
(https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
Indeed it solves the Stokes equation. I thought it was a good idea to
use an example you might know (and didn't find any that uses GAMG
functions). I just changed the PCMG setup so that the memory problem
appears. And it appears when adding PCGAMG.

I don't care about the performance or even the result rightness here,
but only about the difference in memory use between 3.6 and 3.10. Do you
think finding a more adapted script would help?

I used the threshold of 0.1 only once, at the beginning, to test its
influence. I used the default threshold (of 0, I guess) for all the
other runs.

Myriam


Le 03/11/19 à 13:52, Mark Adams a écrit :
> In looking at this larger scale run ...
>
> * Your eigen estimates are much lower than your tiny test problem. 
> But this is Stokes apparently and it should not work anyway. Maybe you
> have a small time step that adds a lot of mass that brings the eigen
> estimates down. And your min eigenvalue (not used) is positive. I
> would expect negative for Stokes ...
>
> * You seem to be setting a threshold value of 0.1 -- that is very high
>
> * v3.6 says "using nonzero initial guess" but this is not in v3.10.
> Maybe we just stopped printing that.
>
> * There were some changes to coarsening parameters in going from v3.6
> but it does not look like your problem was affected. (The coarsening
> algo is non-deterministic by default and you can see small difference
> on different runs)
>
> * We may have also added a "noisy" RHS for eigen estimates by default
> from v3.6.
>
> * And for non-symmetric problems you can try -pc_gamg_agg_nsmooths 0,
> but again GAMG is not built for Stokes anyway.
>
>
> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette
> mailto:myriam.peyroune...@idris.fr>> wrote:
>
> I used PCView to display the size of the linear system in each
> level of the MG. You'll find the outputs attached to this mail
> (zip file) for both the default threshold value and a value of
> 0.1, and for both 3.6 and 3.10 PETSc versions.
>
> For convenience, I summarized the information in a graph, also
> attached (png file).
>
> As you can see, there are slight differences between the two
> versions but none is critical, in my opinion. Do you see anything
> suspicious in the outputs?
>
> + I can't find the default threshold value. Do you know where I
> can find it?
>
> Thanks for the follow-up
>
> Myriam
>
>
> Le 03/05/19 à 14:06, Matthew Knepley a écrit :
>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette
>> > > wrote:
>>
>> Hi Matt,
>>
>> I plotted the memory scalings using different threshold
>> values. The two scalings are slightly shifted (from -22 to
>> -88 MB) but this gain is negligible. The 3.6-scaling keeps
>> being robust while the 3.10-scaling deteriorates.
>>
>> Do you have any other suggestion?
>>
>> Mark, what is the option she can give to output all the GAMG data?
>>
>> Also, run using -ksp_view. GAMG will report all the sizes of its
>> grids, so it should be easy to see
>> if the coarse grid sizes are increasing, and also what the effect
>> of the threshold value is.
>>
>>   Thanks,
>>
>>      Matt 
>>
>> Thanks
>>
>> Myriam
>>
>> Le 03/02/19 à 02:27, Matthew Knepley a écrit :
>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via
>>> petsc-users >> > wrote:
>>>
>>> Hi,
>>>
>>> I used to run my code with PETSc 3.6. Since I upgraded
>>> the PETSc version
>>> to 3.10, this code has a bad memory scaling.
>>>
>>> To report this issue, I took the PETSc script ex42.c and
>>> slightly
>>> modified it so that the KSP and PC configurations are
>>> the same as in my
>>> code. In particular, I use a "personalized" multi-grid
>>> method. The
>>> modifications are indicated by the keyword "TopBridge"
>>> in the attached
>>> scripts.
>>>
>>> To plot the memory (weak) scaling, I ran four
>>> calculations for each
>>> script with increasing problem sizes and computation cores:
>>>
>>> 1. 100,000 elts on 4 cores
>>> 2. 1 million elts on 40 cores
>>> 3. 10 million elts on 400 cores
>>> 4. 100 million elts on 4,000 cores
>>>
>>> The resulting graph is also attached. The scaling using
>>> PETSc 3.10
>>> clearly deteriorates for large cases, while the one
>>> using PETSc 3.6 is
>>> robust.
>>>
>>> After a few tests, I found that the scaling is mostly
>>> sensitive to 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-11 Thread Mark Adams via petsc-users
In looking at this larger scale run ...

* Your eigen estimates are much lower than your tiny test problem.  But
this is Stokes apparently and it should not work anyway. Maybe you have a
small time step that adds a lot of mass that brings the eigen estimates
down. And your min eigenvalue (not used) is positive. I would expect
negative for Stokes ...

* You seem to be setting a threshold value of 0.1 -- that is very high

* v3.6 says "using nonzero initial guess" but this is not in v3.10. Maybe
we just stopped printing that.

* There were some changes to coarsening parameters in going from v3.6 but it
does not look like your problem was affected. (The coarsening algo is
non-deterministic by default and you can see small difference on different
runs)

* We may have also added a "noisy" RHS for eigen estimates by default from
v3.6.

* And for non-symmetric problems you can try -pc_gamg_agg_nsmooths 0, but
again GAMG is not built for Stokes anyway.


On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette <
myriam.peyroune...@idris.fr> wrote:

> I used PCView to display the size of the linear system in each level of
> the MG. You'll find the outputs attached to this mail (zip file) for both
> the default threshold value and a value of 0.1, and for both 3.6 and 3.10
> PETSc versions.
>
> For convenience, I summarized the information in a graph, also attached
> (png file).
>
> As you can see, there are slight differences between the two versions but
> none is critical, in my opinion. Do you see anything suspicious in the
> outputs?
>
> + I can't find the default threshold value. Do you know where I can find
> it?
>
> Thanks for the follow-up
>
> Myriam
>
> Le 03/05/19 à 14:06, Matthew Knepley a écrit :
>
> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette <
> myriam.peyroune...@idris.fr> wrote:
>
>> Hi Matt,
>>
>> I plotted the memory scalings using different threshold values. The two
>> scalings are slightly shifted (from -22 to -88 MB) but this gain is
>> negligible. The 3.6-scaling keeps being robust while the 3.10-scaling
>> deteriorates.
>>
>> Do you have any other suggestion?
>>
> Mark, what is the option she can give to output all the GAMG data?
>
> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so
> it should be easy to see
> if the coarse grid sizes are increasing, and also what the effect of the
> threshold value is.
>
>   Thanks,
>
>  Matt
>
>> Thanks
>> Myriam
>>
>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>
>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Hi,
>>>
>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version
>>> to 3.10, this code has bad memory scaling.
>>>
>>> To report this issue, I took the PETSc script ex42.c and slightly
>>> modified it so that the KSP and PC configurations are the same as in my
>>> code. In particular, I use a "personalised" multi-grid method. The
>>> modifications are indicated by the keyword "TopBridge" in the attached
>>> scripts.
>>>
>>> To plot the memory (weak) scaling, I ran four calculations for each
>>> script with increasing problem sizes and computation cores:
>>>
>>> 1. 100,000 elts on 4 cores
>>> 2. 1 million elts on 40 cores
>>> 3. 10 million elts on 400 cores
>>> 4. 100 million elts on 4,000 cores
>>>
>>> The resulting graph is also attached. The scaling using PETSc 3.10
>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is
>>> robust.
>>>
>>> After a few tests, I found that the scaling is mostly sensitive to the
>>> use of the AMG method for the coarse grid (line 1780 in
>>> main_ex42_petsc36.cc). In particular, the performance strongly
>>> deteriorates when commenting lines 1777 to 1790 (in
>>> main_ex42_petsc36.cc).
>>>
>>> Do you have any idea of what changed between version 3.6 and version
>>> 3.10 that may imply such degradation?
>>>
>>
>> I believe the default values for PCGAMG changed between versions. It
>> sounds like the coarsening rate
>> is not great enough, so that these grids are too large. This can be set
>> using:
>>
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>
>> There is some explanation of this effect on that page. Let us know if
>> setting this does not correct the situation.
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Let me know if you need further information.
>>>
>>> Best,
>>>
>>> Myriam Peyrounette
>>>
>>>
>>> --
>>> Myriam Peyrounette
>>> CNRS/IDRIS - HLST
>>> --
>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-11 Thread Mark Adams via petsc-users
GAMG looks fine here but the convergence rate looks terrible, like 4k+
iterations. You have 4 degrees of freedom per vertex. What equations and
discretization are you using?

Your eigen estimates are a little high, but not crazy. I assume this system
is not symmetric.

AMG is oriented toward the Laplacian (and elasticity). It looks like you
are solving Stokes. AMG will not work on the whole system out of the box.
You can use it for a 3-DOF velocity solve inside a FieldSplit solver; a
sketch of the corresponding options is below.
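
As an illustration only (not tested on this case), a rough sketch of those
options, assuming the velocity/pressure split can be detected from the zero
pressure diagonal (otherwise the splits must be given explicitly, e.g. via
PCFieldSplitSetIS):

   -pc_type fieldsplit
   -pc_fieldsplit_type schur
   -pc_fieldsplit_detect_saddle_point
   # GAMG on the velocity block only
   -fieldsplit_0_ksp_type preonly
   -fieldsplit_0_pc_type gamg
   # Schur complement (pressure) block
   -fieldsplit_1_ksp_type gmres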




On Mon, Mar 11, 2019 at 6:32 AM Myriam Peyrounette <
myriam.peyroune...@idris.fr> wrote:

> Hi,
>
> good point, I changed the 3.10 version so that it is configured with
> --with-debugging=0. You'll find attached the new -log_view output. The
> execution time is reduced (although still not as good as with 3.6), but I can't
> see any improvement with regard to memory.
>
> You'll also find attached the 'grep GAMG' extracts from the -info outputs for
> both versions. There are slight differences in grid dimensions or nnz values,
> but are they significant?
>
> Thanks,
>
> Myriam
>
>
>
> On 03/08/19 at 23:23, Mark Adams wrote:
>
> Just seeing this now. It is hard to imagine how bad GAMG could be on a
> coarse grid, but you can run with -info and grep on GAMG and send that. You
> will see a listing of levels, number of equations, and number of non-zeros
> (nnz). You can send that and I can get some sense of whether GAMG is going nuts.
>
> Mark
>
> On Fri, Mar 8, 2019 at 11:56 AM Jed Brown via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
>> It may not address the memory issue, but can you build 3.10 with the
>> same options you used for 3.6?  It is currently a debugging build:
>>
>>   ##
>>   ##
>>   #   WARNING!!!   #
>>   ##
>>   #   This code was compiled with a debugging option.  #
>>   #   To get timing results run ./configure#
>>   #   using --with-debugging=no, the performance will  #
>>   #   be generally two or three times faster.  #
>>   ##
>>   ##
>>
>>
> --
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-11 Thread Myriam Peyrounette via petsc-users
Hi,

good point, I changed the 3.10 version so that it is configured with
--with-debugging=0. You'll find attached the new -log_view output.
The execution time is reduced (although still not as good as with 3.6), but I
can't see any improvement with regard to memory.

You'll also find attached the 'grep GAMG' extracts from the -info outputs for
both versions. There are slight differences in grid dimensions or nnz values,
but are they significant?

Thanks,

Myriam



On 03/08/19 at 23:23, Mark Adams wrote:
> Just seeing this now. It is hard to imagine how bad GAMG could be on a
> coarse grid, but you can run with -info and grep on GAMG and send
> that. You will see a listing of levels, number of equations, and number
> of non-zeros (nnz). You can send that and I can get some sense of whether
> GAMG is going nuts.
>
> Mark
>
> On Fri, Mar 8, 2019 at 11:56 AM Jed Brown via petsc-users
> mailto:petsc-users@mcs.anl.gov>> wrote:
>
> It may not address the memory issue, but can you build 3.10 with the
> same options you used for 3.6?  It is currently a debugging build:
>
>               ##########################################################
>               #                                                        #
>               #                       WARNING!!!                       #
>               #                                                        #
>               #   This code was compiled with a debugging option.      #
>               #   To get timing results run ./configure                #
>               #   using --with-debugging=no, the performance will      #
>               #   be generally two or three times faster.              #
>               #                                                        #
>               ##########################################################
>

-- 
Myriam Peyrounette
CNRS/IDRIS - HLST
--


*** WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document***


-- PETSc Performance Summary: --

./ex42 on a  named ada163 with 1 processor, by ssos938 Mon Mar 11 10:03:37 2019
Using Petsc Release Version 3.10.2, Oct, 09, 2018 

 Max   Max/Min Avg   Total 
Time (sec):   1.041e+02 1.000   1.041e+02
Objects:  3.450e+02 1.000   3.450e+02
Flop: 1.266e+11 1.000   1.266e+11  1.266e+11
Flop/sec: 1.216e+09 1.000   1.216e+09  1.216e+09
MPI Messages: 0.000e+00 0.000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00 0.000   0.000e+00  0.000e+00
MPI Reductions:   0.000e+00 0.000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   - Time --  - Flop --  --- Messages ---  -- Message Lengths --  -- Reductions --
Avg %Total Avg %TotalCount   %Total Avg %TotalCount   %Total 
 0:  Main Stage: 1.0407e+02 100.0%  1.2656e+11 100.0%  0.000e+00   0.0%  0.000e+00   0.0%  0.000e+00   0.0% 


See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
  Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   AvgLen: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
  %T - percent time in this phase %F - percent flop in this phase
  %M - percent messages in this phase %L - percent message lengths in this phase
  %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)

Event                Count      Time (sec)     Flop                              --- Global ---  --- Stage ----  Total
   Max Ratio  Max Ratio   Max  Ratio  Mess   AvgLen  Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-08 Thread Mark Adams via petsc-users
Just seeing this now. It is hard to imagine how bad GAMG could be on a
coarse grid, but you can run with -info and grep on GAMG and send that. You
will see a listing of levels, number of equations, and number of non-zeros
(nnz). You can send that and I can get some sense of whether GAMG is going nuts.
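
For concreteness, a sketch of that command (the executable name, process
count, and solver options are placeholders taken from the earlier runs):

   mpiexec -n 4 ./ex42 <usual solver options> -info 2>&1 | grep GAMG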

Mark

On Fri, Mar 8, 2019 at 11:56 AM Jed Brown via petsc-users <
petsc-users@mcs.anl.gov> wrote:

> It may not address the memory issue, but can you build 3.10 with the
> same options you used for 3.6?  It is currently a debugging build:
>
>   ##
>   ##
>   #   WARNING!!!   #
>   ##
>   #   This code was compiled with a debugging option.  #
>   #   To get timing results run ./configure#
>   #   using --with-debugging=no, the performance will  #
>   #   be generally two or three times faster.  #
>   ##
>   ##
>
>


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-08 Thread Jed Brown via petsc-users
It may not address the memory issue, but can you build 3.10 with the
same options you used for 3.6?  It is currently a debugging build:

  ##
  ##
  #   WARNING!!!   #
  ##
  #   This code was compiled with a debugging option.  #
  #   To get timing results run ./configure#
  #   using --with-debugging=no, the performance will  #
  #   be generally two or three times faster.  #
  ##
  ##



Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-05 Thread Jed Brown via petsc-users
Myriam, in your first message, there was a significant (about 50%)
increase in memory consumption already on 4 cores.  Before attacking
scaling, it may be useful to trace memory usage for that base case.
Even better if you can reduce to one process.  Anyway, I would start by
running both cases with -log_view and looking at the memory summary.  I
would then use Massif (the memory profiler/tracer component in Valgrind)
to obtain stack traces for the large allocations.  Comparing those
traces should help narrow down which part of the code has significantly
different memory allocation behavior.  It might also point to the source of
the unacceptable memory consumption under weak scaling; either way, it's
something we should try to fix.

If I had to guess, it may be in intermediate data structures for the
different PtAP algorithms in GAMG.  The option "-matptap_via scalable"
may be helpful.
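
For concreteness, a sketch of that workflow on the one-process base case (the
executable name and options are placeholders):

   # PETSc's own memory summary
   ./ex42 <usual options> -log_view
   # heap profile with stack traces for the largest allocations
   valgrind --tool=massif ./ex42 <usual options>
   ms_print massif.out.<pid>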

"Smith, Barry F. via petsc-users"  writes:

>Myriam,
>
> Sorry we have not been able to resolve this problem with memory scaling 
> yet.
>
> The best tool to determine the change in a code that results in large 
> differences in a program's run is git bisect. Basically you tell git bisect 
> the git commit of the code that is "good" and the git commit of the code that 
> is "bad" and it gives you additional git commits for you to check your code 
> on  each time telling git if it is "good" or "bad", eventually git bisect 
> tells you exactly the git commit that "broke" the code. No guess work, no 
> endless speculation. 
>
> The drawback is that you have to ./configure && make PETSc for each 
> "test" commit and then compile and run your code for that commit. I can 
> understand that if you have to run your code on 10,000 processes to check whether 
> it is "good" or "bad", that can be very daunting. But all I can suggest is to find a 
> problem size that is manageable and do the git bisect process (yeah it may 
> take several hours but that beats days of head banging).
>
>Good luck,
>
>Barry
>
>
>> On Mar 5, 2019, at 12:42 PM, Matthew Knepley via petsc-users 
>>  wrote:
>> 
>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette 
>>  wrote:
>> I used PCView to display the size of the linear system in each level of the 
>> MG. You'll find the outputs attached to this mail (zip file) for both the 
>> default threshold value and a value of 0.1, and for both 3.6 and 3.10 PETSc 
>> versions. 
>> 
>> For convenience, I summarized the information in a graph, also attached (png 
>> file).
>> 
>> 
>> Great! Can you draw lines for the different runs you did? My interpretation 
>> was that memory was increasing
>> as you did larger runs, and that you thought that was coming from GAMG. That 
>> means the curves should
>> be pushed up for larger runs. Do you see that?
>> 
>>   Thanks,
>> 
>> Matt 
>> As you can see, there are slight differences between the two versions but 
>> none is critical, in my opinion. Do you see anything suspicious in the 
>> outputs?
>> 
>> + I can't find the default threshold value. Do you know where I can find it?
>> 
>> Thanks for the follow-up
>> 
>> Myriam
>> 
>> 
>> On 03/05/19 at 14:06, Matthew Knepley wrote:
>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette 
>>>  wrote:
>>> Hi Matt,
>>> 
>>> I plotted the memory scalings using different threshold values. The two 
>>> scalings are slightly translated (by -22 to -88 MB), but this gain is 
>>> negligible. The 3.6 scaling remains robust while the 3.10 scaling 
>>> deteriorates.
>>> 
>>> Do you have any other suggestion?
>>> 
>>> Mark, what is the option she can give to output all the GAMG data?
>>> 
>>> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so 
>>> it should be easy to see
>>> if the coarse grid sizes are increasing, and also what the effect of the 
>>> threshold value is.
>>> 
>>>   Thanks,
>>> 
>>>  Matt 
>>> Thanks
>>> 
>>> Myriam 
>>> 
>>> On 03/02/19 at 02:27, Matthew Knepley wrote:
 On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users 
  wrote:
 Hi,
 
 I used to run my code with PETSc 3.6. Since I upgraded the PETSc version
 to 3.10, this code has bad memory scaling.
 
 To report this issue, I took the PETSc script ex42.c and slightly
 modified it so that the KSP and PC configurations are the same as in my
 code. In particular, I use a "personalised" multi-grid method. The
 modifications are indicated by the keyword "TopBridge" in the attached
 scripts.
 
 To plot the memory (weak) scaling, I ran four calculations for each
 script with increasing problem sizes and computation cores:
 
 1. 100,000 elts on 4 cores
 2. 1 million elts on 40 cores
 3. 10 million elts on 400 cores
 4. 100 million elts on 4,000 cores
 
 The resulting graph is also attached. The scaling using PETSc 3.10
 clearly deteriorates for large cases, while the one using PETSc 3.6 is
 robust.
 
 After a few tests, I 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-05 Thread Smith, Barry F. via petsc-users

   Myriam,

Sorry we have not been able to resolve this problem with memory scaling yet.

The best tool to determine the change in a code that results in large 
differences in a program's run is git bisect. Basically you tell git bisect 
the git commit of the code that is "good" and the git commit of the code that 
is "bad" and it gives you additional git commits for you to check your code on  
each time telling git if it is "good" or "bad", eventually git bisect tells you 
exactly the git commit that "broke" the code. No guess work, no endless 
speculation. 

The drawback is that you have to ./configure && make PETSc for each "test" 
commit and then compile and run your code for that commit. I can understand that if 
you have to run your code on 10,000 processes to check whether it is "good" or "bad", 
that can be very daunting. But all I can suggest is to find a problem size that 
is manageable and do the git bisect process (yeah it may take several hours but 
that beats days of head banging).
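
For concreteness, a sketch of that loop in the PETSc source tree, using the
release tags as stand-ins for the actual "good" and "bad" commits:

   git bisect start
   git bisect bad  v3.10.2
   git bisect good v3.6.4
   # at each commit git checks out: rebuild PETSc, rebuild and run the
   # manageable test case, then report the result
   ./configure <usual options> && make all
   git bisect good    # or: git bisect bad
   # when the offending commit has been identified
   git bisect reset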

   Good luck,

   Barry


> On Mar 5, 2019, at 12:42 PM, Matthew Knepley via petsc-users 
>  wrote:
> 
> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette 
>  wrote:
> I used PCView to display the size of the linear system in each level of the 
> MG. You'll find the outputs attached to this mail (zip file) for both the 
> default threshold value and a value of 0.1, and for both 3.6 and 3.10 PETSc 
> versions. 
> 
> For convenience, I summarized the information in a graph, also attached (png 
> file).
> 
> 
> Great! Can you draw lines for the different runs you did? My interpretation 
> was that memory was increasing
> as you did larger runs, and that you thought that was coming from GAMG. That 
> means the curves should
> be pushed up for larger runs. Do you see that?
> 
>   Thanks,
> 
> Matt 
> As you can see, there are slight differences between the two versions but 
> none is critical, in my opinion. Do you see anything suspicious in the 
> outputs?
> 
> + I can't find the default threshold value. Do you know where I can find it?
> 
> Thanks for the follow-up
> 
> Myriam
> 
> 
> On 03/05/19 at 14:06, Matthew Knepley wrote:
>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette 
>>  wrote:
>> Hi Matt,
>> 
>> I plotted the memory scalings using different threshold values. The two 
>> scalings are slightly translated (by -22 to -88 MB), but this gain is 
>> negligible. The 3.6 scaling remains robust while the 3.10 scaling 
>> deteriorates.
>> 
>> Do you have any other suggestion?
>> 
>> Mark, what is the option she can give to output all the GAMG data?
>> 
>> Also, run using -ksp_view. GAMG will report all the sizes of its grids, so 
>> it should be easy to see
>> if the coarse grid sizes are increasing, and also what the effect of the 
>> threshold value is.
>> 
>>   Thanks,
>> 
>>  Matt 
>> Thanks
>> 
>> Myriam 
>> 
>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users 
>>>  wrote:
>>> Hi,
>>> 
>>> I used to run my code with PETSc 3.6. Since I upgraded the PETSc version
>>> to 3.10, this code has bad memory scaling.
>>> 
>>> To report this issue, I took the PETSc script ex42.c and slightly
>>> modified it so that the KSP and PC configurations are the same as in my
>>> code. In particular, I use a "personalised" multi-grid method. The
>>> modifications are indicated by the keyword "TopBridge" in the attached
>>> scripts.
>>> 
>>> To plot the memory (weak) scaling, I ran four calculations for each
>>> script with increasing problem sizes and computation cores:
>>> 
>>> 1. 100,000 elts on 4 cores
>>> 2. 1 million elts on 40 cores
>>> 3. 10 million elts on 400 cores
>>> 4. 100 million elts on 4,000 cores
>>> 
>>> The resulting graph is also attached. The scaling using PETSc 3.10
>>> clearly deteriorates for large cases, while the one using PETSc 3.6 is
>>> robust.
>>> 
>>> After a few tests, I found that the scaling is mostly sensitive to the
>>> use of the AMG method for the coarse grid (line 1780 in
>>> main_ex42_petsc36.cc). In particular, the performance strongly
>>> deteriorates when commenting lines 1777 to 1790 (in main_ex42_petsc36.cc).
>>> 
>>> Do you have any idea of what changed between version 3.6 and version
>>> 3.10 that may imply such degradation?
>>> 
>>> I believe the default values for PCGAMG changed between versions. It sounds 
>>> like the coarsening rate
>>> is not great enough, so that these grids are too large. This can be set 
>>> using:
>>> 
>>>   
>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>> 
>>> There is some explanation of this effect on that page. Let us know if 
>>> setting this does not correct the situation.
>>> 
>>>   Thanks,
>>> 
>>>  Matt
>>>  
>>> Let me know if you need further information.
>>> 
>>> Best,
>>> 
>>> Myriam Peyrounette
>>> 
>>> 
>>> -- 
>>> Myriam Peyrounette
>>> CNRS/IDRIS - HLST
>>> --
>>> 
>>> 
>>> 
>>> -- 
>>> What 

Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-05 Thread Myriam Peyrounette via petsc-users
I used PCView to display the size of the linear system in each level of
the MG. You'll find the outputs attached to this mail (zip file) for
both the default threshold value and a value of 0.1, and for both 3.6
and 3.10 PETSc versions.

For convenience, I summarized the information in a graph, also attached
(png file).

As you can see, there are slight differences between the two versions
but none is critical, in my opinion. Do you see anything suspicious in
the outputs?

+ I can't find the default threshold value. Do you know where I can find it?

Thanks for the follow-up

Myriam


On 03/05/19 at 14:06, Matthew Knepley wrote:
> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette
> mailto:myriam.peyroune...@idris.fr>> wrote:
>
> Hi Matt,
>
> I plotted the memory scalings using different threshold values.
> The two scalings are slightly translated (by -22 to -88 MB), but
> this gain is negligible. The 3.6 scaling remains robust while
> the 3.10 scaling deteriorates.
>
> Do you have any other suggestion?
>
> Mark, what is the option she can give to output all the GAMG data?
>
> Also, run using -ksp_view. GAMG will report all the sizes of its
> grids, so it should be easy to see
> if the coarse grid sizes are increasing, and also what the effect of
> the threshold value is.
>
>   Thanks,
>
>      Matt 
>
> Thanks
>
> Myriam
>
> On 03/02/19 at 02:27, Matthew Knepley wrote:
>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via
>> petsc-users > > wrote:
>>
>> Hi,
>>
>> I used to run my code with PETSc 3.6. Since I upgraded the
>> PETSc version
>> to 3.10, this code has bad memory scaling.
>>
>> To report this issue, I took the PETSc script ex42.c and slightly
>> modified it so that the KSP and PC configurations are the
>> same as in my
>> code. In particular, I use a "personalised" multi-grid
>> method. The
>> modifications are indicated by the keyword "TopBridge" in the
>> attached
>> scripts.
>>
>> To plot the memory (weak) scaling, I ran four calculations
>> for each
>> script with increasing problem sizes and computation cores:
>>
>> 1. 100,000 elts on 4 cores
>> 2. 1 million elts on 40 cores
>> 3. 10 million elts on 400 cores
>> 4. 100 million elts on 4,000 cores
>>
>> The resulting graph is also attached. The scaling using PETSc
>> 3.10
>> clearly deteriorates for large cases, while the one using
>> PETSc 3.6 is
>> robust.
>>
>> After a few tests, I found that the scaling is mostly
>> sensitive to the
>> use of the AMG method for the coarse grid (line 1780 in
>> main_ex42_petsc36.cc). In particular, the performance strongly
>> deteriorates when commenting lines 1777 to 1790 (in
>> main_ex42_petsc36.cc).
>>
>> Do you have any idea of what changed between version 3.6 and
>> version
>> 3.10 that may imply such degradation?
>>
>>
>> I believe the default values for PCGAMG changed between versions.
>> It sounds like the coarsening rate
>> is not great enough, so that these grids are too large. This can
>> be set using:
>>
>>   
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>
>> There is some explanation of this effect on that page. Let us
>> know if setting this does not correct the situation.
>>
>>   Thanks,
>>
>>      Matt
>>  
>>
>> Let me know if you need further information.
>>
>> Best,
>>
>> Myriam Peyrounette
>>
>>
>> -- 
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
>>
>> -- 
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to
>> which their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>
> -- 
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>
>
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 

-- 
Myriam Peyrounette
CNRS/IDRIS - HLST
--



smime.p7s
Description: S/MIME cryptographic signature


Re: [petsc-users] Bad memory scaling with PETSc 3.10

2019-03-05 Thread Myriam Peyrounette via petsc-users
Hi Matt,

I plotted the memory scalings using different threshold values. The two
scalings are slightly translated (by -22 to -88 MB), but this gain is
negligible. The 3.6 scaling remains robust while the 3.10 scaling
deteriorates.

Do you have any other suggestion?

Thanks

Myriam

On 03/02/19 at 02:27, Matthew Knepley wrote:
> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users
> mailto:petsc-users@mcs.anl.gov>> wrote:
>
> Hi,
>
> I used to run my code with PETSc 3.6. Since I upgraded the PETSc
> version
> to 3.10, this code has bad memory scaling.
>
> To report this issue, I took the PETSc script ex42.c and slightly
> modified it so that the KSP and PC configurations are the same as
> in my
> code. In particular, I use a "personalised" multi-grid method. The
> modifications are indicated by the keyword "TopBridge" in the attached
> scripts.
>
> To plot the memory (weak) scaling, I ran four calculations for each
> script with increasing problem sizes and computation cores:
>
> 1. 100,000 elts on 4 cores
> 2. 1 million elts on 40 cores
> 3. 10 million elts on 400 cores
> 4. 100 million elts on 4,000 cores
>
> The resulting graph is also attached. The scaling using PETSc 3.10
> clearly deteriorates for large cases, while the one using PETSc 3.6 is
> robust.
>
> After a few tests, I found that the scaling is mostly sensitive to the
> use of the AMG method for the coarse grid (line 1780 in
> main_ex42_petsc36.cc). In particular, the performance strongly
> deteriorates when commenting lines 1777 to 1790 (in
> main_ex42_petsc36.cc).
>
> Do you have any idea of what changed between version 3.6 and version
> 3.10 that may imply such degradation?
>
>
> I believe the default values for PCGAMG changed between versions. It
> sounds like the coarsening rate
> is not great enough, so that these grids are too large. This can be
> set using:
>
>   
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>
> There is some explanation of this effect on that page. Let us know if
> setting this does not correct the situation.
>
>   Thanks,
>
>      Matt
>  
>
> Let me know if you need further information.
>
> Best,
>
> Myriam Peyrounette
>
>
> -- 
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>
>
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 

-- 
Myriam Peyrounette
CNRS/IDRIS - HLST
--



smime.p7s
Description: S/MIME cryptographic signature