Re: [deal.II] Imported mesh problem with p4est and parallel::distributed

2024-02-01 Thread Alex Quinlan
Thanks, Wolfgang.  I'll look into it some more and eventually try with a 
larger problem.



Re: [deal.II] Imported mesh problem with p4est and parallel::distributed

2024-01-30 Thread Wolfgang Bangerth



Alex:

> I am running a solid-mechanics job where I import a large mesh and run using
> parallel::fullydistributed with MPI. I had been trying to run my job using a CG
> solver with the BoomerAMG preconditioner (based on the example in step-40).
>
> I ran my mesh with 116,000 nodes and the solver took about 700 seconds and
> used around 18GB of RAM.  I then tried to run another version of the same
> problem with a finer 1.5M node mesh.  This one never finished because
> its memory requirements exceeded the resources of my small cluster.
>
> Eventually, I decided to test out some different solvers and preconditioners.
> I switched the preconditioner to Jacobi and suddenly the 116k job ran in 40
> seconds and only needed ~4GB of memory.


This is too small a problem to be indicative of much. Jacobi is good for small 
problems, see also here:

https://dealii.org/developer/doxygen/deal.II/step_6.html#step_6-Solversandpreconditioners
But it is not effective for large problems because the effort to solve linear 
problems is not O(N) with this preconditioner. AMG typically is, but it is 
only better on large problems.
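
To put rough numbers on this (a sketch, assuming a standard second-order 
elliptic problem on a mesh of size h with N = O(h^{-d}) unknowns and the 
usual CG bound of O(sqrt(kappa)) iterations):

  \kappa(M_{\mathrm{Jacobi}}^{-1} A) = \mathcal{O}(h^{-2})
    \;\Rightarrow\; \text{CG iterations} = \mathcal{O}(h^{-1}) = \mathcal{O}(N^{1/d}),
    \quad \text{total work} = \mathcal{O}(N^{1+1/d}),

  \kappa(M_{\mathrm{AMG}}^{-1} A) = \mathcal{O}(1)
    \;\Rightarrow\; \text{CG iterations} = \mathcal{O}(1),
    \quad \text{total work} = \mathcal{O}(N).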


I do wonder about your memory usage. A good order of magnitude for 2d problems 
is 1 kB per DoF. Perhaps twice that in 3d. You are using 30 kB.



> - Maybe it's because my mesh is generated externally and not refined using
>   the deal.II mesh refinement approach?
> - Maybe it's related to the use of p:f:t instead of p:d:t?
> - Maybe I have some issue with my PETSc install?
> - Maybe I didn't properly set the parameters for the AMG preconditioner?


All possibilities. But hard to tell on such a small problem.

Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/




Re: [deal.II] Imported mesh problem with p4est and parallel::distributed

2024-01-30 Thread Alex Quinlan
Dear deal.II community,

I've had relative success running parallel::fullydistributed, but now 
I've encountered some strange preconditioner/solver behavior.

I am running a solid-mechanics job where I import a large mesh and run 
using parallel::fullydistributed with MPI. I had been trying to run my job 
using a CG solver with the BoomerAMG preconditioner (based on the example 
in step-40).
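
For reference, the solve step follows the PETSc variant of step-40, roughly 
along these lines (a sketch; dof_handler, system_matrix, system_rhs, 
constraints, locally_owned_dofs and mpi_communicator are set up as in that 
tutorial):

// Headers: <deal.II/lac/petsc_solver.h>, <deal.II/lac/petsc_precondition.h>,
// <deal.II/lac/petsc_vector.h>
PETScWrappers::MPI::Vector distributed_solution(locally_owned_dofs,
                                                mpi_communicator);

SolverControl solver_control(dof_handler.n_dofs(),
                             1e-8 * system_rhs.l2_norm());
PETScWrappers::SolverCG cg(solver_control); // older releases also take
                                            // mpi_communicator here

PETScWrappers::PreconditionBoomerAMG                 preconditioner;
PETScWrappers::PreconditionBoomerAMG::AdditionalData data;
data.symmetric_operator = true; // the elasticity system is symmetric
preconditioner.initialize(system_matrix, data);

cg.solve(system_matrix, distributed_solution, system_rhs, preconditioner);
constraints.distribute(distributed_solution);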

I ran my mesh with 116,000 nodes and the solver took about 700 seconds and 
used around 18GB of RAM.  I then tried to run another version of the same 
problem with a finer 1.5M node mesh.  This one never finished because
its memory requirements exceeded the resources of my small cluster.

Eventually, I decided to test out some different solvers and 
preconditioners. I switched the preconditioner to Jacobi and suddenly the 
116k job ran in 40 seconds and only needed ~4GB of memory.
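
Concretely, the only change was swapping out the preconditioner object, 
roughly:

// Same CG solve as above, but with point Jacobi instead of BoomerAMG.
PETScWrappers::PreconditionJacobi preconditioner;
preconditioner.initialize(system_matrix);

cg.solve(system_matrix, distributed_solution, system_rhs, preconditioner);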

I was quite surprised by this.  I thought that Jacobi was the bare minimum 
of a preconditioner and that AMG would be better suited for this job. 
So I'm curious whether you have any thoughts on why my job might be reacting 
this way: a 20x speed-up and a 5x reduction in memory requirements.

- Maybe it's because my mesh is generated externally and not refined using 
the deal.II mesh refinement approach?
- Maybe it's related to the use of p:f:t instead of p:d:t?
- Maybe I have some issue with my PETSc install?
- Maybe I didn't properly set the parameters for the AMG preconditioner?
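
Regarding the last point: the BoomerAMG options exposed through the wrapper 
are set via AdditionalData. A sketch of values I could try for a 3d 
elasticity problem (guesses on my part, not validated settings):

PETScWrappers::PreconditionBoomerAMG::AdditionalData data;
data.symmetric_operator               = true; // needed when used inside CG
data.strong_threshold                 = 0.5;  // often suggested for 3d
                                              // (the default is 0.25)
data.aggressive_coarsening_num_levels = 2;    // cheaper, more aggressive
                                              // coarsening near the fine level

PETScWrappers::PreconditionBoomerAMG preconditioner;
preconditioner.initialize(system_matrix, data);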

I have limited understanding of solvers and preconditioners, so I am trying 
to learn more about them.  I'd appreciate any input you may have.

Best regards,
Alex

Re: [deal.II] Imported mesh problem with p4est and parallel::distributed

2023-12-22 Thread Alex Quinlan
Thanks for your response, Wolfgang.

Your last sentence seems to be my solution.  I was planning to use 
parallel::fullydistributed due to the large sizes of my imported meshes 
 (per a suggestion on a previous post of mine: 
 https://groups.google.com/g/dealii/c/V5HH2pZ0Kow )

I wanted to run parallel::distributed primarily to compare with 
parallel::fullydistributed.  I was also planning to use the approach in 
dealii/tests/fullydistributed_grids/copy_distributed_tria_*  to convert a 
p:d:t into a p:f:t.

But now, I will proceed by skipping the comparison and by using the 
approach in dealii/tests/fullydistributed_grids/copy_serial_tria_*, 
starting with a parallel::shared::Triangulation and then building the p:f:t 
off of that.  This allows me to side-step bug-7428 and to import a simplex mesh.
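
In outline, the construction I have in mind follows the copy_serial_tria 
tests and looks something like this (a sketch with a placeholder file name; 
the tests use a plain, partitioned Triangulation, which a 
parallel::shared::Triangulation should be able to stand in for since it 
derives from Triangulation):

#include <deal.II/base/mpi.h>
#include <deal.II/distributed/fully_distributed_tria.h>
#include <deal.II/grid/grid_in.h>
#include <deal.II/grid/grid_tools.h>
#include <deal.II/grid/tria.h>
#include <deal.II/grid/tria_description.h>

#include <fstream>

using namespace dealii;

int main(int argc, char *argv[])
{
  Utilities::MPI::MPI_InitFinalize mpi_init(argc, argv, 1);
  const MPI_Comm comm = MPI_COMM_WORLD;

  // Every rank reads the full coarse (possibly simplex) mesh ...
  Triangulation<3> serial_tria;
  GridIn<3>        grid_in(serial_tria);
  std::ifstream    input("mesh.inp"); // placeholder file name
  grid_in.read_abaqus(input);

  // ... partitions it (this variant assumes deal.II was configured with
  // METIS; GridTools::partition_triangulation_zorder is an alternative) ...
  GridTools::partition_triangulation(Utilities::MPI::n_mpi_processes(comm),
                                     serial_tria);

  // ... and builds the fully distributed triangulation from the resulting
  // construction data.
  const auto description = TriangulationDescription::Utilities::
    create_description_from_triangulation(serial_tria, comm);

  parallel::fullydistributed::Triangulation<3> tria(comm);
  tria.create_triangulation(description);
}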

Best regards,
Alex



Re: [deal.II] Imported mesh problem with p4est and parallel::distributed

2023-12-19 Thread Wolfgang Bangerth



Alex:

You've hit on one of those bugs that every once in a while someone trips over, 
but that nobody really ever takes/has the time to fully debug. For sure, we 
would love to get some help with this issue, and the github bug report already 
contains a relatively small test case that should make debugging not too 
painful. The right approach is likely to use this test case, see with what 
inputs deal.II is calling p4est's mesh creation routine, and then generate a 
stand-alone test case that only uses p4est to trigger the problem. I would 
expect this can be done in 200 or so lines.


I believe that at the core of the problem lies a mismatch of what we think 
p4est wants, and what p4est really expects. This also illustrates the problem 
with debugging the issue: Those of us who know deal.II don't know p4est, and 
the other way around. It requires someone willing to create such a stand-alone 
test case, and then talk to the p4est folks about why this is wrong. Once we 
understand why the input is wrong/unacceptable to p4est, it should be 
relatively straightforward to come up with a way to give p4est what it wants. 
The *understanding* of the problem is the key step.




> Can you help me out by answering a couple questions?
>
>   * Is there a work-around for this issue?


No.


>   o Is there an alternative to p4est?


In the long run, the p4est people are working on a library called t8code that 
extends p4est. But I don't think any of us have ever tried to see what it 
would take to replace p4est by t8code.




>   o Are there modifications I can make to my mesh/program to avoid this
>     issue?


Not sure. Absent an understanding of why p4est is upset, any modifications 
would just be poking in the dark.




>   * What would be involved with fixing the bug?


See above.



> On a possibly related issue:
>
>   * What is needed to allow for simplex meshes with parallel::distributed?
>     Is this also a connectivity issue?


p::d::Triangulation builds on p4est, which is exclusively for quad/hex meshes. 
Switching to t8code would allow for the use of other cell types as well, but 
I don't think any of us have given substantial thought to what such a switch 
would require.


There is the parallel::fullydistributed::Triangulation class, which I *think* 
works with simplex meshes. It is not covered by tutorial programs yet, though.


Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/




[deal.II] Imported mesh problem with p4est and parallel::distributed

2023-12-19 Thread Alex Quinlan
Dear deal.II community,

I am working with a mesh that is imported via a modified version of 
GridIn::read_abaqus().  I'm able to run my mesh and job with 
parallel::shared without any issues.

However, when I go to use parallel::distributed, I run into an issue with 
p8est connectivity:

void dealii::parallel::distributed::Triangulation<dim, spacedim>::copy_new_triangulation_to_p4est(std::integral_constant<int, 3>)
[with int dim = 3; int spacedim = 3]
The violated condition was:
p8est_connectivity_is_valid(connectivity) == 1
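
For context, the code path that triggers this is essentially the following 
(a trimmed-down sketch; the real code uses my modified reader and a 
different file name):

#include <deal.II/base/mpi.h>
#include <deal.II/distributed/tria.h>
#include <deal.II/grid/grid_in.h>

#include <fstream>

using namespace dealii;

int main(int argc, char *argv[])
{
  Utilities::MPI::MPI_InitFinalize mpi_init(argc, argv, 1);

  // Reading the same coarse mesh into a parallel::shared::Triangulation
  // works; reading it into parallel::distributed::Triangulation aborts when
  // the mesh is handed to p4est.
  parallel::distributed::Triangulation<3> triangulation(MPI_COMM_WORLD);

  GridIn<3>     grid_in(triangulation);
  std::ifstream input("mesh.inp"); // placeholder file name
  grid_in.read_abaqus(input);      // fails in copy_new_triangulation_to_p4est()
}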

It seems that this is a known issue:  
https://github.com/dealii/dealii/issues/7428 

I've read through the issue thread, but I don't fully understand what is 
going on.  It's not clear to me if this is a bug with p4est or with 
deal.II.  There is a listed work-around that seems to rearrange the nodes, 
but I'm not sure if that's applicable in my situation.

Can you help me out by answering a couple questions?

   - Is there a work-around for this issue?
      - Is there an alternative to p4est?
      - Are there modifications I can make to my mesh/program to avoid this
        issue?
   - What would be involved with fixing the bug?

On a possibly related issue:

   - What is needed to allow for simplex meshes with 
   parallel::distributed?  Is this also a connectivity issue?
   
Many thanks,
Alex
