Re: [deal.II] Control over the "valence" of p4est distributed meshes

2022-12-08 Thread Wolfgang Bangerth

On 12/8/22 14:46, blais...@gmail.com wrote:


I know that p4est as a partitioner has limitations compared to other 
approaches like Metis or Scotch. However, I was wondering if there is any way 
to penalize the partitioner so that it generates the smallest possible 
valence. Sometimes we end up with interfaces between processors that are 
substantially big (or islands of a few cells). The issue is that for our 
particle code, this generates a ton of ghost particles, and these particles 
add cost: collisions become significantly more expensive to calculate because 
we cannot apply Newton's third law to a collision, so the calculation is 
essentially duplicated.


I was wondering if there is any way to "force" or ensure that the interfaces 
between subdomains remain relatively compact and well-shaped?


Bruno:
No, p4est partitions a space-filling curve. You can select where the partition 
points along this one-dimensional line are (by choosing weights for each 
cell), but you can't change the curve or what subdomain interfaces this creates.


If that's what you need, you'll have to use the fully distributed 
triangulation class (parallel::fullydistributed::Triangulation), which allows 
you to do that.


Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/


--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see 
https://groups.google.com/d/forum/dealii?hl=en
--- 
You received this message because you are subscribed to the Google Groups "deal.II User Group" group.

To unsubscribe from this group and stop receiving emails from it, send an email 
to dealii+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/dealii/2d06e675-ae73-044e-5b8a-189c1b09148f%40colostate.edu.


[deal.II] Control over the "valence" of p4est distributed meshes

2022-12-08 Thread blais...@gmail.com
Dear all,
I hope you are well.

I know that p4est as a partitioner has limitations compared to other 
approaches like Metis or Scotch. However, I was wondering if there is any way 
to penalize the partitioner so that it generates the smallest possible 
valence. Sometimes we end up with interfaces between processors that are 
substantially big (or islands of a few cells). The issue is that for our 
particle code, this generates a ton of ghost particles, and these particles 
add cost: collisions become significantly more expensive to calculate because 
we cannot apply Newton's third law to a collision, so the calculation is 
essentially duplicated.

I was wondering if there is any way to "force" or ensure that the interfaces 
between subdomains remain relatively compact and well-shaped?

Thank you very much!
Bruno



Re: [deal.II] What's the best strategy to speed up assembly?

2022-12-08 Thread Wolfgang Bangerth



Kev,

I've done some layman's benchmarking of the individual "steps" (setup, 
assembly, solve, ...) in my current version of the code. It looks as if the 
assembly takes several orders of magnitude (~100 at least) longer than the 
solving part.


My question is now: what is the best strategy to speed up assembly, and is 
there any experience with this? I've read about different approaches and am 
confused about what's promising for small-scale problems. So far I'm considering:


1) Using a matrix-free approach rather than PETSc - this seems to be a win in 
most cases, but would require rewriting large parts of the code, and I am not 
sure I would gain a lot given my small system size.


2) Only assembling the Jacobian every few steps, but the residual in every 
step. This is probably easier to implement. I know from experience with my 
problem that I pretty quickly land in a situation where I need only one or two 
Newton steps to find the solution to my nonlinear equation, so the saving will 
be small at best.


Like Bruno, it seems rather unlikely to me that assembly would take 100 times 
longer than other things, and I would suggest you use something like 
TimerOutput to time individual sections to narrow down where the issue lies.


Beyond this, only updating the Jacobian is indeed a fairly usual strategy. One 
can of course implement this by hand, but it's quite cumbersome to implement 
optimal algorithms to determine when updating is necessary, and I would 
encourage you to take a look at step-77 to see how one can solve nonlinear 
problems efficiently through interfaces to advanced libraries such as SUNDIALS:

  https://dealii.org/developer/doxygen/deal.II/step_77.html

I don't remember if KINSOL has ways to solve a sequence of nonlinear systems, 
re-using the Jacobian between systems. But if it does not, you can always 
store a pointer to the last Jacobian matrix and, whenever you first start 
solving a linear system, copy the stored matrix over.


In any case, I think the first step should be to (i) check whether assembly 
really takes that long for you, and (ii) understand why it takes that long. If 
step-77 is any indication, then assembling the Jacobian should really not take 
much longer than actually solving the linear systems.


Best
 W.


--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/




Re: [deal.II] Re: What's the best strategy to speed up assembly?

2022-12-08 Thread Wolfgang Bangerth

On 12/8/22 12:13, blais...@gmail.com wrote:
For example, I try to do as little work as possible inside the 
double loop over DOFs, which is the innermost loop. Sometimes 
pre-calculating things in the outer loop really speeds up the 
calculation. This also depends on the polynomials you are using for 
interpolation. If you are using high-order polynomials, I think this is 
where you will reap the benefits of matrix-free methods significantly.


This is extensively discussed in step-22, as a point of reference.
Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/



[deal.II] Re: What's the best strategy to speed up assembly?

2022-12-08 Thread blais...@gmail.com
I am a bit surprised that your assembly is really 100x more expensive than 
your linear solver.
Maybe your assembly code is not optimized?
For example, I try to do as little work as possible inside the double loop 
over DOFs, which is the innermost loop. Sometimes pre-calculating things in 
the outer loop really speeds up the calculation. This also depends on the 
polynomials you are using for interpolation. If you are using high-order 
polynomials, I think this is where you will reap the benefits of matrix-free 
methods significantly.

Feel free to ask more questions :)!
Best
Bruno


On Thursday, December 8, 2022 at 11:10:51 a.m. UTC-5 kmsc...@gmail.com 
wrote:

> Hej, 
>
> I've written here before and hope I am not misusing the mailing list - 
> however, I've looked a bit through the documentation and this list and 
> haven't really found a conclusive answer.
>
> I am aiming to solve a nonlinear hyperbolic transport equation. For the 
> sake of the argument, let's say it reads
>
> mu \cdot \nabla f(x) = - f(x)^2 - 2*b(x)*f(x) - a(x)
>
> this is, of course, a Riccati equation (up to signs, possibly). In my 
> case, f is a complex function but this is of little relevance here. Since 
> it's a nonlinear problem I need to construct both the Jacobian and the 
> residual. For starters, I do that in each step.
>
> I've managed to implement this and even get a PETSc-parallelised version 
> to work, and am very happy. (I love deal.II, by the way - very impressive.) 
> It doesn't scale "optimally" on my small laptop, but it's still a fine 
> speedup when using MPI. So far so good.
>
> However, I want to solve my problem for many different directions vF, and 
> then extract all the solutions and do something with them. As such, my 
> problem is less that I need a very large number of DOFs / huge meshes - my 
> typical mesh will be on the order of 1 unknowns, maybe 100k but not 
> millions. Rather, I want the individual solves to be as fast as possible 
> since I need to do on the order of 100-1 of them, depending on the 
> problem at hand. 
>
> I've done some layman's benchmarking of the individual "steps" (setup, 
> assembly, solve, ...) in my current version of the code. It looks as if the 
> assembly takes several orders of magnitude (~100 at least) longer than the 
> solving part.
>
> My question is now: what is the best strategy to speed up assembly, and is 
> there any experience with this? I've read about different approaches and am 
> confused about what's promising for small-scale problems. So far I'm considering:
>
> 1) Using a matrix-free approach rather than PETSc - this seems to be a win 
> in most cases, but would require rewriting large parts of the code, and I am 
> not sure I would gain a lot given my small system size.
>
> 2) Only assembling the Jacobian every few steps, but the residual in every 
> step. This is probably easier to implement. I know from experience with my 
> problem that I pretty quickly land in a situation where I need only one or 
> two Newton steps to find the solution to my nonlinear equation, so the 
> saving will be small at best.
>
> Is there anything else one can do? 
> So far I've been using MeshWorker, which is fine and understandable to me, 
> but e.g. the boundary term as used in Example 12 queries the scalar product 
> of \mu and the edge normal in each boundary element, which seems like a 
> possible slowdown - in addition to generating jumps and averages on inner 
> cell edges.
>
>  Any help is much appreciated. Sorry for the long text!
> /Kev
>



[deal.II] What's the best strategy to speed up assembly?

2022-12-08 Thread Kev Se
Hej, 

I've written here before and hope I am not misusing the mailing list - 
however, I've looked a bit through the documentation and this list and 
haven't really found a conclusive answer.

I am aiming to solve a nonlinear hyperbolic transport equation. For the 
sake of the argument, let's say it reads

mu \cdot \nabla f(x) = - f(x)^2 - 2*b(x)*f(x) - a(x)

This is, of course, a Riccati equation (up to signs, possibly). In my case, 
f is a complex function, but this is of little relevance here. Since it's a 
nonlinear problem, I need to construct both the Jacobian and the residual. 
For starters, I do that in each step.
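Written out, the Newton linearization in question follows directly from the 
equation above (a sketch; signs follow the equation as stated):

```latex
R(f) \;=\; \mu\cdot\nabla f + f^2 + 2\,b\,f + a ,
\qquad
R'(f_k)\,[\delta] \;=\; \mu\cdot\nabla\delta + 2\,(f_k + b)\,\delta ,
```

so each Newton step solves the linear transport problem 
R'(f_k)[delta] = -R(f_k) and sets f_{k+1} = f_k + delta. Only the 
zeroth-order coefficient 2(f_k + b) changes between steps, which is what 
makes reusing the Jacobian for a few steps plausible once f_k is close to 
the solution.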

I've managed to implement this and even get a PETSc-parallelised version to 
work, and am very happy. (I love deal.II, by the way - very impressive.) It 
doesn't scale "optimally" on my small laptop, but it's still a fine speedup 
when using MPI. So far so good.

However, I want to solve my problem for many different directions vF, and 
then extract all the solutions and do something with them. As such, my 
problem is less that I need a very large number of DOFs / huge meshes - my 
typical mesh will be on the order of 1 unknowns, maybe 100k but not 
millions. Rather, I want the individual solves to be as fast as possible 
since I need to do on the order of 100-1 of them, depending on the 
problem at hand. 

I've done some layman's benchmarking of the individual "steps" (setup, 
assembly, solve, ...) in my current version of the code. It looks as if the 
assembly takes several orders of magnitude (~100 at least) longer than the 
solving part.

My question is now: what is the best strategy to speed up assembly, and is 
there any experience with this? I've read about different approaches and am 
confused about what's promising for small-scale problems. So far I'm considering:

1) Using a matrix-free approach rather than PETSc - this seems to be a win 
in most cases, but would require rewriting large parts of the code, and I am 
not sure I would gain a lot given my small system size.

2) Only assembling the Jacobian every few steps, but the residual in every 
step. This is probably easier to implement. I know from experience with my 
problem that I pretty quickly land in a situation where I need only one or 
two Newton steps to find the solution to my nonlinear equation, so the 
saving will be small at best.

Is there anything else one can do? 
So far I've been using MeshWorker, which is fine and understandable to me, 
but e.g. the boundary term as used in Example 12 queries the scalar product 
of \mu and the edge normal in each boundary element, which seems like a 
possible slowdown - in addition to generating jumps and averages on inner 
cell edges.

 Any help is much appreciated. Sorry for the long text!
/Kev



[deal.II] Re: How would you document a step/code/module?

2022-12-08 Thread Abbas
Thank you, Marc, for your input.
Maybe I am just complicating things.
If I ever make a good example/module in the future, I'll make sure to share 
it in the code gallery.

On Monday, December 5, 2022 at 1:11:14 AM UTC+2 mafe...@gmail.com wrote:

> Hello Abbas,
>
> if you are willing to share your code with the deal.II community, you can 
> submit it to the deal.II code gallery. The documentation for each of the 
> example programs uses the same Doxygen formatting as deal.II:
> https://www.dealii.org/code-gallery.html
> https://github.com/dealii/code-gallery
>
> Doxygen in general is a powerful tool to document your code, but other 
> tools exist as well, for example Jupyter, as you pointed out. I feel like 
> it boils down to personal taste which one you would like to use. But kudos 
> to you for thinking about documenting and sharing your code!
>
> Marc
>
> On Sunday, December 4, 2022 at 6:26:41 AM UTC-7 Abbas wrote:
>
>> Hello, 
>>
>> Let's say you have written 1000+ lines of code with deal.II and you want 
>> to document and share it in a format similar to that of the deal.II steps. 
>> How would you go about doing this?
>>
>> MAYBE in the future people will share single "modules" with some 
>> "tutorial"-like documentation. This might be something some of us are 
>> thinking about or considering, I guess?
>>
>> It has been suggested that I use a Jupyter notebook, but I feel like 
>> that's a huge waste of white space. Having a dedicated web link is also 
>> very attractive, so I was considering a website hosted on GitHub Pages. I 
>> tried that using Gatsby, but I feel like it is unnecessarily 
>> over-complicated.
>>
>> Has anyone done something similar to this? If not, any ideas?
>>
>> Thank you for reading this far.  
>>
>
