Yi-Chung,

I'm willing to work on this part. Please let me know how I should start. I
believe this code will help the community.

No doubt! Thank you for your offer of help!

I'm going to comment in more detail below, but will point out that for all major development, it is always helpful to work in a way that lets you get feedback early and often. There is nothing worse than working for two or three months, uploading all of your code, and then getting feedback that something could have been done in a much simpler way, or should have been done differently to make it more general. In other words, whenever you have something that is working, put it into a GitHub pull request and let others take a look and comment on it!


    > My application is about IC designs that
    > may have millions to billions of cells. A fully distributed
    > triangulation helps to reduce memory usage. The current shared-memory
    > system can handle 20M cells (single core) on a system with 32GB of
    > main memory.

    That's already quite impressive :-) What kind of meshes do you have that
    require so many cells? Are they geometrically incredibly complicated to
    require that many cells already at the coarse level?

Actually, this is the interesting part. We are trying to simulate the thermal
profile of an integrated circuit. A CPU has billions of transistors inside,
and each of them has its own power trace as the right-hand side. That is why
we have to give it a large coarse mesh at the beginning. I did some model
reduction for the transistors, but I still want my tool to be able to simulate
100M cells to ensure accuracy.

That makes sense. My question was more about why you need a *coarse* mesh that is so fine. If your geometry is relatively simple, but you need high resolution, then you can just take a coarse mesh with simple cells and refine it a sufficient number of times. That already works -- we have done computations on 100M cells many times.

The only reason I can see for a very fine coarse mesh is if you need to resolve geometries with lots and lots and lots of curved edges and corners.
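
Just to make that point concrete, here is a minimal sketch (with a made-up brick geometry standing in for your chip, so treat the numbers as purely illustrative) of how one gets to tens of millions of cells from a single coarse cell with a parallel::distributed::Triangulation:

#include <deal.II/base/mpi.h>
#include <deal.II/base/point.h>
#include <deal.II/distributed/tria.h>
#include <deal.II/grid/grid_generator.h>

using namespace dealii;

int main(int argc, char *argv[])
{
  Utilities::MPI::MPI_InitFinalize mpi_initialization(argc, argv, 1);

  // A single brick-shaped coarse cell standing in for a (simplified) chip.
  parallel::distributed::Triangulation<3> triangulation(MPI_COMM_WORLD);
  GridGenerator::hyper_rectangle(triangulation,
                                 Point<3>(0, 0, 0),
                                 Point<3>(10, 10, 1));

  // Each global refinement step multiplies the number of cells by 8 in 3d,
  // so 8 refinements of one coarse cell yield 8^8 ~ 16.8M cells,
  // distributed across all MPI ranks.
  triangulation.refine_global(8);
}

The coarse mesh only needs to resolve the geometry; the resolution comes from refinement.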


    Yes, I'm willing to believe this. The algorithm wasn't intended for
    meshes of this size, though we did test it with ~300k cells in 2d and we
    know that it scales like O(N). So 200 seconds seems like a long time. Is
    this in debug mode?

Unfortunately, it is not in debug mode. I guess the reorder is more like
O(N^2) or O(N^4), if I recall correctly.

Hm, that is strange. Would you be willing to share one or two of your meshes with one or a few million cells? If you can't share them publicly, can you share them with me?


It searches for the cell with the minimum number of neighbors and then
searches again recursively through its neighbors. With an increasing number of
dofs, the time increases exponentially.

Ah, I see -- that's the reordering in the Cuthill-McKee step in parallel::distributed::Triangulation. Interesting. I wonder whether we could come up with a more scalable implementation of this algorithm by building up different data structures.
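
For reference, and only as a sketch of the idea rather than the code that is actually in the library (the Graph type and function name below are made up for illustration), the Cuthill-McKee pass you describe can be written as a breadth-first traversal of the cell adjacency graph with a queue. If every cell is enqueued exactly once, the work is proportional to the number of cells times the average number of neighbors, plus the small local sorts:

#include <algorithm>
#include <deque>
#include <vector>

// Adjacency list: for each cell, the indices of its neighbor cells.
using Graph = std::vector<std::vector<unsigned int>>;

std::vector<unsigned int> cuthill_mckee_order(const Graph &graph)
{
  const unsigned int n_cells = graph.size();
  std::vector<unsigned int> order;
  order.reserve(n_cells);
  if (n_cells == 0)
    return order;

  std::vector<bool> visited(n_cells, false);

  // Start from the cell with the fewest neighbors.
  unsigned int start = 0;
  for (unsigned int c = 0; c < n_cells; ++c)
    if (graph[c].size() < graph[start].size())
      start = c;

  std::deque<unsigned int> queue;
  queue.push_back(start);
  visited[start] = true;

  while (!queue.empty())
    {
      const unsigned int cell = queue.front();
      queue.pop_front();
      order.push_back(cell);

      // Enqueue unvisited neighbors in order of increasing degree.
      std::vector<unsigned int> next;
      for (const unsigned int neighbor : graph[cell])
        if (!visited[neighbor])
          {
            visited[neighbor] = true;
            next.push_back(neighbor);
          }
      std::sort(next.begin(), next.end(),
                [&graph](const unsigned int a, const unsigned int b) {
                  return graph[a].size() < graph[b].size();
                });
      for (const unsigned int neighbor : next)
        queue.push_back(neighbor);
    }

  return order;
}

If the observed cost grows much faster than that, the expensive part is presumably in how the neighbor information or the ordering itself is stored and searched, which is exactly where different data structures could help.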


In my setup program, a steady-state 3-D thermal simulation (distributed trial)
for a problem of 5M dofs on two cores requires 200 sec for the reorder, 80 sec
for setup_system, 80 sec of solver time (PETSc-MPI), 100 sec for output,
80 sec for create_tria, and 45 sec for assembly. Each core requires 9GB of
memory. This is why I want to reduce memory usage.

Yes, I can see why the reorder step is annoying then.
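
As an aside, to see exactly where the time goes in future runs, deal.II's TimerOutput class can break a run into named sections. Here is a minimal sketch (the empty sections are placeholders for your actual code; the names just mirror the phases you quoted):

#include <deal.II/base/timer.h>

#include <iostream>

using namespace dealii;

int main()
{
  TimerOutput timer(std::cout, TimerOutput::summary, TimerOutput::wall_times);

  {
    TimerOutput::Scope t(timer, "create_tria");
    // ... create and partition the triangulation ...
  }
  {
    TimerOutput::Scope t(timer, "reorder");
    // ... the coarse-cell reordering ...
  }
  {
    TimerOutput::Scope t(timer, "setup_system");
    // ... distribute DoFs, build sparsity patterns ...
  }
  {
    TimerOutput::Scope t(timer, "assembly");
    // ... assemble the matrix and right hand side ...
  }
  {
    TimerOutput::Scope t(timer, "solve");
    // ... the PETSc-MPI solver ...
  }
  {
    TimerOutput::Scope t(timer, "output");
    // ... write results ...
  }
}  // the wall-time summary table is printed when `timer` is destroyed

That makes it easy to compare the phases between the shared-memory and the fully distributed versions later on.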

Best
 Wolfgang

--
------------------------------------------------------------------------
Wolfgang Bangerth          email:                 bange...@colostate.edu
                           www: http://www.math.colostate.edu/~bangerth/
