Hi Matt,

I'm not sure whether this is related to the issue I filed, https://gitlab.com/petsc/petsc/issues/423, where unbalanced partitioning was observed even on a very tiny mesh created in petsc/src/dm/impls/plex/examples/tutorials/ex7.c.
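For reference, a minimal standalone sketch (not the actual ex7.c) of the kind of reproducer involved: it builds a small DMPlex mesh, distributes it, and views the result so the per-rank partition can be inspected. The 2D box mesh here is only an illustrative stand-in for the mesh in the issue; the calls follow the PETSc 3.11/3.12-era API.

    #include <petscdmplex.h>

    int main(int argc, char **argv)
    {
      DM             dm, dmDist = NULL;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
      /* Illustrative stand-in mesh: a small 2D simplex box (NULLs take defaults) */
      ierr = DMPlexCreateBoxMesh(PETSC_COMM_WORLD, 2, PETSC_TRUE, NULL, NULL, NULL,
                                 NULL, PETSC_TRUE, &dm);CHKERRQ(ierr);
      /* Distribute with the default partitioner; 0 = no cell overlap */
      ierr = DMPlexDistribute(dm, 0, NULL, &dmDist);CHKERRQ(ierr);
      if (dmDist) { ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmDist; }
      /* e.g. -dm_view ::ascii_info_detail shows what each rank owns */
      ierr = DMViewFromOptions(dm, NULL, "-dm_view");CHKERRQ(ierr);
      ierr = DMDestroy(&dm);CHKERRQ(ierr);
      return PetscFinalize();
    }

Running something like this on a few ranks across PETSc versions makes differences in the cell balance easy to see.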
Thanks,
Valeria

On Fri, Oct 18, 2019 at 6:21 PM <[email protected]> wrote:

> Date: Fri, 18 Oct 2019 20:20:13 -0400
> From: Matthew Knepley <[email protected]>
> To: Danyang Su <[email protected]>
> Cc: "Smith, Barry F." <[email protected]>, Mark Lohry <[email protected]>,
>     PETSc <[email protected]>
> Subject: Re: [petsc-users] Strange Partition in PETSc 3.11 version on
>     some computers
>
> On Fri, Oct 18, 2019 at 5:53 PM Danyang Su <[email protected]> wrote:
>
> > Hi All,
> >
> > I am now able to reproduce the partition problem using a relatively
> > small mesh (attached). The mesh consists of 9087 nodes and 15656 prism
> > cells. There are 39 layers with 233 nodes in each layer. I have tested
> > the partitioning using PETSc as well as Gmsh 3.0.1.
>
> Great job finding a good test case. Can you send me that mesh?
>
>   Thanks,
>
>      Matt
>
> > Taking 4 partitions as an example, the partitions from PETSc 3.9 and
> > 3.10 are reasonable, though not perfect, with a ratio of ghost nodes to
> > total nodes of 2754 / 9087.
> >
> > The partitions from PETSc 3.11, PETSc 3.12, and petsc-dev look weird,
> > with a ghost-node to total-node ratio of 12413 / 9087. The nodes
> > assigned to the same processor are not well connected.
> >
> > Note: the z axis is scaled by 25 for better visualization in ParaView.
> >
> > The partition from Gmsh-Metis is a bit different but still quite
> > similar to those from PETSc 3.9 and 3.10.
> >
> > Finally, the partition using the Gmsh-Chaco Multilevel-KL algorithm is
> > the best one, with a ghost-node to total-node ratio of 741 / 9087. For
> > most of my simulation cases, which use much larger meshes, PETSc 3.9
> > and 3.10 generate partitions similar to the one below, which work
> > pretty well and let the code get very good speedup.
> >
> > Thanks,
> >
> > Danyang
> >
> > On 2019-09-18 11:44 a.m., Danyang Su wrote:
> > > On 2019-09-18 10:56 a.m., Smith, Barry F. via petsc-users wrote:
> > > > > On Sep 18, 2019, at 12:25 PM, Mark Lohry via petsc-users
> > > > > <[email protected]> wrote:
> > > >
> > > > Mark,
> > > >
> > > > Good point. This has been a big headache forever.
> > > >
> > > > Note that this has been "fixed" in the master version of PETSc and
> > > > will be in its next release. If you use --download-parmetis in the
> > > > future, it will use the same random numbers on all machines and
> > > > thus should produce the same partitions on all machines.
> > > >
> > > > I think that Metis has always used the same random numbers on all
> > > > machines and thus has always produced the same results.
> > > >
> > > >   Barry
> > >
> > > Good to know this. I will use the same configuration that causes the
> > > strange partition problem to test the next version.
> > >
> > > Thanks,
> > >
> > > Danyang
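As an aside, for anyone comparing partitioners across versions the way Danyang describes: a minimal sketch, assuming dm already holds the serial DMPlex mesh and ierr is declared, of pinning the partitioner explicitly rather than relying on the version-dependent default.

    PetscPartitioner part;
    DM               dmDist = NULL;

    ierr = DMPlexGetPartitioner(dm, &part);CHKERRQ(ierr);
    /* Pin a specific partitioner, e.g. ParMetis (Chaco, PTScotch, etc. also exist) */
    ierr = PetscPartitionerSetType(part, PETSCPARTITIONERPARMETIS);CHKERRQ(ierr);
    /* Let -petscpartitioner_type on the command line override the choice */
    ierr = PetscPartitionerSetFromOptions(part);CHKERRQ(ierr);
    ierr = DMPlexDistribute(dm, 0, NULL, &dmDist);CHKERRQ(ierr);
    if (dmDist) { ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmDist; }

Holding the partitioner fixed while changing PETSc versions separates the partitioner's own behavior from any change in the default selection.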
> > > > > > The machine, compiler and MPI version should not matter.
> > > > >
> > > > > I might have missed something earlier in the thread, but ParMetis
> > > > > has a dependency on the machine's glibc srand, and it can (and
> > > > > does) create different partitions with different srand versions.
> > > > > The same mesh with the same code on the same process count can and
> > > > > will give different partitions (possibly bad ones) on different
> > > > > machines.
> > > > >
> > > > > On Tue, Sep 17, 2019 at 1:05 PM Mark Adams via petsc-users
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > On Tue, Sep 17, 2019 at 12:53 PM Danyang Su
> > > > > > <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Mark,
> > > > > > >
> > > > > > > Thanks for your follow-up.
> > > > > > >
> > > > > > > The unstructured-grid code has been verified and there is no
> > > > > > > problem in the results. The convergence rate is also good. The
> > > > > > > 3D mesh is not good (it is based on the original stratum,
> > > > > > > which I haven't refined), but it is fine for an initial test
> > > > > > > since it is relatively small and the results obtained from it
> > > > > > > still make sense.
> > > > > > >
> > > > > > > The 2D meshes are just for testing purposes, as I wanted to
> > > > > > > reproduce the partition problem on a cluster using PETSc
> > > > > > > 3.11.3 and Intel 2019. Unfortunately, I could not reproduce
> > > > > > > the problem using that example.
> > > > > > >
> > > > > > > The code has no problem using different PETSc versions (PETSc
> > > > > > > 3.4 to 3.11)
> > > > > >
> > > > > > OK, it is the same code. I thought I saw something about your
> > > > > > code changing.
> > > > > >
> > > > > > Just to be clear, v3.11 never gives you good partitions. It is
> > > > > > not just a problem on this Intel cluster. The machine, compiler,
> > > > > > and MPI version should not matter.
> > > > > >
> > > > > > > and MPI distributions (MPICH, OpenMPI, IntelMPI), except for
> > > > > > > one simulation case (the mesh I attached) on a cluster with
> > > > > > > PETSc 3.11.3 and Intel 2019u4, which gives a very different
> > > > > > > partition compared to PETSc 3.9.3. Yet the simulation results
> > > > > > > are the same, apart from the efficiency problem, because the
> > > > > > > strange partition results in much more communication (ghost
> > > > > > > nodes).
> > > > > > >
> > > > > > > I am still trying different compilers and MPI implementations
> > > > > > > with PETSc 3.11.3 on that cluster to trace the problem. I will
> > > > > > > get back to you when there is an update.
> > > > > >
> > > > > > This is very strange. You might want to use 'git bisect'. You
> > > > > > set a good and a bad SHA1 (we can give you these for 3.9 and
> > > > > > 3.11, and the exact commands), and git checks out a version in
> > > > > > the middle. You then reconfigure, remake, rebuild your code, and
> > > > > > run your test. Git then asks you, as I recall, whether the
> > > > > > version is good or bad. Once you get this workflow going it is
> > > > > > not too bad, depending on how hard this loop is, of course.
> > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Danyang
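For anyone wanting to try Mark's suggestion, a rough sketch of that bisect loop (the tags below are illustrative; the PETSc developers would supply the exact good/bad SHA1s):

    git clone https://gitlab.com/petsc/petsc.git && cd petsc
    git bisect start
    git bisect bad  v3.11.0     # a version known to produce the bad partition
    git bisect good v3.9.0      # a version known to produce a good partition
    # git now checks out a commit in the middle; reconfigure, rebuild PETSc
    # and your code, run the test case, then report the result:
    git bisect good             # or: git bisect bad
    git bisect reset            # when finished

Each round halves the remaining commit range, so the loop converges in roughly log2(#commits) iterations, typically a dozen or so.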
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
>
>   -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
> [Attachment: basin-3d-dgr20000.png (image/png)]
> [Attachment: gmsh-partition-metis.png (image/png)]
> [Attachment: gmsh-partition-Chaco.png (image/png)]

--
Valeria Barra
Postdoctoral Research Associate
Department of Computer Science
University of Colorado at Boulder
https://csel-web.cs.colorado.edu/~vaba3353
