On Wed, Jan 22, 2014 at 12:59 PM, Tim Tautges (ANL) <taut...@mcs.anl.gov> wrote:
> That's funny, I was thinking the same about DMPlex. :)

Maybe you can think that on the moab-dev list.

   Matt

> - tim
>
> On 01/22/2014 12:45 PM, Matthew Knepley wrote:
>
>> Tim,
>>
>> I do not consider MOAB a real alternative here.
>>
>>    Matt
>>
>> On Wed, Jan 22, 2014 at 12:18 PM, Tim Tautges (ANL) <taut...@mcs.anl.gov> wrote:
>>
>>> On 01/21/2014 05:58 PM, Gorman, Gerard J wrote:
>>>
>>>>> I am not sure if Exodus has a good solution here. As far as I understand, Exodus is
>>>>> inherently sequential, even when implemented with HDF5 instead of netCDF. I would also
>>>>> worry about third-party support for Exodus files using HDF5 as their storage format.
>>>>> Exodus has a parallel extension called Nemesis, but I can't figure out how their concept
>>>>> of ghost points and cells works. The documentation on this point is really unclear.
>>>>
>>>> I have to admit I was kind of hoping that the ExodusII folks would have come on a bit more
>>>> on the parallel I/O front (I'm assuming those guys also run large simulations...). That
>>>> said, I see this as a two-stage process: first, integrate with DMPlex, as that should give
>>>> the high-level abstraction for read/write to file; secondly, extend the family of
>>>> readers/writers. At least this way there will be some agility and interoperability between
>>>> different formats, and it will not be too disruptive to the application codes when a
>>>> different format is adopted.
>>>
>>> My impression is that the ExodusII people are working within the context of code frameworks
>>> more than disk file formats to do this, e.g. in Trilinos and Sierra. I don't think the ExoII
>>> file format by itself is very conducive to representing parallel data, which is why Nemesis
>>> writes an annotation (though I haven't followed ExoII developments closely since they went
>>> open source several years back).
>>>
>>>>> b. Do "poor man's" parallel I/O, where each CPU does its own I/O, and possibly create
>>>>> interface-matching files à la Nemesis or SILO. Maybe we can save enough information on the
>>>>> parallel layout in order to easily write an un-partitioner as a post-processor.
>>>>
>>>> I am pretty sure that if we are writing everything in slabs to an HDF5 container we do not
>>>> have to worry too much about the parallel layout, although some clear optimisations are
>>>> possible. In the worst case it is a three-stage process where we perform a parallel read of
>>>> the connectivity, scatter/gather for continuous numbering, parallel repartitioning, and a
>>>> subsequent parallel read of the remaining data. Importantly, it is at least scalable.
>>>
>>> We've seen fragmentation with unstructured meshes being a problem too, and you won't escape
>>> that even with renumbering (though reading then migrating would address that, at the cost of
>>> some additional communication and possibly reading to figure out where things need to go).
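To make the slab-style approach Gerard describes concrete, here is a minimal sketch of a
collective read of a shared connectivity dataset through the parallel HDF5 API, assuming an
even split of rows across ranks. The file name "mesh.h5", the dataset name "/connectivity",
and the row-block decomposition are illustrative placeholders, not anything specified in this
thread.

// Sketch only: collective, slab-wise read of a (num_cells x nodes_per_cell)
// connectivity dataset from a single shared HDF5 file. "mesh.h5" and
// "/connectivity" are hypothetical names.
#include <hdf5.h>
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // Open the shared file through the MPI-IO virtual file driver.
  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
  H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
  hid_t file = H5Fopen("mesh.h5", H5F_ACC_RDONLY, fapl);

  hid_t dset   = H5Dopen2(file, "/connectivity", H5P_DEFAULT);
  hid_t fspace = H5Dget_space(dset);
  hsize_t dims[2];
  H5Sget_simple_extent_dims(fspace, dims, NULL);

  // Each rank selects a contiguous slab of rows; the last rank takes the remainder.
  hsize_t chunk    = dims[0] / size;
  hsize_t start[2] = { (hsize_t)rank * chunk, 0 };
  hsize_t count[2] = { (rank == size - 1) ? dims[0] - start[0] : chunk, dims[1] };
  H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
  hid_t mspace = H5Screate_simple(2, count, NULL);

  // Collective transfer, so MPI-IO can aggregate the per-rank requests.
  hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
  H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);

  std::vector<long long> conn(count[0] * count[1]);
  H5Dread(dset, H5T_NATIVE_LLONG, mspace, fspace, dxpl, conn.data());

  // ... renumber/repartition cells, then read the remaining datasets the same way ...

  H5Pclose(dxpl); H5Sclose(mspace); H5Sclose(fspace);
  H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
  MPI_Finalize();
  return 0;
}

How much the collective transfer actually helps depends on the filesystem and the MPI-IO
hints in use, which is essentially the two-phase aggregation issue Tim raises further down.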
>>> Depending on the degree of direct interaction/automation in those interactions between the
>>> mesh and PETSc, there are other options as well. One that we're developing, based on the MOAB
>>> library, can read/write (in serial) ExodusII, and also supports parallel read/write using its
>>> own HDF5-based format. Parallel I/O robustness has been iffy above ~16k procs and 32M-64M
>>> hex/tet elements, but for smaller problems it should work. We're in the process of developing
>>> direct support for going between a mesh defined with fields (called tags in MOAB) and PETSc
>>> vectors. MOAB has pretty solid support for things like computing sharing and ghosting between
>>> procs and exchanging/reducing field values on those entities. Viz is supported either by
>>> compiling a VTK/ParaView plugin that pulls the mesh/fields through MOAB or by translating to
>>> VTK (also supported directly from MOAB); VisIt also has a plugin you can enable. See
>>> http://trac.mcs.anl.gov/projects/ITAPS/wiki/MOAB for details of MOAB; the PETSc integration
>>> stuff is on a bitbucket branch of petsc owned by Vijay Mahadevan.
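As a rough illustration of the sharing/ghosting and tag-exchange support described above, the
sketch below loads a pre-partitioned MOAB .h5m file in parallel, asks the reader to resolve
shared entities and build one ghost layer, and then pushes a tag from owned elements to their
ghost copies. The file name "mesh.h5m", the "temperature" tag, and the exact option string are
assumptions for illustration; check the MOAB parallel I/O documentation for the options
supported by a given build.

// Hedged sketch: parallel load of a partitioned MOAB mesh plus ghost/tag
// exchange. "mesh.h5m" and the "temperature" tag are hypothetical.
#include <mpi.h>
#include "moab/Core.hpp"
#include "moab/ParallelComm.hpp"

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  {
    moab::Core mb;

    // Read this rank's parts, resolve entities shared across part boundaries,
    // and build one layer of 3D ghost elements bridged through vertices.
    const char* opts =
        "PARALLEL=READ_PART;PARTITION=PARALLEL_PARTITION;"
        "PARALLEL_RESOLVE_SHARED_ENTS;PARALLEL_GHOSTS=3.0.1";
    if (mb.load_file("mesh.h5m", 0, opts) != moab::MB_SUCCESS)
      MPI_Abort(MPI_COMM_WORLD, 1);

    // The parallel reader creates a ParallelComm instance; retrieve it.
    moab::ParallelComm* pcomm = moab::ParallelComm::get_pcomm(&mb, 0);

    // A dense, double-valued tag on elements (a "field" in the discussion above).
    moab::Tag temp_tag;
    double def_val = 0.0;
    mb.tag_get_handle("temperature", 1, moab::MB_TYPE_DOUBLE, temp_tag,
                      moab::MB_TAG_DENSE | moab::MB_TAG_CREAT, &def_val);

    // ... compute and tag_set_data() on locally owned elements here ...

    // Push owned values onto the ghost copies on neighboring processes.
    moab::Range elems;
    mb.get_entities_by_dimension(0, 3, elems);
    pcomm->exchange_tags(temp_tag, elems);
  }
  MPI_Finalize();
  return 0;
}

ParallelComm also provides a reduce-style tag exchange for combining contributions on shared
entities, which covers the "reducing" case mentioned above.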
>>>> Another reason this could be of great interest is that MOAB supports (according to the docs)
>>>> geometric topology, which could be used when adapting the mesh on a curved surface, for
>>>> example - another item on my wish list.
>>>
>>> Yes, MOAB's file format can handle definitions of groupings (and relations between the groups)
>>> necessary to represent geometric topology. What you use for the shape evaluation of those
>>> geometric surfaces is another question though. If you're wanting to do that using a
>>> reconstructed continuous patch, MOAB has some code to do that too, though it's not as good as
>>> the latest stuff in that area (from Jiao at Stony Brook).
>>>
>>>> Is it integrated to PETSc via the plex, or does this essentially replace the functionality of
>>>> the plex?
>>>
>>> It's not via plex, but I'm pretty sure all the mesh-related functionality available through
>>> plex is available through different API functions in MOAB.
>>>
>>>> Why does it break down for more than 16k procs?
>>>
>>> It's a combination of things:
>>> - maximizing generality means we're using more than just 2 or 3 tables, because in addition to
>>>   nodes and elements, we need groupings, whose membership determines whether a group is
>>>   resident on a given processor, etc., and that strongly affects scaling
>>> - that same generality causes us to hit an MPI/IO bug on IBM (though we haven't checked on Q
>>>   yet to see if that's been addressed, it might have been); we've worked with ANL I/O guys off
>>>   and on on this, and hope to get back to that on Q soon
>>> - we do single-file parallel I/O, without any 2-phase (communicate down to I/O nodes, then do
>>>   I/O), and that hits HDF5 pretty hard; we're working with the HDF Group to explore that
>>>
>>> We haven't done any benchmarking on a Lustre system yet, but I expect that to do worse than
>>> IBM, because of the many-tables thing (my impression is that Lustre doesn't handle frequent
>>> metadata reads well).
>>>
>>>> Is it just a case that Lustre gets hammered? What magic sauce is used by high-order FEM codes
>>>> such as Nek5000 that can run on ~1M cores?
>>>
>>> Those codes go for a much more restricted I/O data case, which allows them to specialize and
>>> do their own implementation of parallel I/O. So, Nek5000 has its own implementation of poor
>>> man's parallel, they repeat vertices in the file that are logically the same (shared between
>>> hexes), and they don't really do subsets. I think that's great to do if you have to, but I'm
>>> still hoping for more support in that direction from general libraries.
>>>
>>>>> libMesh also maintains its own DMlibmesh, but I'm not sure how solid their support for large
>>>>> mesh / parallel I/O is (but they've been working on it recently, I know).
>>>>
>>>> Are there any other formats that we should be considering? It's a few years since I tried
>>>> playing about with CGNS - at the time its parallel I/O was non-existent and I have not seen it
>>>> being pushed since. XDMF looks interesting as it is essentially some XML metadata and an HDF5
>>>> bucket. Is anyone championing this?
>>>
>>> Don't know about XDMF. I know there's been a bunch of work on SILO and its parallel performance
>>> fairly recently (3 or 4 yrs ago), and it's used heavily inside LLNL.
>>>
>>> - tim
>>>
>>>> Cheers
>>>> Gerard
>
> --
> ================================================================
> "You will keep in perfect peace him whose mind is
>   steadfast, because he trusts in you."               Isaiah 26:3
>
>              Tim Tautges            Argonne National Laboratory
>          (taut...@mcs.anl.gov)      (telecommuting from UW-Madison)
>          phone (gvoice): (608) 354-1459      1500 Engineering Dr.
>                     fax: (608) 263-4499      Madison, WI 53706

--
What most experimenters take for granted before they begin their experiments is infinitely
more interesting than any results to which their experiments lead.
-- Norbert Wiener