Ok, message received.  Later.

- tim

On 01/22/2014 01:00 PM, Matthew Knepley wrote:
On Wed, Jan 22, 2014 at 12:59 PM, Tim Tautges (ANL) <taut...@mcs.anl.gov> wrote:

    That's funny, I was thinking the same about DMPlex. :)


Maybe you can think that on the moab-dev list.

    Matt

    - tim


    On 01/22/2014 12:45 PM, Matthew Knepley wrote:

        Tim,

            I do not consider MOAB a real alternative here.

               Matt

        On Wed, Jan 22, 2014 at 12:18 PM, Tim Tautges (ANL) <taut...@mcs.anl.gov> wrote:



             On 01/21/2014 05:58 PM, Gorman, Gerard J wrote:



                          I am not sure if Exodus has a good solution here. As
                          far as I understand, Exodus is inherently sequential,
                          even when implemented with HDF5 instead of netCDF. I
                          would also worry about third-party support for Exodus
                          files using HDF5 as their storage format. Exodus has a
                          parallel extension called Nemesis, but I can't figure
                          out how their concept of ghost points and cells works.
                          The documentation on this point is really unclear.



                  I have to admit I was kind of hoping that the ExodusII folks
                  would have come on a bit more on the parallel I/O front (I'm
                  assuming those guys also run large simulations...). That said,
                  I see this as a two-stage process: first integrate with
                  DMPlex, as that should give the high-level abstraction for
                  read/write to file; secondly, extend the family of
                  readers/writers. At least this way there will be some agility
                  and interoperability between different formats, and it will
                  not be too disruptive to the application codes when a
                  different format is adopted.
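
                  For concreteness, a rough sketch of the kind of DMPlex-level
                  round trip being suggested: read a serial ExodusII file, then
                  write through PETSc's HDF5 viewer. This assumes a PETSc build
                  with ExodusII and HDF5 support; the exact creation routine and
                  its arguments vary between PETSc versions, and error checking
                  is omitted.

                    #include <petscdmplex.h>
                    #include <petscviewerhdf5.h>

                    int main(int argc, char **argv)
                    {
                      DM          dm;
                      PetscViewer viewer;

                      PetscInitialize(&argc, &argv, NULL, NULL);
                      /* Read a (serial) ExodusII mesh into a DMPlex */
                      DMPlexCreateFromFile(PETSC_COMM_WORLD, "mesh.exo",
                                           PETSC_TRUE, &dm);
                      /* Write the same mesh back out through the HDF5
                         viewer, independent of the input format */
                      PetscViewerHDF5Open(PETSC_COMM_WORLD, "mesh.h5",
                                          FILE_MODE_WRITE, &viewer);
                      DMView(dm, viewer);
                      PetscViewerDestroy(&viewer);
                      DMDestroy(&dm);
                      PetscFinalize();
                      return 0;
                    }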


              My impression is that the ExodusII people are working within the
              context of code frameworks more than disk file formats to do
              this, e.g. in Trilinos and Sierra.  I don't think the ExoII file
              format by itself is very conducive to representing parallelism,
              which is why Nemesis writes an annotation (though I haven't
              followed ExoII developments closely since they went open source
              several years back).




                          b. Do "poor man" parallel I/O where each CPU does its
                          own I/O, and possibly create interface-matching files
                          à la Nemesis or SILO. Maybe we can save enough
                          information on the parallel layout in order to easily
                          write an un-partitioner as a post-processor.


                  I am pretty sure that if we are writing everything in slabs to
                  an HDF5 container we do not have to worry too much about the
                  parallel layout, although some clear optimisations are
                  possible. In the worst case it is a three-stage process where
                  we perform a parallel read of the connectivity, scatter/gather
                  for continuous numbering, parallel repartitioning, and a
                  subsequent parallel read of the remaining data. Importantly,
                  it is at least scalable.
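
                  As a hedged illustration of the first stage, each rank could
                  collectively read a contiguous slab of the connectivity table
                  from a shared HDF5 file along these lines (the dataset name
                  "connectivity" and its [nelems x nodes-per-elem] layout are
                  invented for the example):

                    #include <stdlib.h>
                    #include <mpi.h>
                    #include <hdf5.h>

                    /* Collectively read this rank's contiguous slab of a
                       2-D connectivity dataset from a shared HDF5 file. */
                    void read_connectivity_slab(const char *fname, MPI_Comm comm)
                    {
                      int rank, size;
                      MPI_Comm_rank(comm, &rank);
                      MPI_Comm_size(comm, &size);

                      hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
                      H5Pset_fapl_mpio(fapl, comm, MPI_INFO_NULL);
                      hid_t file = H5Fopen(fname, H5F_ACC_RDONLY, fapl);
                      hid_t dset = H5Dopen2(file, "connectivity", H5P_DEFAULT);

                      hid_t   fspace = H5Dget_space(dset);
                      hsize_t dims[2];
                      H5Sget_simple_extent_dims(fspace, dims, NULL);

                      /* Split elements evenly; remainder to the last rank */
                      hsize_t chunk    = dims[0] / size;
                      hsize_t start[2] = { rank * chunk, 0 };
                      hsize_t count[2] = { rank == size - 1 ?
                                           dims[0] - start[0] : chunk,
                                           dims[1] };

                      hid_t mspace = H5Screate_simple(2, count, NULL);
                      H5Sselect_hyperslab(fspace, H5S_SELECT_SET,
                                          start, NULL, count, NULL);

                      long *conn = malloc(count[0] * count[1] * sizeof(long));
                      hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
                      H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
                      H5Dread(dset, H5T_NATIVE_LONG, mspace, fspace,
                              dxpl, conn);

                      /* ... renumber, repartition, read remaining data ... */

                      free(conn);
                      H5Pclose(dxpl);  H5Sclose(mspace);  H5Sclose(fspace);
                      H5Dclose(dset);  H5Fclose(file);    H5Pclose(fapl);
                    }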


              We've seen fragmentation with unstructured meshes being a problem
              too, and you won't escape that even with renumbering (though
              reading then migrating would address that, at the cost of some
              additional communication and possibly reading to figure out where
              things need to go).




                      Depending on the degree of direct interaction/automation
                      between the mesh and PETSc, there are other options as
                      well.  One that we're developing, based on the MOAB
                      library, can read/write (in serial) ExodusII, and also
                      supports parallel read/write using its own HDF5-based
                      format.  Parallel I/O robustness has been iffy above ~16k
                      procs and 32M-64M hex/tet elements, but for smaller
                      problems it should work.  We're in the process of
                      developing direct support for going between a mesh
                      defined with fields (called tags in MOAB) and PETSc
                      vectors.  MOAB has pretty solid support for things like
                      computing sharing and ghosting between procs and
                      exchanging/reducing field values on those entities.  Viz
                      is supported either by compiling a VTK/ParaView plugin
                      that pulls the mesh/fields through MOAB or by translating
                      to VTK (also supported directly from MOAB); VisIt also
                      has a plugin you can enable.  See
                      http://trac.mcs.anl.gov/projects/ITAPS/wiki/MOAB for
                      details of MOAB; the PETSc integration stuff is on a
                      bitbucket branch of PETSc owned by Vijay Mahadevan.
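
                      To make the tags-to-vectors idea concrete, here is a
                      purely conceptual sketch (not MOAB's actual API): a
                      per-element field array plus a global element numbering,
                      as a mesh library might hand back, copied into a parallel
                      PETSc Vec.  The names nlocal, global_id, and field are
                      hypothetical stand-ins.

                        #include <petscvec.h>

                        /* Copy a per-element field (e.g. pulled from a mesh
                           tag) into a parallel Vec via a global numbering. */
                        PetscErrorCode field_to_vec(MPI_Comm comm,
                                                    PetscInt nlocal,
                                                    const PetscInt global_id[],
                                                    const PetscScalar field[],
                                                    Vec *v)
                        {
                          VecCreateMPI(comm, nlocal, PETSC_DETERMINE, v);
                          VecSetValues(*v, nlocal, global_id, field,
                                       INSERT_VALUES);
                          VecAssemblyBegin(*v);
                          VecAssemblyEnd(*v);
                          return 0;
                        }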


                  Another reason this could be of great interest is that MOAB
                  supports (according to the docs) geometric topology, which
                  could be used when adapting the mesh on a curved surface, for
                  example - another item on my wish list.


              Yes, MOAB's file format can handle definitions of groupings (and
              relations between the groups) necessary to represent geometric
              topology.  What you use for the shape evaluation of those
              geometric surfaces is another question though.  If you're wanting
              to do that using a reconstructed continuous patch, MOAB has some
              code to do that too, though it's not as good as the latest stuff
              in that area (from Jiao at Stony Brook).



                  Is it integrated into PETSc via the plex, or does this
                  essentially replace the functionality of the plex?


              It's not via plex, but I'm pretty sure all the mesh-related
              functionality available through plex is available through
              different API functions in MOAB.


                  Why does it break down for more than 16k procs?


              It's a combination of things:
              - maximizing generality means we're using more than just 2 or 3
                tables, because in addition to nodes and elements we need
                groupings, whose membership determines whether a group is
                resident on a given processor, etc., and that strongly affects
                scaling
              - that same generality causes us to hit an MPI-IO bug on IBM
                (though we haven't checked on Q yet to see if that's been
                addressed; it might have been); we've worked with ANL I/O guys
                off and on on this, and hope to get back to that on Q soon
              - we do single-file parallel I/O, without any 2-phase
                (communicate down to I/O nodes then do I/O), and that hits HDF5
                pretty hard; we're working with the HDF Group to explore that

              We haven't done any benchmarking on a Lustre system yet, but I
              expect that to do worse than IBM, because of the many-tables
              thing (my impression is that Lustre doesn't handle frequent
              metadata reads well).
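
              One knob that is sometimes tried on both fronts is passing MPI-IO
              hints through the HDF5 file-access property list, e.g. asking
              ROMIO for two-phase (collective buffering) writes and for wider
              Lustre striping.  A hedged sketch follows; the hint names are
              ROMIO/Lustre conventions, the values are placeholders, and
              whether any of it helps is very much system dependent.

                #include <mpi.h>
                #include <hdf5.h>

                /* File-access property list requesting two-phase writes and
                   wider Lustre striping via MPI-IO hints (values are only
                   placeholders; tune per system). */
                hid_t fapl_with_hints(MPI_Comm comm)
                {
                  MPI_Info info;
                  MPI_Info_create(&info);
                  MPI_Info_set(info, "romio_cb_write", "enable"); /* 2-phase */
                  MPI_Info_set(info, "cb_nodes", "64");      /* aggregators */
                  MPI_Info_set(info, "striping_factor", "32"); /* Lustre OSTs */
                  MPI_Info_set(info, "striping_unit", "4194304"); /* 4 MB */

                  hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
                  H5Pset_fapl_mpio(fapl, comm, info);
                  MPI_Info_free(&info);
                  return fapl;
                }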


                  Is it just a case that Lustre gets hammered? What magic sauce
                  is used by high-order FEM codes such as Nek5000 that can run
                  on ~1M cores?


              Those codes go for a much more restricted I/O data case, which
              allows them to specialize and do their own implementation of
              parallel I/O.  So, Nek5000 has its own implementation of poor
              man's parallel: they repeat vertices in the file that are
              logically the same (shared between hexes), and they don't really
              do subsets.  I think that's great to do if you have to, but I'm
              still hoping for more support in that direction from general
              libraries.
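
              For what it's worth, the "poor man's parallel" pattern in its
              simplest form is just each rank writing its own file.  A minimal
              sketch, with the file naming and on-disk layout invented purely
              for illustration:

                #include <stdio.h>
                #include <mpi.h>

                /* Each rank writes its own piece of the field to its own
                   file.  Simple and fast, but recombination and subsetting
                   are left to post-processing. */
                void write_rank_file(MPI_Comm comm, const double *field,
                                     size_t n)
                {
                  int  rank;
                  char fname[64];
                  MPI_Comm_rank(comm, &rank);
                  snprintf(fname, sizeof(fname), "solution.%06d.bin", rank);
                  FILE *fp = fopen(fname, "wb");
                  fwrite(field, sizeof(double), n, fp);
                  fclose(fp);
                }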



                      libMesh also maintains its own DMlibmesh, but I'm not
                      sure how solid their support for large mesh / parallel
                      I/O is (but they've been working on it recently, I know).



                  Are there any other formats that we should be considering?
                  It's a few years since I tried playing about with CGNS - at
                  the time its parallel I/O was non-existent and I have not
                  seen it being pushed since. XDMF looks interesting as it is
                  essentially some XML metadata and an HDF5 bucket. Is anyone
                  championing this?
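
                  To illustrate that "XML metadata plus HDF5 bucket" structure,
                  here is a sketch that emits a minimal XDMF wrapper pointing
                  at heavy data already sitting in an HDF5 file (the dataset
                  names, sizes, and hexahedral topology are invented):

                    #include <stdio.h>

                    /* Write a minimal XDMF file whose DataItems point into an
                       existing HDF5 file (names/sizes are made up). */
                    void write_xdmf(const char *xmf, const char *h5,
                                    long ncells, long nnodes)
                    {
                      FILE *fp = fopen(xmf, "w");
                      fprintf(fp, "<?xml version=\"1.0\"?>\n"
                                  "<Xdmf Version=\"2.0\">\n <Domain>\n"
                                  "  <Grid Name=\"mesh\">\n");
                      fprintf(fp, "   <Topology TopologyType=\"Hexahedron\""
                                  " NumberOfElements=\"%ld\">\n"
                                  "    <DataItem Dimensions=\"%ld 8\""
                                  " Format=\"HDF\">%s:/connectivity"
                                  "</DataItem>\n   </Topology>\n",
                              ncells, ncells, h5);
                      fprintf(fp, "   <Geometry GeometryType=\"XYZ\">\n"
                                  "    <DataItem Dimensions=\"%ld 3\""
                                  " Format=\"HDF\">%s:/coordinates"
                                  "</DataItem>\n   </Geometry>\n",
                              nnodes, h5);
                      fprintf(fp, "  </Grid>\n </Domain>\n</Xdmf>\n");
                      fclose(fp);
                    }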


              Don't know about XDMF.  I know there's been a bunch of work on
              SILO and its parallel performance fairly recently (3 or 4 years
              ago), and it's used heavily inside LLNL.

             - tim

                 Cheers Gerard


--
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener

--
================================================================
"You will keep in perfect peace him whose mind is
  steadfast, because he trusts in you."               Isaiah 26:3

             Tim Tautges            Argonne National Laboratory
         (taut...@mcs.anl.gov)      (telecommuting from UW-Madison)
 phone (gvoice): (608) 354-1459      1500 Engineering Dr.
            fax: (608) 263-4499      Madison, WI 53706
