I don't think this is the default per se.  I use batchtools (instead of simfactory) for exactly this reason, and I discourage new users from restarting in the same directory as the parent checkpoints to avoid this exact outcome.  An additional issue can arise if a job is terminated before the walltime: the data stored in ASCII/HDF5 then extends beyond the last checkpoint, so after the restart the overlapping data no longer match, which can cause ingestion problems.  I think overall Kuibit handles this well, but it has been an issue in the past for some users before they learned to separate restarts into individual directories.
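
To illustrate the overlap problem, here is a minimal sketch of stitching per-restart time series together, assuming each restart wrote its own list of (time, value) samples with increasing times. The function name merge_restarts and the data layout are hypothetical; this is not how Kuibit or any other tool necessarily does it.

```python
def merge_restarts(segments):
    """Merge (time, value) samples from successive restarts.

    segments: list of lists of (time, value) tuples, in restart order,
    each with increasing times (an assumption of this sketch).
    """
    merged = []
    for segment in segments:
        if not segment:
            continue
        restart_time = segment[0][0]
        # Samples written after the last checkpoint of the previous run
        # are recomputed by the restart, so drop the stale copies.
        while merged and merged[-1][0] >= restart_time:
            merged.pop()
        merged.extend(segment)
    return merged
```

The later restart always wins for overlapping times, which is only possible if the segments can be told apart, i.e. if each restart wrote to its own directory or file.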

Cheers,

Samuel

On 12/16/22 10:01 AM, Bruno Giacomazzo wrote:

    - Safety feature to prevent HDF5 files from being corrupted
      * Leo requests a feature that would allow the user to, e.g., generate
        one output file per restart. With kuibit, there was interest in
        switching from the ASCII data files to HDF5 in our research group.
        However, in a recent simulation it turned out that a node failure
        caused a crash as one of the HDF5 files was being written to, and
        we lost all data for an important gridfunction. If one HDF5 file
        had been written per restart (or another safety feature had been in
        place), then this would not have been an issue, as only one of the
        chunks of data would have been corrupted. Leo will open a ticket
        about this.


Isn't this done automatically when using simfactory? I have my HDF5 data written in the separate output-00?? directories (the ones generated by simfactory at each restart) so that if one run has problems I do not lose all the data.
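
As a rough sketch of what that layout gives you, the snippet below groups HDF5 files by restart directory. It assumes restart directories named output-NNNN (four digits) directly under the simulation directory and files ending in .h5; the function name hdf5_files_by_restart is hypothetical and not part of simfactory.

```python
import re
from pathlib import Path


def hdf5_files_by_restart(sim_dir, pattern="*.h5"):
    """Map restart number -> sorted list of HDF5 files in that restart."""
    # Matches output-0000, output-0001, ... but not output-0001-active.
    restart_dir = re.compile(r"output-(\d{4})$")
    result = {}
    for path in sorted(Path(sim_dir).iterdir()):
        match = restart_dir.match(path.name)
        if match and path.is_dir():
            # Search recursively, since files may sit in subdirectories.
            result[int(match.group(1))] = sorted(path.rglob(pattern))
    return result
```

If one restart's file turns out to be unreadable, only that entry needs to be skipped and the other restarts remain usable.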

Cheers,
Bruno


--

Prof. Bruno Giacomazzo
Department of Physics
University of Milano-Bicocca
Piazza della Scienza 3
20126 Milano
Italy

email: bruno.giacoma...@unimib.it
phone: (+39) 02 6448 2321
web: http://www.brunogiacomazzo.org

---------------------------------------------------------------------
There are only 10 types of people in the world:
Those who understand binary, and those who don't
----------------------------------------------------------------------


_______________________________________________
Users mailing list
Users@einsteintoolkit.org
http://lists.einsteintoolkit.org/mailman/listinfo/users