When using simfactory, if I set the output directory for the HDF5 files to a relative path in my parameter file (e.g., CarpetIOHDF5::out2D_dir = "./hdf5_2D"), then this directory is created inside the output-00?? directory. Each restart therefore creates its own HDF5 directory in the corresponding output-00?? directory. This is how I avoid losing all data if one run has problems.
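To check that each restart really has its own set of files, something like the following Python sketch can help. It is only an illustration: the function name hdf5_files_by_restart is made up here, the default directory name "hdf5_2D" is taken from the example above, and I am assuming that the files end in ".h5" and that the restart directories are named output-NNNN with four digits.

```python
from pathlib import Path


def hdf5_files_by_restart(sim_dir, out_dir="hdf5_2D"):
    """Map each restart directory name to the HDF5 files found inside it.

    Hypothetical helper: `out_dir` is whatever relative directory name was
    set in the parameter file, and the ".h5" suffix is an assumption.
    """
    files = {}
    # The pattern matches only names of the form output-NNNN, so other
    # entries in the simulation directory are ignored.
    for restart in sorted(Path(sim_dir).glob("output-[0-9][0-9][0-9][0-9]")):
        if not restart.is_dir():
            continue
        # rglob searches at any depth, so it does not matter whether the
        # HDF5 directory sits directly in output-NNNN or one level below.
        files[restart.name] = sorted(restart.rglob(f"{out_dir}/*.h5"))
    return files


if __name__ == "__main__":
    import sys

    sim = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, paths in hdf5_files_by_restart(sim).items():
        print(f"{name}: {len(paths)} HDF5 file(s)")
```

A restart that is listed with zero files is worth a closer look, since it may have crashed before writing any output.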
Cheers,
Bruno

On Fri, 16 Dec 2022 at 10:10, Samuel Tootle <[email protected]> wrote:

> I don't think this is the default per se. I use batchtools (instead of
> simfactory) exactly for this reason, and I discourage new users from
> restarting in the same directory as the parent checkpoints to avoid this
> exact outcome. An additional issue can arise if a job is terminated
> before the walltime, so that the data stored in ASCII/HDF5 goes beyond
> the last checkpoint: there are then potential ingestion issues due to
> data mismatching after the restart. I think overall Kuibit handles this
> well, but it has been an issue in the past for some users before they
> learned to separate restarts into individual directories.
>
> Cheers,
>
> Samuel
>
> On 12/16/22 10:01 AM, Bruno Giacomazzo wrote:
>
> >> - Safety feature to avoid HDF5 files from being corrupted
> >>   * Leo requests a feature that would allow the user to, e.g.,
> >>     generate one output file per restart. With kuibit, there was
> >>     interest in switching from the ASCII data files to HDF5 in our
> >>     research group. However, in a recent simulation it turned out
> >>     that a node failure caused a crash as one of the HDF5 files was
> >>     being written to, and we lost all data for an important
> >>     gridfunction. If one HDF5 file was written per restart (or
> >>     another safety feature was in place), then this would not have
> >>     been an issue, as only one of the chunks of data would have been
> >>     corrupted. Leo will open a ticket about this.
> >
> > Isn't this done automatically when using simfactory? I have my HDF5
> > data written in the separate output-00?? directories (the ones
> > generated by simfactory at each restart) so that if one run has
> > problems I do not lose all the data.
> >
> > Cheers,
> > Bruno

--
Prof. Bruno Giacomazzo
Department of Physics
University of Milano-Bicocca
Piazza della Scienza 3
20126 Milano
Italy

email: [email protected]
phone: (+39) 02 6448 2321
web: http://www.brunogiacomazzo.org

---------------------------------------------------------------------
There are only 10 types of people in the world:
Those who understand binary, and those who don't
----------------------------------------------------------------------
_______________________________________________
Users mailing list
[email protected]
http://lists.einsteintoolkit.org/mailman/listinfo/users
