Yes, according to 'df -Th', the file system where the error occurred is of 
type 'nfs'.

Thanks again.


Timo Heister wrote on Tuesday, April 13, 2021 at 17:50:42 UTC+2:

> Great to hear. Was it a standard NFS filesystem that produced the failures?
>
> On Tue, Apr 13, 2021 at 4:03 AM 'Christian Burkhardt' via deal.II User
> Group <dea...@googlegroups.com> wrote:
> >
> > Thanks for your quick answer.
> > The problem persisted for the parallel vtu output.
> > Changing the file system to which the output is written to xfs solved 
> the problem. Sorry for the inconvenience, but we are quite new to cluster 
> systems.
> >
> > Thanks,
> > Christian
> >
> > Timo Heister wrote on Monday, April 12, 2021 at 18:37:32 UTC+2:
> >>
> >> Christian,
> >>
> >> What kind of filesystem is this file written to? Does our parallel vtu
> >> output that also uses MPI IO work correctly? (step-40 with grouping
> >> set to a single file)
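
A minimal sketch of what such a single-file vtu write looks like, assuming
deal.II 9.2, a DataOut object that has already been filled via
add_data_vector() and build_patches(), and a made-up helper name and file
names:

#include <deal.II/base/mpi.h>
#include <deal.II/numerics/data_out.h>

// Sketch only: "data_out", "mpi_communicator", and "cycle" stand in for the
// corresponding objects of the application.
template <int dim>
void write_single_vtu(const dealii::DataOut<dim> &data_out,
                      const MPI_Comm              mpi_communicator,
                      const unsigned int          cycle)
{
  // n_groups == 1: all ranks write one shared .vtu file via MPI I/O,
  // which exercises the same MPI_File_open path as the HDF5 writer.
  data_out.write_vtu_with_pvtu_record("./",       // output directory
                                      "solution", // base file name
                                      cycle,      // counter appended to the name
                                      mpi_communicator,
                                      2,          // digits used for the counter
                                      1);         // n_groups
}

If this also fails on more than one node, the problem likely sits in the MPI
I/O layer rather than in HDF5 itself.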
> >>
> >> On Mon, Apr 12, 2021 at 12:15 PM 'Christian Burkhardt' via deal.II
> >> User Group <dea...@googlegroups.com> wrote:
> >> >
> >> > Hi everyone,
> >> >
> >> > We installed dea...@9.2.0 on our HPC cluster (CentOS 7) using Spack 
> and the Intel compilers 
> (dea...@9.2.0%in...@19.0.4~assimp~petsc~slepc~ginkgo~adol-c+mpi^intel-mpi^intel-mkl^boost).
> >> > When running our code, which uses HDF5 for output, on the front node 
> and when submitting it via the batch script, everything works fine as long 
> as we run on a single node (up to 40 cores).
> >> > As soon as we increase the number of nodes above one (e.g., 41 cores), 
> the code fails.
> >> > We were able to reproduce the problem with an adapted version of 
> step-40 of the deal.II tutorials that writes its output via HDF5 (see 
> attached step-40).
> >> > The restriction to 32 MPI processes for the output was bypassed by 
> raising the limit to 42.
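
A minimal sketch of the HDF5/XDMF output path in question, assuming deal.II
9.2, a DataOut object that has already been filled, and the file name from
the error message below; the helper name is made up:

#include <deal.II/base/data_out_base.h>
#include <deal.II/base/mpi.h>
#include <deal.II/numerics/data_out.h>

#include <vector>

// Sketch only: "data_out", "mpi_communicator", and "time" stand in for the
// corresponding objects of the adapted step-40.
template <int dim>
void write_hdf5_output(dealii::DataOut<dim> &data_out,
                       const MPI_Comm        mpi_communicator,
                       const double          time)
{
  using namespace dealii;

  // Filter duplicate vertices and request XDMF/HDF5-compatible data.
  DataOutBase::DataOutFilterFlags flags(/*filter_duplicate_vertices=*/true,
                                        /*xdmf_hdf5_output=*/true);
  DataOutBase::DataOutFilter      data_filter(flags);
  data_out.write_filtered_data(data_filter);

  // All ranks write into a single .h5 file through MPI I/O; this is the
  // call that ends in the failing MPI_File_open reported below.
  data_out.write_hdf5_parallel(data_filter, "Solution_0.h5", mpi_communicator);

  // A small XDMF descriptor so ParaView/VisIt can read the .h5 data.
  std::vector<XDMFEntry> entries;
  entries.push_back(data_out.create_xdmf_entry(
    data_filter, "Solution_0.h5", time, mpi_communicator));
  data_out.write_xdmf_file(entries, "Solution_0.xdmf", mpi_communicator);
}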
> >> > The code can overwrite existing files (created during a previous run 
> with 40 or fewer processes on a single node), but it crashes when new files 
> have to be created, with the following error message related to the HDF5 
> output:
> >> >
> >> > ...
> >> > HDF5-DIAG: Error detected in HDF5 (1.8.21) MPI-process 41:
> >> > #000: H5F.c line 520 in H5Fcreate(): unable to create file
> >> > major: File accessibilty
> >> > minor: Unable to open file
> >> > #001: H5Fint.c line 990 in H5F_open(): unable to open file: time = 
> Mon Apr 12 11:25:30 2021
> >> > , name = 'Solution_0.h5', tent_flags = 13
> >> > major: File accessibilty
> >> > minor: Unable to open file
> >> > #002: H5FD.c line 991 in H5FD_open(): open failed
> >> > major: Virtual File Layer
> >> > minor: Unable to initialize object
> >> > #003: H5FDmpio.c line 1057 in H5FD_mpio_open(): MPI_File_open failed
> >> > major: Internal error (too specific to document in detail)
> >> > minor: Some MPI function failed
> >> > #004: H5FDmpio.c line 1057 in H5FD_mpio_open(): File does not exist, 
> error stack:
> >> > ADIOI_UFS_OPEN(39): File Solution_0.h5 does not exist
> >> > major: Internal error (too specific to document in detail)
> >> > minor: MPI Error String
> >> > ...
> >> >
> >> > The file mentioned in the error message is still created, but remains 
> empty (file size 0); see testRunMPI.e1448739 for the full error message.
> >> > We tried different HDF5 versions (1.10.7 and 1.8.21).
> >> > We also looked into https://github.com/choderalab/yank/issues/1165 
> and tried setting the environment variable "HDF5_USE_FILE_LOCKING=FALSE", 
> which did not change the outcome.
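
If it helps, the same setting can also be applied from inside the program
before the first HDF5 call rather than in the batch script; a sketch (note
that, as far as I know, only the HDF5 1.10 series honors this variable, not
1.8.x):

#include <cstdlib>

// Sketch: POSIX setenv(), equivalent to `export HDF5_USE_FILE_LOCKING=FALSE`
// in the job script; must run before any HDF5 file is opened.
void disable_hdf5_file_locking()
{
  setenv("HDF5_USE_FILE_LOCKING", "FALSE", /*overwrite=*/1);
}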
> >> > As our configuration includes MPI ("WITH_MPI=ON"), the issue 
> https://github.com/dealii/dealii/issues/605 is about something different, 
> right?
> >> >
> >> > The tutorial description mentions that a limit of 16 processors was 
> chosen because such large examples are difficult to visualise.
> >> > Is there a general rule of thumb that says graphical output becomes 
> unfeasible above a certain number of DoFs?
> >> >
> >> > Any help is much appreciated.
> >> >
> >> > Christian
> >> >
> >>
> >>
> >>
> >> --
> >> Timo Heister
> >> http://www.math.clemson.edu/~heister/
> >
>
>
>
> -- 
> Timo Heister
> http://www.math.clemson.edu/~heister/
>

