Hi, I have some problems with PETSc and HDF5 VecLoad/VecView. The VecLoad problems can wait for now, but the VecView problems are more serious.
In short: I have a 3D DMDA and some vectors that I want to save to an HDF5 file. This works perfectly on my workstation, but not on the compute cluster I have access to. I have attached a typical error message.
I have also attached a piece of code that triggers the error. The code is merely a 2D->3D rewrite of DMDA ex10 (http://www.mcs.anl.gov/petsc/petsc-current/src/dm/examples/tutorials/ex10.c.html); nothing else is changed.
The program typically works on a small number of processes. I have successfully executed the attached program on up to 32 processes, and that always works. I have never had a single successful run on 64 processes; the error is always the same.
The computer I am struggling with is an SGI machine with SLES 11 SP1 and Intel CPUs, hence I have used Intel's compilers. I have tried the 2013, 2014 and 2015 versions of the compilers, so the compiler version is probably not the cause. I have also tried GCC 4.9.1, just to be safe, with the same error. The same compiler is used for both HDF5 and PETSc. The same error message occurs for both debug and release builds. I have tried HDF5 versions 1.8.11 and 1.8.13. I have tried PETSc version 3.4.1 and the latest from Git. The MPI implementation on the machine is SGI's MPT, and I have tried both 2.06 and 2.10. Always the same error. Other MPI implementations are unfortunately not available.
What really drives me mad is that this works like a charm on my workstation with Linux Mint... I have successfully executed the attached example on 254 processes (my machine breaks down if I try anything more than that).
Do any of you have any tips on how to attack this problem and find out what's wrong?
Regards, Håkon Strandenes
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 42:
#000: H5Dio.c line 225 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: H5Dio.c line 347 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: H5Dio.c line 783 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
major: Dataspace
minor: Write failed
#004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
major: Low-level I/O
minor: Can't get value
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): MPI_Type_struct failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid datatype argument
major: Internal error (too specific to document in detail)
minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 46:
#000: H5Dio.c line 225 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: H5Dio.c line 347 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: H5Dio.c line 783 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
major: Dataspace
minor: Write failed
#004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
major: Low-level I/O
minor: Can't get value
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): MPI_Type_struct failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid datatype argument
major: Internal error (too specific to document in detail)
minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 58:
#000: H5Dio.c line 225 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: H5Dio.c line 347 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: H5Dio.c line 783 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
major: Dataspace
minor: Write failed
#004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
major: Low-level I/O
minor: Can't get value
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): MPI_Type_struct failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid datatype argument
major: Internal error (too specific to document in detail)
minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 62:
#000: H5Dio.c line 225 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: H5Dio.c line 347 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: H5Dio.c line 783 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
major: Dataspace
minor: Write failed
#004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
major: Low-level I/O
minor: Can't get value
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): MPI_Type_struct failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid datatype argument
major: Internal error (too specific to document in detail)
minor: MPI Error String
[58]PETSC ERROR: #1 VecView_MPI_HDF5_DA() line 584 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 47:
#000: H5Dio.c line 225 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: H5Dio.c line 347 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: H5Dio.c line 783 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
major: Dataspace
minor: Write failed
#004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
major: Low-level I/O
minor: Can't get value
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): MPI_Type_struct failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid datatype argument
major: Internal error (too specific to document in detail)
minor: MPI Error String
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 59:
#000: H5Dio.c line 225 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: H5Dio.c line 347 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: H5Dio.c line 783 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
major: Dataspace
minor: Write failed
#004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
major: Low-level I/O
minor: Can't get value
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): MPI_Type_struct failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid datatype argument
major: Internal error (too specific to document in detail)
minor: MPI Error String
[59]PETSC ERROR: #1 VecView_MPI_HDF5_DA() line 584 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[59]PETSC ERROR: #2 VecView_MPI_DA() line 709 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[42]PETSC ERROR: #1 VecView_MPI_HDF5_DA() line 584 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[62]PETSC ERROR: #1 VecView_MPI_HDF5_DA() line 584 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[62]PETSC ERROR: #2 VecView_MPI_DA() line 709 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[42]PETSC ERROR: #2 VecView_MPI_DA() line 709 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[42]PETSC ERROR: #3 VecView() line 604 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/vec/vec/interface/vector.c
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 63:
#000: H5Dio.c line 225 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: H5Dio.c line 347 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: H5Dio.c line 783 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
major: Dataspace
minor: Write failed
#004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
major: Low-level I/O
minor: Can't get value
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): MPI_Type_struct failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid datatype argument
major: Internal error (too specific to document in detail)
minor: MPI Error String
[63]PETSC ERROR: #1 VecView_MPI_HDF5_DA() line 584 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[63]PETSC ERROR: #2 VecView_MPI_DA() line 709 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[46]PETSC ERROR: #1 VecView_MPI_HDF5_DA() line 584 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[46]PETSC ERROR: #2 VecView_MPI_DA() line 709 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[46]PETSC ERROR: #3 VecView() line 604 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/vec/vec/interface/vector.c
[58]PETSC ERROR: #2 VecView_MPI_DA() line 709 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[58]PETSC ERROR: #3 VecView() line 604 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/vec/vec/interface/vector.c
[58]PETSC ERROR: #4 main() line 75 in /work/hakostra/PETSc-DM-ex10/ex10.cpp
[47]PETSC ERROR: #1 VecView_MPI_HDF5_DA() line 584 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[47]PETSC ERROR: #2 VecView_MPI_DA() line 709 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[47]PETSC ERROR: #3 VecView() line 604 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/vec/vec/interface/vector.c
[59]PETSC ERROR: #3 VecView() line 604 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/vec/vec/interface/vector.c
[59]PETSC ERROR: #4 main() line 75 in /work/hakostra/PETSc-DM-ex10/ex10.cpp
[42]PETSC ERROR: #4 main() line 75 in /work/hakostra/PETSc-DM-ex10/ex10.cpp
[62]PETSC ERROR: #3 VecView() line 604 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/vec/vec/interface/vector.c
[62]PETSC ERROR: #4 main() line 75 in /work/hakostra/PETSc-DM-ex10/ex10.cpp
[46]PETSC ERROR: #4 main() line 75 in /work/hakostra/PETSc-DM-ex10/ex10.cpp
[63]PETSC ERROR: #3 VecView() line 604 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/vec/vec/interface/vector.c
[63]PETSC ERROR: #4 main() line 75 in /work/hakostra/PETSc-DM-ex10/ex10.cpp
[47]PETSC ERROR: #4 main() line 75 in /work/hakostra/PETSc-DM-ex10/ex10.cpp
[58]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
[42]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
[59]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
[46]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
[62]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
[47]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
[63]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 43:
#000: H5Dio.c line 225 in H5Dwrite(): can't prepare for writing data
major: Dataset
minor: Write failed
#001: H5Dio.c line 347 in H5D__pre_write(): can't write data
major: Dataset
minor: Write failed
#002: H5Dio.c line 783 in H5D__write(): can't write data
major: Dataset
minor: Write failed
#003: H5Dmpio.c line 757 in H5D__chunk_collective_write(): write error
major: Dataspace
minor: Write failed
#004: H5Dmpio.c line 685 in H5D__chunk_collective_io(): couldn't finish linked chunk MPI-IO
major: Low-level I/O
minor: Can't get value
#005: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): MPI_Type_struct failed
major: Internal error (too specific to document in detail)
minor: Some MPI function failed
#006: H5Dmpio.c line 998 in H5D__link_chunk_collective_io(): Invalid datatype argument
major: Internal error (too specific to document in detail)
minor: MPI Error String
[43]PETSC ERROR: #1 VecView_MPI_HDF5_DA() line 584 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[43]PETSC ERROR: #2 VecView_MPI_DA() line 709 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/dm/impls/da/gr2.c
[43]PETSC ERROR: #3 VecView() line 604 in /home/ntnu/hakostra/sw/petsc/dbg/source/src/vec/vec/interface/vector.c
[43]PETSC ERROR: #4 main() line 75 in /work/hakostra/PETSc-DM-ex10/ex10.cpp
[43]PETSC ERROR: ----------------End of Error Message -------send entire error message to [email protected]
MPT: MPI_COMM_WORLD rank 62 has terminated without calling MPI_Finalize()
aborting job
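As a rough aid for reading the log above, the sketch below collects the ranks named in the HDF5-DIAG header lines and maps each one to process-grid coordinates. Everything in it is illustrative and not part of the attached example: the helper names `failing_ranks` and `rank_to_coords` are made up for this sketch, the 4x4x4 process grid is purely an assumption (the actual decomposition is not stated anywhere in this message), and so is the ordering in which x varies fastest, then y, then z. The `sample` string holds only the eight header lines copied from the log.

```python
import re

# Header line that HDF5 prints once per failing rank.
HDF5_DIAG = re.compile(
    r"HDF5-DIAG: Error detected in HDF5 \(([\d.]+)\) MPI-process (\d+):"
)


def failing_ranks(log_text):
    """Return the sorted, de-duplicated ranks that printed an HDF5-DIAG header."""
    return sorted({int(m.group(2)) for m in HDF5_DIAG.finditer(log_text)})


def rank_to_coords(rank, m, n):
    """Map a rank to (i, j, k) on an m x n x p grid.

    ASSUMPTION: x varies fastest, then y, then z. The real process
    layout on the cluster is not given in this message.
    """
    return rank % m, (rank // m) % n, rank // (m * n)


# The eight header lines from the log, in the order they appear.
sample = """\
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 42:
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 46:
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 58:
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 62:
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 47:
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 59:
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 63:
HDF5-DIAG: Error detected in HDF5 (1.8.13) MPI-process 43:
"""

ranks = failing_ranks(sample)
print("ranks reporting HDF5 errors:", ranks)

# HYPOTHETICAL 4 x 4 x 4 grid for a 64-process run.
for r in ranks:
    print(r, rank_to_coords(r, 4, 4))
```

Under that assumed 4x4x4 layout, all eight ranks map to coordinates whose components are 2 or 3. Whether that means anything depends on the real decomposition, and the log shown here is only a typical excerpt, so other ranks may have failed as well.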
Attachment: PETSc-DM-ex10.tar.gz (application/gzip)
