I am unable to write a 2GB dataset from a single task. I have the same problem with HDF5 1.8.8, 1.8.9, and 1.8.10. I have attached a Fortran 90 program that shows the problem. I also have a C program that shows the same problem, so I do not think this is a problem with the Fortran-to-C conversion. The program is a parallel F90 program run as a single task.
Here is the error report:
$ ./example
HDF5-DIAG: Error detected in HDF5 (1.8.10) MPI-process 0:
  #000: H5Dio.c line 266 in H5Dwrite(): can't write data
    major: Dataset
    minor: Write failed
  #001: H5Dio.c line 673 in H5D__write(): can't write data
    major: Dataset
    minor: Write failed
  #002: H5Dmpio.c line 544 in H5D__contig_collective_write(): couldn't finish shared collective MPI-IO
    major: Low-level I/O
    minor: Write failed
  #003: H5Dmpio.c line 1523 in H5D__inter_collective_io(): couldn't finish collective MPI-IO
    major: Low-level I/O
    minor: Can't get value
  #004: H5Dmpio.c line 1567 in H5D__final_collective_io(): optimized write failed
    major: Dataset
    minor: Write failed
  #005: H5Dmpio.c line 312 in H5D__mpio_select_write(): can't finish collective parallel write
    major: Low-level I/O
    minor: Write failed
  #006: H5Fio.c line 158 in H5F_block_write(): write through metadata accumulator failed
    major: Low-level I/O
    minor: Write failed
  #007: H5Faccum.c line 816 in H5F_accum_write(): file write failed
    major: Low-level I/O
    minor: Write failed
  #008: H5FDint.c line 185 in H5FD_write(): driver write request failed
    major: Virtual File Layer
    minor: Write failed
  #009: H5FDmpio.c line 1842 in H5FD_mpio_write(): MPI_File_write_at_all failed
    major: Internal error (too specific to document in detail)
    minor: Some MPI function failed
  #010: H5FDmpio.c line 1842 in H5FD_mpio_write(): Invalid argument, error stack:
MPI_FILE_WRITE_AT_ALL(84): Invalid count argument
    major: Internal error (too specific to document in detail)
    minor: MPI Error String
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image            PC                Routine  Line     Source
libpthread.so.0  00000031C4D0C4F0  Unknown  Unknown  Unknown
libc.so.6        00000031C3E721E3  Unknown  Unknown  Unknown
example          000000000071C646  Unknown  Unknown  Unknown
libmpich.so.3    00002B0682CBF48C  Unknown  Unknown  Unknown
libmpich.so.3    00002B0682E6AE91  Unknown  Unknown  Unknown
libmpich.so.3    00002B0682E6BDB2  Unknown  Unknown  Unknown
example          00000000004AE9EC  Unknown  Unknown  Unknown
example          00000000004A9014  Unknown  Unknown  Unknown
example          0000000000497040  Unknown  Unknown  Unknown
example          00000000004990A2  Unknown  Unknown  Unknown
example          0000000000693991  Unknown  Unknown  Unknown
example          00000000004597A9  Unknown  Unknown  Unknown
example          000000000045C144  Unknown  Unknown  Unknown
example          000000000043B6B4  Unknown  Unknown  Unknown
The configuration data is attached, along with a program that reproduces the error report. I compiled this Fortran 90 program with:
h5pfc -g -O0 -o example example.f90
Internal to the program is a variable "LocalSz", which is set to 646. 8*(646^3) = 2,156,689,088, which is bigger than 2*1024^3 = 2,147,483,648. The program works if LocalSz is 645, where 8*(645^3) = 2,146,689,000.
Thanks for looking at this.
--
Robert McLay, Ph.D.
TACC
Manager, HPC Software Tools
(512) 232-8104
SUMMARY OF THE HDF5 CONFIGURATION
=================================
General Information:
-------------------
HDF5 Version: 1.8.10
Configured on: Mon Dec 10 16:48:09 CST 2012
Configured by: mclay@login1
Configure mode: production
Host system: x86_64-unknown-linux-gnu
Uname information: Linux login1 2.6.18-238.19.1.el5.TACC #2 SMP Fri Aug 19 15:40:23 CDT 2011 x86_64 x86_64 x86_64 GNU/Linux
Byte sex: little-endian
Libraries:
Installation point: /work/00515/mclay/apps/intel-12_1/mvapich2-1_8/phdf5/1.8.10
Compiling Options:
------------------
Compilation Mode: production
C Compiler: /opt/apps/intel12/mvapich2/1.8/bin/mpicc ( Intel(R) C Intel(R) 64 Compiler Version 12.1 Build 20111011)
CFLAGS: -g -O3 -fPIC -fno-omit-frame-pointer
H5_CFLAGS: -std=c99 -O3
AM_CFLAGS:
CPPFLAGS:
H5_CPPFLAGS: -D_POSIX_C_SOURCE=199506L -DNDEBUG -UH5_DEBUG_API
AM_CPPFLAGS: -I/work/00515/mclay/apps/intel-12_1/mvapich2-1_8/phdf5/1.8.10/include -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_BSD_SOURCE
Shared C Library: yes
Static C Library: yes
Statically Linked Executables: no
LDFLAGS: -L/work/00515/mclay/apps/intel-12_1/mvapich2-1_8/phdf5/1.8.10/lib -lsz -lz
H5_LDFLAGS:
AM_LDFLAGS: -L/work/00515/mclay/apps/intel-12_1/mvapich2-1_8/phdf5/1.8.10/lib
Extra libraries: -lsz -lz -lm -Wl,-rpath,/work/00515/mclay/apps/intel-12_1/mvapich2-1_8/phdf5/1.8.10/lib
Archiver: ar
Ranlib: ranlib
Debugged Packages:
API Tracing: no
Languages:
----------
Fortran: yes
Fortran Compiler: /opt/apps/intel12/mvapich2/1.8/bin/mpif90
Fortran 2003 Compiler: no
Fortran Flags: -O3 -fPIC
H5 Fortran Flags: -O3
AM Fortran Flags:
Shared Fortran Library: yes
Static Fortran Library: yes
C++: no
Features:
---------
Parallel HDF5: yes
High Level library: yes
Threadsafety: no
Default API Mapping: v18
With Deprecated Public Symbols: yes
I/O filters (external): deflate(zlib),szip(encoder)
I/O filters (internal): shuffle,fletcher32,nbit,scaleoffset
MPE:
Direct VFD: no
dmalloc: no
Clear file buffers before write: yes
Using memory checker: no
Function Stack Tracing: no
GPFS: no
Strict File Format Checks: no
Optimization Instrumentation: no
Large File Support (LFS): yes
example.f90
Description: Binary data