I know there is/was a bug that affected OS X when trying to write a dataset 
larger than 2^31 bytes. It is solved in the current development branch. I am 
not sure whether this affects Linux x64 as well, but your error report sure 
looks like the stack traces I was getting.

Try checking out the development code from:

http://svn.hdfgroup.uiuc.edu/hdf5/branches/hdf5_1_8_10

and see if the problem still persists.
___________________________________________________________
Mike Jackson                    Principal Software Engineer
BlueQuartz Software                            Dayton, Ohio
[email protected]              www.bluequartz.net

On Jan 24, 2013, at 4:01 PM, Robert McLay wrote:

> I am unable to write a 2GB dataset from a single task.  I have the same 
> problem with 1.8.8, 1.8.9, and 1.8.10.   I have attached a FORTRAN 90 program 
> that shows the problem.  I also have a C program that shows the same problem 
> so I do not think this is a problem of FORTRAN to C conversion.   The program 
> is a parallel f90 program run as a single task.
> 
> Here is the error report:
> 
> $  ./example
> HDF5-DIAG: Error detected in HDF5 (1.8.10) MPI-process 0:
>   #000: H5Dio.c line 266 in H5Dwrite(): can't write data
>     major: Dataset
>     minor: Write failed
>   #001: H5Dio.c line 673 in H5D__write(): can't write data
>     major: Dataset
>     minor: Write failed
>   #002: H5Dmpio.c line 544 in H5D__contig_collective_write(): couldn't finish shared collective MPI-IO
>     major: Low-level I/O
>     minor: Write failed
>   #003: H5Dmpio.c line 1523 in H5D__inter_collective_io(): couldn't finish collective MPI-IO
>     major: Low-level I/O
>     minor: Can't get value
>   #004: H5Dmpio.c line 1567 in H5D__final_collective_io(): optimized write failed
>     major: Dataset
>     minor: Write failed
>   #005: H5Dmpio.c line 312 in H5D__mpio_select_write(): can't finish collective parallel write
>     major: Low-level I/O
>     minor: Write failed
>   #006: H5Fio.c line 158 in H5F_block_write(): write through metadata accumulator failed
>     major: Low-level I/O
>     minor: Write failed
>   #007: H5Faccum.c line 816 in H5F_accum_write(): file write failed
>     major: Low-level I/O
>     minor: Write failed
>   #008: H5FDint.c line 185 in H5FD_write(): driver write request failed
>     major: Virtual File Layer
>     minor: Write failed
>   #009: H5FDmpio.c line 1842 in H5FD_mpio_write(): MPI_File_write_at_all failed
>     major: Internal error (too specific to document in detail)
>     minor: Some MPI function failed
>   #010: H5FDmpio.c line 1842 in H5FD_mpio_write(): Invalid argument, error stack:
> MPI_FILE_WRITE_AT_ALL(84): Invalid count argument
>     major: Internal error (too specific to document in detail)
>     minor: MPI Error String
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine            Line        Source
> libpthread.so.0    00000031C4D0C4F0  Unknown               Unknown  Unknown
> libc.so.6          00000031C3E721E3  Unknown               Unknown  Unknown
> example            000000000071C646  Unknown               Unknown  Unknown
> libmpich.so.3      00002B0682CBF48C  Unknown               Unknown  Unknown
> libmpich.so.3      00002B0682E6AE91  Unknown               Unknown  Unknown
> libmpich.so.3      00002B0682E6BDB2  Unknown               Unknown  Unknown
> example            00000000004AE9EC  Unknown               Unknown  Unknown
> example            00000000004A9014  Unknown               Unknown  Unknown
> example            0000000000497040  Unknown               Unknown  Unknown
> example            00000000004990A2  Unknown               Unknown  Unknown
> example            0000000000693991  Unknown               Unknown  Unknown
> example            00000000004597A9  Unknown               Unknown  Unknown
> example            000000000045C144  Unknown               Unknown  Unknown
> example            000000000043B6B4  Unknown               Unknown  Unknown
> 
> Attached is the configuration data
> 
> Attached is a program which produces the error report.  I compiled this 
> Fortran 90 program with:
> 
> h5pfc -g -O0 -o example example.f90
> 
> Internal to the program is a variable "LocalSz" which is 646.  8*(646^3) is 
> bigger than 2*1024^3.   The program works if LocalSz is 645.
> 
> Thanks for looking at this.
> 
> 
> 
> -- 
> Robert McLay, Ph.D.
> TACC
> Manager, HPC Software Tools
> (512) 232-8104
> 
> <h5.config.txt><example.f90>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

