Hi Stefan,
On Mar 29, 2010, at 8:53 AM, Frijters, S.C.J. wrote:
> Hi Quincey,
>
> I'm not sure I understand what I'm doing wrong. Focussing on just one of the
> cases:
>
> - I have a 3D array of real*8, which is equivalent to a C double, right?
> - According to the manual, this should correspond to H5T_NATIVE_DOUBLE
> - I then use that datatype in:
>
> CALL h5dcreate_f(file_id, dsetname, H5T_NATIVE_DOUBLE , filespace, dset_id,
> err, plist_id)
>
> and
>
> CALL h5dwrite_f(dset_id, H5T_NATIVE_DOUBLE, scalar, dims, err, file_space_id
> = filespace, mem_space_id = memspace, xfer_prp = plist_id)
>
> What am I missing?
Ah, sorry, from your comments below, I thought your memory datatype was
H5T_NATIVE_FLOAT and your file datatype was H5T_NATIVE_DOUBLE. Everything
above looks OK to me.
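
For reference, the whole pattern might look roughly like this (just a sketch:
the identifiers follow your snippets, the dataset creation property list is
left out, and xfer_plist stands for a transfer property list set up to request
collective I/O):

  INTEGER(HID_T)   :: file_id, dset_id, filespace, memspace, xfer_plist
  INTEGER(HSIZE_T) :: dims(3)
  REAL(KIND=8), DIMENSION(:,:,:), ALLOCATABLE :: scalar
  CHARACTER(LEN=*), PARAMETER :: dsetname = "scalar"
  INTEGER :: err

  ! Create the dataset in the file with the same datatype as the data in
  ! memory, so no conversion is needed and collective I/O can stay collective.
  CALL h5dcreate_f(file_id, dsetname, H5T_NATIVE_DOUBLE, filespace, dset_id, err)

  ! The memory datatype in the write matches the file datatype above.
  CALL h5dwrite_f(dset_id, H5T_NATIVE_DOUBLE, scalar, dims, err, &
                  file_space_id = filespace, mem_space_id = memspace, &
                  xfer_prp = xfer_plist)
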
> Also, a quick related performance question - when using a hyperslab to make a
> data selection, is there a performance difference between
>
> count = (/1, 1, 1/)
> stride = (/1, 1, 1/)
> block = (/nx, ny, nz/)
> CALL h5sselect_hyperslab_f (filespace, H5S_SELECT_SET_F, offset, count, err,
> stride, block)
>
> and
>
> count = (/nx, ny, nz/)
> CALL h5sselect_hyperslab_f (filespace, H5S_SELECT_SET_F, offset, count, err)
>
> I found the manual to be rather unclear on that point.
No, they are equivalent: with count = (/1, 1, 1/) and block = (/nx, ny, nz/) you
select one block of nx*ny*nz elements, while with count = (/nx, ny, nz/) the
block defaults to a single element, so you select nx*ny*nz one-element blocks
covering exactly the same points.
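
A quick way to convince yourself in code is to make both selections on the same
dataspace and compare the number of selected elements, e.g. (a sketch, with nx,
ny, nz and offset as in your snippets):

  INTEGER(HSIZE_T)  :: offset(3), count(3), stride(3), block(3)
  INTEGER(HSSIZE_T) :: npoints_a, npoints_b
  INTEGER(HID_T)    :: filespace
  INTEGER           :: err

  count  = (/ 1, 1, 1 /)
  stride = (/ 1, 1, 1 /)
  block  = (/ nx, ny, nz /)
  CALL h5sselect_hyperslab_f(filespace, H5S_SELECT_SET_F, offset, count, err, &
                             stride, block)
  CALL h5sget_select_npoints_f(filespace, npoints_a, err)

  count = (/ nx, ny, nz /)   ! stride and block default to one element
  CALL h5sselect_hyperslab_f(filespace, H5S_SELECT_SET_F, offset, count, err)
  CALL h5sget_select_npoints_f(filespace, npoints_b, err)

  ! npoints_a and npoints_b are both nx*ny*nz; the selected elements coincide.
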
Quincey
> Kind regards,
>
> Stefan Frijters
> ________________________________________
> From: [email protected] [[email protected]] On
> Behalf Of Quincey Koziol [[email protected]]
> Sent: 26 March 2010 19:00
> To: HDF Users Discussion List
> Subject: Re: [Hdf-forum] HDF5 causes Fatal error in MPI_Gather
>
> Hi Stefan,
>
> On Mar 26, 2010, at 9:25 AM, Frijters, S.C.J. wrote:
>
>> Hi Quincey,
>>
>> I'm using H5T_NATIVE_DOUBLE to write an array of real*8 and H5T_NATIVE_REAL
>> for real*4. Is that okay?
>
> That will work, but it will cause the I/O to be independent, rather
> than collective (if you request collective). Try writing to the file in the
> same datatype you use for your memory datatype and see if the performance is
> better.
>
> Quincey
>
>
>> Kind regards,
>>
>> Stefan Frijters
>> ________________________________________
>> From: [email protected] [[email protected]] On
>> Behalf Of Quincey Koziol [[email protected]]
>> Sent: 25 March 2010 22:40
>> To: HDF Users Discussion List
>> Subject: Re: [Hdf-forum] HDF5 causes Fatal error in MPI_Gather
>>
>> Hi Stefan,
>>
>> On Mar 25, 2010, at 5:08 AM, Frijters, S.C.J. wrote:
>>
>>> Hi Quincey,
>>>
>>> I managed to increase the chunk size - I had overlooked the fact that my
>>> blocks of data weren't cubes in my test case. However, it seems that
>>> performance can suffer a lot for certain chunk sizes (in my test case):
>>>
>>> The size of my entire data array is 40 x 40 x 160. My MPI cartesian grid is
>>> 4 x 4 x 1, so every core has a 10 x 10 x 160 subset. Originally I had the
>>> chunk size set to 10 x 10 x 160 as well (which explains why I couldn't
>>> double the 3rd component), and writes take less than a second. However, if
>>> I set the chunk size to 20 x 20 x 160, it's really slow (7 seconds), while
>>> 40 x 40 x 160 once again takes less than a second. I'd been reading up on
>>> the whole chunking thing before, but I think I'm still ignorant of some of
>>> the subtleties. Am I violating some rule here that makes HDF5 fall back to
>>> independent IO?
>>
>> Hmm, are your datatypes the same in memory and the file? If they
>> aren't, HDF5 will break collective I/O down into independent I/O.
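>>
>> The collective request itself is typically made on the transfer property
>> list, roughly like this (a sketch, with plist_id and err as in the earlier
>> snippets):
>>
>>   CALL h5pcreate_f(H5P_DATASET_XFER_F, plist_id, err)
>>   CALL h5pset_dxpl_mpio_f(plist_id, H5FD_MPIO_COLLECTIVE_F, err)
>>
>> Even with this set, a datatype conversion between memory and file makes HDF5
>> fall back to independent I/O for that transfer.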
>>
>>> Is there some rule of thumb, or set of guidelines to get good performance
>>> out of it? I read your "Parallel HDF5 Hints" document and some others, but
>>> it hasn't helped me enough, apparently :-D. The time spent on IO in my
>>> application is getting to be somewhat of a hot item.
>>
>> That would be where I'd point you.
>>
>> Quincey
>>
>>> Thanks again for the continued support,
>>>
>>> Stefan Frijters
>>> ________________________________________
>>> From: [email protected] [[email protected]] On
>>> Behalf Of Quincey Koziol [[email protected]]
>>> Sent: 24 March 2010 17:12
>>> To: HDF Users Discussion List
>>> Subject: Re: [Hdf-forum] HDF5 causes Fatal error in MPI_Gather
>>>
>>> Hi Stefan,
>>>
>>> On Mar 24, 2010, at 11:06 AM, Frijters, S.C.J. wrote:
>>>
>>>> Hi Quincey,
>>>>
>>>> I can double one dimension of my chunk size (at the cost of really slow
>>>> IO), but if I double them all I get errors like these:
>>>>
>>>> HDF5-DIAG: Error detected in HDF5 (1.8.4) MPI-process 4:
>>>> #000: H5D.c line 171 in H5Dcreate2(): unable to create dataset
>>>> major: Dataset
>>>> minor: Unable to initialize object
>>>> #001: H5Dint.c line 428 in H5D_create_named(): unable to create and link
>>>> to dataset
>>>> major: Dataset
>>>> minor: Unable to initialize object
>>>> #002: H5L.c line 1639 in H5L_link_object(): unable to create new link to
>>>> object
>>>> major: Links
>>>> minor: Unable to initialize object
>>>> #003: H5L.c line 1862 in H5L_create_real(): can't insert link
>>>> major: Symbol table
>>>> minor: Unable to insert object
>>>> #004: H5Gtraverse.c line 877 in H5G_traverse(): internal path traversal
>>>> failed
>>>> major: Symbol table
>>>> minor: Object not found
>>>> #005: H5Gtraverse.c line 703 in H5G_traverse_real(): traversal operator
>>>> failed
>>>> major: Symbol table
>>>> minor: Callback failed
>>>> #006: H5L.c line 1685 in H5L_link_cb(): unable to create object
>>>> major: Object header
>>>> minor: Unable to initialize object
>>>> #007: H5O.c line 2677 in H5O_obj_create(): unable to open object
>>>> major: Object header
>>>> minor: Can't open object
>>>> #008: H5Doh.c line 296 in H5O_dset_create(): unable to create dataset
>>>> major: Dataset
>>>> minor: Unable to initialize object
>>>> #009: H5Dint.c line 1030 in H5D_create(): unable to construct layout
>>>> information
>>>> major: Dataset
>>>> minor: Unable to initialize object
>>>> #010: H5Dchunk.c line 420 in H5D_chunk_construct(): chunk size must be <=
>>>> maximum dimension size for fixed-sized dimensions
>>>> major: Dataset
>>>> minor: Unable to initialize object
>>>>
>>>> I am currently doing test runs on 16 cores on my local machine, because
>>>> the large machine I run jobs on is unavailable at the moment and its
>>>> queueing system is rather unsuited to quick test runs. So maybe this is an
>>>> artefact of running on such a small number of cores? Although I *think* I
>>>> tried this before and got the same type of error on several thousand cores
>>>> as well.
>>>
>>> You seem to have increased the chunk dimension to be larger than the
>>> dataset dimension. What is the chunk size and dataspace size you are using?
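>>>
>>> If the aim is just to double the chunks without tripping that check, one
>>> option is to clamp each chunk dimension to the dataset dimension. A rough
>>> sketch, assuming chunk_dims and dset_dims hold the current chunk and dataset
>>> sizes and plist_id is the dataset creation property list:
>>>
>>>   INTEGER(HSIZE_T) :: chunk_dims(3), dset_dims(3)
>>>
>>>   chunk_dims = MIN(2_HSIZE_T * chunk_dims, dset_dims)
>>>   CALL h5pset_chunk_f(plist_id, 3, chunk_dims, err)
>>>
>>> For fixed-size dataset dimensions the chunk dimensions may not exceed the
>>> dataset dimensions, which is exactly what the error stack above reports.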
>>>
>>> Quincey
>>>
>>>>
>>>> Kind regards,
>>>>
>>>> Stefan Frijters
>>>>
>>>> ________________________________________
>>>> From: [email protected] [[email protected]] On
>>>> Behalf Of Quincey Koziol [[email protected]]
>>>> Sent: 24 March 2010 16:28
>>>> To: HDF Users Discussion List
>>>> Subject: Re: [Hdf-forum] HDF5 causes Fatal error in MPI_Gather
>>>>
>>>> Hi Stefan,
>>>>
>>>> On Mar 24, 2010, at 10:10 AM, Stefan Frijters wrote:
>>>>
>>>>> Hi Quincey,
>>>>>
>>>>> Thanks for the quick response. Currently, each core handles its datasets
>>>>> with a chunk size equal to the size of the local data (the dims parameter
>>>>> in h5pset_chunk_f is equal to the dims parameter in h5dwrite_f), because
>>>>> the local arrays are not that large anyway (on the order of 20x20x20
>>>>> reals). So, if I understand things correctly, I'm already using the
>>>>> maximum chunk size.
>>>>
>>>> No, you don't have to make them the same size, since the collective
>>>> I/O should stitch them back together anyway. Try doubling the dimensions
>>>> on your chunks.
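>>>>
>>>> For reference, the chunked-dataset setup being discussed looks roughly like
>>>> this (a sketch: crea_plist is a name used here for the dataset creation
>>>> property list, and chunk_dims would be the per-process block or some
>>>> multiple of it, clamped to the dataset dimensions):
>>>>
>>>>   INTEGER(HID_T)   :: crea_plist
>>>>   INTEGER(HSIZE_T) :: chunk_dims(3)
>>>>
>>>>   CALL h5pcreate_f(H5P_DATASET_CREATE_F, crea_plist, err)
>>>>   CALL h5pset_chunk_f(crea_plist, 3, chunk_dims, err)
>>>>   CALL h5dcreate_f(file_id, dsetname, H5T_NATIVE_DOUBLE, filespace, &
>>>>                    dset_id, err, crea_plist)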
>>>>
>>>>> Do you have any idea why it doesn't crash the first time I try to do it,
>>>>> though? It's a different array, but of the same size and datatype as the
>>>>> second. As far as I can see, I'm at least closing all used handles at the
>>>>> end of my function.
>>>>
>>>> Hmm, I'm not certain...
>>>>
>>>> Quincey
>>>>
>>>>> Kind regards,
>>>>>
>>>>> Stefan Frijters
>>>>>
>>>>>> Hi Stefan,
>>>>>>
>>>>>> On Mar 24, 2010, at 3:11 AM, Stefan Frijters wrote:
>>>>>>
>>>>>>> Dear all,
>>>>>>>
>>>>>>> Recently, I've run into a problem with my parallel HDF5 writes. My
>>>>>>> program works fine on 8k cores, but when I run it on 16k cores it
>>>>>>> crashes while writing a data file through h5dwrite_f(...). All file
>>>>>>> writes go through a single function, so the same code path is used
>>>>>>> every time; yet, for a reason I don't understand, the first file is
>>>>>>> written without problems while the second one throws the following
>>>>>>> error message:
>>>>>>>
>>>>>>> Abort(1) on node 0 (rank 0 in comm 1140850688): Fatal error in
>>>>>>> MPI_Gather: Invalid buffer pointer, error stack:
>>>>>>> MPI_Gather(758): MPI_Gather(sbuf=0xa356f400, scount=16000, MPI_BYTE,
>>>>>>> rbuf=(nil), rcount=16000, MPI_BYTE, root=0, comm=0x84000003) failed
>>>>>>> MPI_Gather(675): Null buffer pointer
>>>>>>>
>>>>>>> I've been looking through the HDF5 source code, and it seems to call
>>>>>>> MPI_Gather in only one place, in the function H5D_obtain_mpio_mode. In
>>>>>>> that function, HDF5 tries to allocate a receive buffer using
>>>>>>>
>>>>>>> recv_io_mode_info = (uint8_t *)H5MM_malloc(total_chunks * mpi_size);
>>>>>>>
>>>>>>> which then returns the null pointer seen in rbuf=(nil) instead of a
>>>>>>> valid pointer. Thus, it seems to me that it's HDF5 causing the problem,
>>>>>>> not MPI.
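>>>>>>>
>>>>>>> For scale: scount=16000 in the error stack suggests total_chunks is
>>>>>>> 16000 here, so with roughly 16k ranks the receive buffer on rank 0 would
>>>>>>> be on the order of 16000 x 16384 bytes, i.e. about 250 MB, which is
>>>>>>> presumably the allocation that fails.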
>>>>>>>
>>>>>>> This problem occurs in both collective and independent IO mode.
>>>>>>>
>>>>>>> Do you have any idea what might be causing this problem, or how to
>>>>>>> resolve it? I'm not sure what kind of other information you might need,
>>>>>>> but I'll do my best to supply it, if you need any.
>>>>>>
>>>>>> This is a scalability problem we are aware of and are working to address,
>>>>>> but in the meantime, can you increase the size of the chunks for your
>>>>>> dataset(s)? (That will reduce the number of chunks and hence the size of
>>>>>> the buffer being allocated.)
>>>>>>
>>>>>> Quincey
>>>>>>
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org