Hi Martin,
        Thanks for finding this!  I've applied the patch to the trunk and will be 
migrating it back to the 1.8.9 release.

        Quincey

On Apr 20, 2012, at 3:10 PM, Martin Otte wrote:

> Hi,
> 
> I have had sporadic crashes with parallel HDF5, and when I checked my code 
> with valgrind it seems that the crash is due to a bug in H5Smpio.c. I am 
> using hdf5 version 1.8.8.
> 
> In routine H5S_obtain_datatype, starting near line 568 of H5Smpio.c, memory 
> is being realloced if larger buffers are necessary:
> 
> /* Check if we need to increase the size of the buffers */
> if(outercount >= alloc_count) {
>     MPI_Aint     *tmp_disp;         /* Temporary pointer to new displacement buffer */
>     int          *tmp_blocklen;     /* Temporary pointer to new block length buffer */
>     MPI_Datatype *tmp_inner_type;   /* Temporary pointer to inner MPI datatype buffer */
> 
>     /* Double the allocation count */
>     alloc_count *= 2;
> 
>     /* Re-allocate the buffers */
>     if(NULL == (tmp_disp = (MPI_Aint *)H5MM_realloc(disp, alloc_count * sizeof(MPI_Aint))))
>         HGOTO_ERROR(H5E_DATASPACE, H5E_CANTALLOC, FAIL, "can't allocate array of displacements")
>     disp = tmp_disp;
>     if(NULL == (tmp_blocklen = (int *)H5MM_realloc(blocklen, alloc_count * sizeof(int))))
>         HGOTO_ERROR(H5E_DATASPACE, H5E_CANTALLOC, FAIL, "can't allocate array of block lengths")
>     blocklen = tmp_blocklen;
>     if(NULL == (tmp_inner_type = (MPI_Datatype *)H5MM_realloc(inner_type, alloc_count * sizeof(MPI_Datatype))))
>         HGOTO_ERROR(H5E_DATASPACE, H5E_CANTALLOC, FAIL, "can't allocate array of inner MPI datatypes")
> } /* end if */
> 
> However, unlike the "disp" and "blocklen" buffers, inner_type is never 
> updated to point to the new tmp_inner_type buffer. If the realloc moves the 
> block, inner_type is left pointing at freed memory, and the newly allocated 
> buffer is leaked and never freed.
> 
> The fix is to just add a line:
> 
> inner_type = tmp_inner_type;
> 
> after the call to H5MM_realloc, as is done for the "disp" and "blocklen" 
> buffers. I have attached a patch for this. With this fix, parallel hdf5 works 
> very well for me, but without the fix I get many crashes. I hope this can be 
> fixed for the 1.8.9 release.
> 
> Martin J. Otte
> Atmospheric Modeling and Analysis Division
> U.S. Environmental Protection Agency
> 109 T.W. Alexander Drive, Mail Drop E243-03
> Research Triangle Park, NC 27711 USA
> 
> Fax: 919-541-1379
> Voice: 919-541-0147
> <hdf5-H5Smpio_realloc.patch>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
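For readers following the thread, here is a minimal standalone sketch of the
realloc-through-a-temporary pattern used in the quoted code, with the missing
assignment in place. It is not the HDF5 source: span_bufs, span_bufs_grow and
span_bufs_free are hypothetical names, plain realloc stands in for
H5MM_realloc, and long/int stand in for MPI_Aint/MPI_Datatype.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the three parallel buffers in the quoted code. */
typedef struct {
    long   *disp;        /* stands in for MPI_Aint *disp */
    int    *blocklen;    /* block lengths */
    int    *inner_type;  /* stands in for MPI_Datatype *inner_type */
    size_t  alloc_count; /* number of elements each buffer can hold */
} span_bufs;

/* Double the capacity of all three buffers (0 grows to 1).
 * Returns 0 on success, -1 if any realloc fails. On failure every
 * pointer in *b is still valid and alloc_count is unchanged. */
int span_bufs_grow(span_bufs *b)
{
    long  *tmp_disp;
    int   *tmp_blocklen;
    int   *tmp_inner_type;
    size_t new_count;

    assert(b != NULL);
    new_count = b->alloc_count ? b->alloc_count * 2 : 1;

    if ((tmp_disp = realloc(b->disp, new_count * sizeof *tmp_disp)) == NULL)
        return -1;
    b->disp = tmp_disp;

    if ((tmp_blocklen = realloc(b->blocklen, new_count * sizeof *tmp_blocklen)) == NULL)
        return -1;
    b->blocklen = tmp_blocklen;

    if ((tmp_inner_type = realloc(b->inner_type, new_count * sizeof *tmp_inner_type)) == NULL)
        return -1;
    /* The step missing in the quoted code: without it the old pointer
     * may dangle and the new block is leaked. */
    b->inner_type = tmp_inner_type;

    b->alloc_count = new_count;
    return 0;
}

/* Release all three buffers and reset the struct to its empty state. */
void span_bufs_free(span_bufs *b)
{
    assert(b != NULL);
    free(b->disp);
    free(b->blocklen);
    free(b->inner_type);
    b->disp = NULL;
    b->blocklen = NULL;
    b->inner_type = NULL;
    b->alloc_count = 0;
}
```

Starting from a zero-initialized span_bufs, a caller would invoke
span_bufs_grow whenever its element index reaches alloc_count. Dropping the
inner_type assignment from this sketch should reproduce the kind of
invalid-access and leak reports from valgrind that the quoted message
describes.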
