Thank you for the response.
2013/3/5 <[email protected]>

> On 2013-02-27 21:01, Pradeep Jha wrote:
>
>> Thanks for the response. So from what I understand, the HDF5 Fortran
>> wrapper automatically transposes the matrix to store it in the C
>> storage convention. So putting "dims(1) = Nx" and "dims(3) = Nz" is
>> correct. The HDF5 Fortran wrapper is just transposing the data inside
>> the program before storing it in the final h5 format.
>>
>> But this is confusing me about something.
>>
>> I convert my original unformatted data written by Fortran to h5
>> format so that I can visualize the original data using software
>> (Paraview). Does that mean that the data I will visualize using
>> Paraview and the h5 data file will be a transposed version of what I
>> originally intended to visualize?
>
> Yes, if you are writing a multidimensional array using the Fortran APIs
> and you want to read the data back using the same dimensional array then
> you need to use the Fortran APIs. If you were to use a C program to read
> the HDF5 data back then you need to account for the transposing in the C
> program, or handle it when writing from Fortran.
>
> Here is an explanation from the archives:
>
>> HDF5 is a "self-describing" format, which means that HDF5 metadata
>> stored in a dataset object header allows the HDF5 C library, and any
>> non-C applications built on top of it, to retrieve raw data
>> (i.e. elements of a multidimensional array) in the correct order.
>>
>> (Let's for a second forget about HDF5, C and Fortran, Python and
>> Matlab :-) )
>>
>> If we have a matrix A(N,M,K), we usually count dimensions from left
>> to right saying that the first dimension has size N, the second
>> dimension has size M, the third dimension has size K, and so on.
>>
>> (Now let's talk about HDF5 but without referring to any language.)
>>
>> When we describe a matrix using an HDF5 dataspace object, we use the
>> same convention (i.e.
>> specifying dimensions from left to right): the
>> first dimension has size N, the second dimension has size M, the
>> third dimension has size K. (Aside: Please notice that this
>> description is valid for both C and Fortran HDF5 applications, i.e.
>> the C and Fortran dims array needed by H5Screate_simple
>> (h5screate_simple_f) will have the values dims[] = {N,M,K}.)
>>
>> The question is: how does HDF5 know how to interpret a blob of {N x
>> M x K x sizeof(datatype)} bytes of dataset raw data stored in the
>> file? Was A(N,M,K) stored? Or was A(K,N,M) stored? Or any other
>> permutation of (K,N,M)?
>>
>> An HDF5 file has no clue about matrices and their dimensions, or the
>> languages they were written from. It is the application's
>> responsibility to interpret data correctly and pass the correct
>> interpretation to the HDF5 C library to store in a file.
>>
>> As mentioned above, the dimensions of the matrix are described
>> using an HDF5 dataspace object and are stored in the file. d integers
>> P1, ..., Pd, where d is the rank of the matrix, are stored in a
>> dataspace object header according to the following convention: the
>> last value, Pd, is the size of the FASTEST changing dimension of the
>> matrix, i.e. the HDF5 file spec and the HDF5 C library follow the C
>> storage convention (no wonder, it is a C library :-). Therefore there
>> is no ambiguity in interpreting {N x M x K x sizeof(datatype)} bytes,
>> and an HDF5 file has enough information for any "row-major" or
>> "column-major" application to interpret the data correctly (including
>> bypassing the HDF5 C library and reading directly from the HDF5 file!)
>>
>> Here is what happens when the HDF5 Fortran library is used:
>>
>> Suppose we want to write an A(N,M,K) matrix to the HDF5 file. The HDF5
>> Fortran API describes the dataspace with the first dimension being N,
>> the second dimension being M, the third dimension being K (as we
>> would do it in C and any other language).
>> But the HDF5 Fortran API also knows
>> that the fastest changing dimension has size N (i.e. we have
>> column-major order). Therefore the HDF5 Fortran library instructs the
>> C library to store the values K,M,N in the dataspace object header
>> instead of N,M,K, since N is the size of the fastest changing
>> dimension.
>>
>> So, if a C application reads matrix A(N,M,K) (i.e. the N x M x K x
>> sizeof(datatype) blob) written from Fortran, it will read it into
>> the matrix B(K,M,N) (the C API that requests sizes of the first,
>> second and third dimensions will return the values K,M,N stored in
>> the dataspace header).
>>
>> If a Fortran application reads matrix A(N,M,K) written from Fortran,
>> it will read it once again into B(N,M,K) (the Fortran API that
>> requests sizes of the first, second and third dimensions will flip
>> the array K,M,N stored in the file and return N,M,K).
>>
>> In other words: the HDF5 library stores information about how to
>> interpret data. Interpretation follows the C storage convention: the
>> last dimension specified for the dataspace object is the fastest
>> changing one. It is the responsibility of the application (in this
>> case the Fortran HDF5 library) to interpret the order of dimensions
>> correctly and pass it to/from the HDF5 C library.
>>
>> Please notice that there is no need to transpose the data itself: one
>> only has to pass a correct interpretation of the data to the HDF5 C
>> library and make sure it is done according to the HDF5 C library
>> convention (the first value stored in the dataspace header
>> corresponds to the slowest changing dimension, ..., the last value
>> stored in the dataspace header corresponds to the fastest changing
>> dimension).
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
