Hi Tuan,

Please take a look at the netCDF4 solution; it should give you a good idea of how to implement this. If you would rather have a pure HDF5 solution, look at H5DS (the HDF5 Dimension Scales API), which was developed for exactly the situation you describe.

In practice, I always implement it this way (and yes, I also prefer to have "time" as a dataset, not an attribute):

   /Group1/dataset1[T-dim,Y-dim,X-dim]     <== note that the dimensions are in C order
           dataset2[T-dim,Y1-dim,X1-dim]   <== Y1-dim can differ from Y-dim, and X1-dim from X-dim
           time[T-dim]
   /Group2/dataset3[T1-dim,Y-dim,X-dim]
           dataset4[T1-dim,Y-dim,X1-dim]
           time[T1-dim]
   ...
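
A minimal sketch of this layout in Python, assuming the h5py package (its dimension-scale support is built on H5DS). The helper name build_group, the example times, and the array shapes are made up for illustration; they are not from the thread.

```python
import h5py
import numpy as np


def build_group(f, group_name, times, yx_shapes):
    """Create one group holding a shared time[T] dataset plus several [T, Y, X] datasets.

    yx_shapes maps a dataset name to its (Y, X) extent, so datasets in the
    same group may differ in Y and X but share the T dimension.
    """
    g = f.create_group(group_name)
    t = g.create_dataset("time", data=np.asarray(times, dtype="f8"))
    t.make_scale("time")  # mark time[T] as a dimension scale
    for name, (ny, nx) in yx_shapes.items():
        d = g.create_dataset(name, shape=(len(times), ny, nx), dtype="f4")
        d.dims[0].label = "time"
        d.dims[0].attach_scale(t)  # dimension 0 (slowest, C order) is time
    return g


if __name__ == "__main__":
    # In-memory file (core driver without backing store), nothing is written to disk.
    with h5py.File("layout_demo.h5", "w", driver="core", backing_store=False) as f:
        g = build_group(f, "Group1", [0.0, 0.5, 1.0],
                        {"dataset1": (4, 5), "dataset2": (6, 7)})
        print(g["dataset1"].dims[0][0].name)
```

A reader can then find the time axis of any dataset through its first dimension, without relying on naming conventions.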

Greetings, Richard

On 12/16/2011 09:23 AM, Werner Benger wrote:
Hi Tuan,

it might be more efficient to store such a time array as a dataset in the root group rather than as an attribute. Datasets have no size limit, whereas attributes are meant to be small and are limited in size. I am not sure what the limit is; it could be around 64 KB.

This dataset could be a one-dimensional array of a compound type, with each element holding the floating-point time value and a string with the name of the corresponding group. That way you can read the array quickly and access the group associated with each time, independent of the naming convention used for the groups; the names could even be random combinations of letters. Still, it might be good to treat this time dataset as a "cache" for attributes that are stored in the groups, so that it can be generated in a postprocessing step that scans the groups in the file for time attributes. This might be more efficient than re-creating or appending to the time dataset each time a new dataset is added to the file, but that would need to be explored in practice. Iterating over groups is also pretty fast, even for large files, though how well it performs will depend on your use case.
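
The postprocessing step could be sketched as follows in Python with h5py. The names build_time_index and time_index, the "time" attribute name, and the 32-byte limit on group names are assumptions chosen for this example only.

```python
import h5py
import numpy as np

# Compound element: the time value plus the name of the group it belongs to.
# Assumption: group names are ASCII and at most 32 bytes (longer names would be truncated).
INDEX_DTYPE = np.dtype([("time", "f8"), ("group", "S32")])


def build_time_index(f, attr="time", name="time_index"):
    """Scan the root-level groups for a time attribute and cache the result as one dataset."""
    rows = [
        (float(obj.attrs[attr]), key.encode("ascii"))
        for key, obj in f.items()
        if isinstance(obj, h5py.Group) and attr in obj.attrs
    ]
    rows.sort()  # order by time, independent of how the groups are named
    if name in f:
        del f[name]  # regenerate the cache from scratch
    return f.create_dataset(name, data=np.array(rows, dtype=INDEX_DTYPE))


if __name__ == "__main__":
    with h5py.File("index_demo.h5", "w", driver="core", backing_store=False) as f:
        f.create_group("qzx").attrs["time"] = 0.2
        f.create_group("abc").attrs["time"] = 0.1
        for t, g in build_time_index(f)[...]:
            print(t, g.decode("ascii"))
```

Reading the whole index is then a single dataset read, and each entry leads directly to its group via f[name].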

  Werner

On Fri, 16 Dec 2011 00:44:52 -0600, Hoang Trong Minh Tuan <[email protected]> wrote:

    Dr. Werner,
      Thanks a lot for your advice. Right now, each HDF file has some
    groups, and each group has 2 datasets, both corresponding to the
    same time step. So, based on your suggestion, I think the attributes
    (holding the time step information) should be attached to the group.
    However, I also want to read the time information quickly into an
    array, so I'm thinking of putting the time points into an array
    stored as an attribute of the root group. So, if the array is
    a[...] and each group has 10 datasets, then
    a[1] is the time for dataset1 in group 1
    a[2]......                   dataset2 in group 1
    ...
    a[11] is the time for dataset1 in group 2
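
The index arithmetic implied by this numbering can be sketched in a few lines of Python. The function names are hypothetical; the 1-based numbering and the 10 datasets per group follow the example above.

```python
def flat_index(group, dataset, datasets_per_group=10):
    """Map a 1-based (group, dataset) pair to its 1-based position in a[...]."""
    return (group - 1) * datasets_per_group + dataset


def group_and_dataset(index, datasets_per_group=10):
    """Inverse mapping: 1-based position in a[...] back to (group, dataset)."""
    group, dataset = divmod(index - 1, datasets_per_group)
    return group + 1, dataset + 1


if __name__ == "__main__":
    print(flat_index(2, 1))       # position of dataset1 in group 2
    print(group_and_dataset(11))  # and back again
```

Note that this mapping only works while every group holds the same number of datasets.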

    Do you think that should be fine? Also, is there a limit on the
    size of the data stored in an attribute, or at least a good threshold?

    Thanks,
    Tuan





    On Thu, Dec 15, 2011 at 12:54 PM, Werner Benger
    <[email protected]> wrote:

        Hi Tuan,

        if you have multiple datasets for the same time, then it would
        be better to attach the time information to the common group
        that contains them.

        A double-valued attribute called "time" would do in the
        simplest case. If you need a more advanced specification of
        time, for instance with units on the time scale, you could
        use a named type for the time unit, where such global
        properties are defined. This named type would best go in a
        group independent of the time groups, for instance a group
        without a time attribute, or the group which contains the time
        groups.

        You might also want some "reverse lookup" for each dataset's
        name, such as a table of the time values at which this dataset
        is available, in case this changes and you don't have all
        datasets defined at all times. This could be done with another
        group containing a subgroup for each dataset, with symbolic
        links to the actual data, or with a dataset that provides
        the same information as a table. Symbolic links are more
        elegant, though. I don't think it's possible to make a dataset
        containing symbolic links; at most it can hold object
        references, but that's not the same.
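
A rough sketch of the symbolic-link variant in Python with h5py, where soft links are created with h5py.SoftLink. The function name add_reverse_lookup, the lookup group name by_dataset, and the assumption that every other root-level group is a time group are illustrative only.

```python
import h5py


def add_reverse_lookup(f, lookup_root="by_dataset"):
    """For every /<time group>/<dataset>, add a soft link /by_dataset/<dataset>/<time group>.

    Assumption: every root-level group other than lookup_root is a time group.
    """
    time_groups = [k for k, v in f.items()
                   if isinstance(v, h5py.Group) and k != lookup_root]
    lookup = f.require_group(lookup_root)
    for gname in time_groups:
        for dname, obj in f[gname].items():
            if isinstance(obj, h5py.Dataset):
                # One subgroup per dataset name, one link per time group that has it.
                lookup.require_group(dname)[gname] = h5py.SoftLink(f"/{gname}/{dname}")
    return lookup


if __name__ == "__main__":
    with h5py.File("lookup_demo.h5", "w", driver="core", backing_store=False) as f:
        f.create_dataset("t0000/density", data=[[0.0]])
        f.create_dataset("t0001/density", data=[[1.0]])
        f.create_dataset("t0001/pressure", data=[[2.0]])
        add_reverse_lookup(f)
        print(sorted(f["by_dataset/density"].keys()))
```

Listing the members of /by_dataset/<name> then shows at which times that dataset exists, and following a link opens the actual data.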

            Werner


        On Thu, 15 Dec 2011 10:28:28 -0600, Hoang Trong Minh Tuan
        <[email protected]> wrote:

            Hi Werner,
               I've just successfully created an HDF5 file with multiple
            groups and multiple datasets. I have another question: what
            is the best way to attach the time information (or maybe
            other information) to each dataset?


            Tuan

            On Thu, Dec 8, 2011 at 2:31 PM, Werner Benger
            <[email protected]> wrote:

                 Hi Tuan,

                  with that many time steps, you might want to
                organize the time hierarchically, for example with
                groups of a hundred time groups each, so that 100 x 100
                time groups cover the 10,000 timesteps. It's probably
                inefficient to have 10,000 timesteps or more in the
                same group, though I don't have experience (yet) with
                that scenario. It would also be inefficient if all your
                datasets per time step are pretty small. In that case it
                might be better to use a multidimensional dataset
                with one extendable dimension, this dimension being
                the time, so that you can append new data as it
                arrives.
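
The two-level grouping could be addressed with a small helper like the one below (plain Python). The function name, the 0-based step numbering, and the block_/step_ naming scheme are invented for this sketch; any scheme that splits the step number the same way would do.

```python
def timestep_path(step, per_group=100):
    """Return the HDF5 path of a 0-based timestep in a two-level group hierarchy.

    With per_group=100, steps 0..9999 are spread over 100 parent groups
    that each hold 100 time groups.
    """
    block, offset = divmod(step, per_group)
    return f"/block_{block:04d}/step_{offset:04d}"


if __name__ == "__main__":
    print(timestep_path(4217))
```

Zero-padded names keep the groups in numeric order when they are listed alphabetically.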

                 I don't use IDL, so I don't know which constraints
                IDL would impose on the HDF5 layout. If IDL is your
                primary target, it might be best to investigate which
                data layout IDL handles best.

                   Werner



                On Thu, 08 Dec 2011 07:01:36 -0600, Hoang Trong Minh
                Tuan <[email protected]> wrote:

                    Hi Dr. Werner,
                       I'm simulating cells. In this case, one group
                    is a snapshot of the system at a single time
                    point. As such, I will have tens of thousands of
                    such groups in a file, or maybe multiple files,
                    each containing thousands of groups. Also, I want
                    to generate a video from these snapshots using
                    IDL. Would your suggestion still be a reasonable
                    approach, or should I do it a different way?
                    Thank you!

                    Bests,
                    Tuan


                    On Thu, Dec 8, 2011 at 2:28 AM, Werner Benger
                    <[email protected]> wrote:

                        Hi  Tuan,

                         why don't you put all datasets which belong
                        to a specific time into a group, one group for
                        each timestep, and attach the time information
                        (physical time in seconds, as a float attribute)
                        to this group?

                           Werner

                        On Thu, 08 Dec 2011 00:40:34 -0600, Hoang
                        Trong Minh Tuan <[email protected]> wrote:

                            Hi all,
                               I am running a simulation in which I
                            need to keep track of time information and
                            2D/3D data at each time step (I may have
                            more than one array). My question is: what
                            is the best way to store such data? Should
                            I keep 2 separate datasets, one to store
                            the time and one to store the 2D/3D data,
                            or can I combine them into a special
                            dataset (of a kind I don't know about)?
                               Thanks a lot,


                            Tuan




                        --
                        ___________________________________________________________________________
                        Dr. Werner Benger Visualization Research
                        Laboratory for Creative Arts and Technology (LCAT)
                        Center for Computation & Technology at
                        Louisiana State University (CCT/LSU)
                        211 Johnston Hall, Baton Rouge, Louisiana 70803
                        Tel.: +1 225 578 4809 Fax.: +1 225 578-5362

















_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
