Hi Tuan,
it might be more efficient to store such a time array as a dataset in the root group rather than as an attribute. Datasets have no size limit, whereas attributes are meant to be small and are limited in size. I'm not sure what the exact limit is; it could be something like 64k.
This dataset could be a one-dimensional array of a compound type containing the floating-point time value and a string holding the corresponding group name. That way you can read the array quickly and reach the group associated with each time, independent of the naming convention used for the groups; the names could even be random combinations of letters. Still, it might be good to treat this time dataset as a "cache" for attributes that are stored in the groups, so generating it could also be a postprocessing step that scans the file for groups with time attributes. This might be more efficient than re-creating or appending to the time dataset each time a new data set is added to the file, but this would need to be explored in practice. Iterating over groups is also pretty fast, even for large files, though how well it performs will depend on your use case.
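A minimal sketch of that idea, assuming Python with h5py and NumPy (neither is mentioned in this thread). The names `time_index`, `build_time_index`, `field_a`, `field_b` and the 32-byte cap on group names are made up for illustration. It scans the root-level groups for a "time" attribute and caches the result in one compound dataset:

```python
import h5py
import numpy as np

# Compound record: one (time, group name) pair per time step.
# Fixed-length byte strings keep the sketch simple; 32 bytes is an arbitrary cap.
index_dtype = np.dtype([("time", "f8"), ("group", "S32")])


def build_time_index(f, name="time_index"):
    """Scan root-level groups that carry a 'time' attribute and cache them in one dataset."""
    rows = [
        (float(obj.attrs["time"]), key.encode("ascii"))
        for key, obj in f.items()
        if isinstance(obj, h5py.Group) and "time" in obj.attrs
    ]
    table = np.array(rows, dtype=index_dtype)
    table.sort(order="time")  # order by physical time, not by group name
    if name in f:
        del f[name]  # the index is only a cache, so rebuilding it is safe
    return f.create_dataset(name, data=table)


# In-memory file so the sketch leaves nothing on disk.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    # Group names are deliberately arbitrary; only the attribute carries the time.
    for gname, t in [("qzx", 0.2), ("abc", 0.0), ("mmk", 0.1)]:
        g = f.create_group(gname)
        g.attrs["time"] = t
        g.create_dataset("field_a", data=np.zeros((4, 4)))
        g.create_dataset("field_b", data=np.ones((4, 4)))

    build_time_index(f)

    index = f["time_index"][...]  # a single read returns all time points
    times = index["time"]
    first_group = f[index["group"][0].decode("ascii")]
    first_shape = first_group["field_a"].shape
```

Because the per-group attribute stays the authoritative value, the index can be deleted and regenerated at any time, for instance once at the end of a run.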
Werner
On Fri, 16 Dec 2011 00:44:52 -0600, Hoang Trong Minh Tuan <[email protected]> wrote:
Dr. Werner,
Thanks a lot for your advice. Right now, each HDF file has several groups, and each group has 2 datasets, both corresponding to the same time step. So, based on your suggestion, I think the attributes (holding the time-step information) should be attached to the group.
However, I want to quickly read the time information into an array, so I'm thinking of putting the time points into an array that is an attribute of the root group. So, if the array is a[...] and each group has 10 datasets, then
a[1] is the time for dataset1 in group 1
a[2] ...... dataset2 in group 1
...
a[11] is the time for dataset1 in group 2
Do you think that should be fine? Also, is there a limit on the size of the data stored in attributes, or at least a good threshold?
Thanks,
Tuan
On Thu, Dec 15, 2011 at 12:54 PM, Werner Benger <[email protected]> wrote:
Hi Tuan,
if you have multiple datasets for the same time, then it would be better to attach the time information to the common group that contains them.
Using a double-valued attribute called "time" would do in the simplest case. If you need a more advanced specification of time, for instance units on the time scale, you could use a named type for this time unit in which such global properties are defined. This named type would best go in a group independent of those time groups, for instance a group without a time attribute, or the group that contains those time groups.
Possibly you might also want some "reverse lookup" for each dataset's name, like a table of the time values at which this dataset is available, in case this changes and you don't have all datasets defined at all times. This could be done with another group containing a subgroup for each dataset and symbolic links to the actual data, or via some dataset that provides the same information as a table. Symbolic links are simply more elegant. I don't think it's possible to make a dataset containing symbolic links, at most object references, but that's not the same.
Werner

On Thu, 15 Dec 2011 10:28:28 -0600, Hoang Trong Minh Tuan <[email protected]> wrote:
Hi Werner,
I've just successfully created an HDF5 file with multiple groups and multiple datasets. I have another question: what is the best way to attach the time information (or maybe some other information) to each dataset?
Tuan

On Thu, Dec 8, 2011 at 2:31 PM, Werner Benger <[email protected]> wrote:
Hi Tuan,
with that many time steps, you might want to organize the time hierarchically, like having a group of a hundred time groups, so 100 x 100 time groups cover the 10,000 timesteps. It's probably inefficient to have 10,000 timesteps or more in the same group, though I don't have experience (yet) with that scenario. It would also be inefficient if all your datasets per time step are pretty small. In that case it might be better to use a multidimensional dataset with one varying dimension, this dimension being the time, such that you can append data as new time steps arrive.
I don't use IDL, so I don't know which constraints IDL would impose on the HDF5 layout. If IDL is your primary target, it might be best to investigate which data layout IDL can handle best.
Werner

On Thu, 08 Dec 2011 07:01:36 -0600, Hoang Trong Minh Tuan <[email protected]> wrote:
Hi Dr. Werner,
I'm doing a simulation of cells. In this case, one group is a snapshot of the system at a single time point. As such, I will have tens of thousands of such groups in a file, or maybe multiple files, each containing thousands of groups. Also, I want to generate a video from these snapshots using IDL. Would your suggestion still be a reasonable approach, or should I do it in a different way? Thank you!
Bests,
Tuan
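
For the case of many small snapshots, the alternative raised in the reply quoted above (one multidimensional dataset whose varying dimension is time) might look like the following sketch. It assumes Python with h5py and NumPy, which are not part of this thread; the dataset names `frames` and `time`, the helper `append_step`, and the 4 x 4 snapshot size are hypothetical.

```python
import h5py
import numpy as np

ny, nx = 4, 4  # hypothetical snapshot size

# In-memory file so the sketch leaves nothing on disk.
with h5py.File("frames.h5", "w", driver="core", backing_store=False) as f:
    # Axis 0 is time; maxshape=None on that axis makes it extendable.
    frames = f.create_dataset("frames", shape=(0, ny, nx), maxshape=(None, ny, nx),
                              chunks=(1, ny, nx), dtype="f8")
    # Matching 1-D dataset holding the physical time of each frame.
    times = f.create_dataset("time", shape=(0,), maxshape=(None,),
                             chunks=(1024,), dtype="f8")

    def append_step(t, snapshot):
        """Grow both datasets by one entry and store the new time step."""
        n = frames.shape[0]
        frames.resize(n + 1, axis=0)
        frames[n] = snapshot
        times.resize(n + 1, axis=0)
        times[n] = t

    for step in range(3):
        append_step(0.1 * step, np.full((ny, nx), float(step)))

    n_steps = frames.shape[0]
    last_frame = frames[n_steps - 1]
    all_times = times[...]
```

With this layout, reading the whole time axis is a single read of the `time` dataset, and no per-step groups are needed.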
On Thu, Dec 8, 2011 at 2:28 AM, Werner Benger <[email protected]> wrote:
Hi Tuan,
why don't you put all datasets that belong to a specific time into a group, one group for each timestep, and attach the time information (physical time, seconds, float attribute) as an attribute to this group?
Werner

On Thu, 08 Dec 2011 00:40:34 -0600, Hoang Trong Minh Tuan <[email protected]> wrote:
Hi all,
I am doing a simulation in which I need to keep track of time information and 2D/3D data at each time step (I may have more than one array). My question is: what is the best way to store such data? Should I keep 2 separate datasets, one to store the time and one to store the 2D/3D data, or can I combine them into a special dataset (of a kind I don't know about)?
Thanks a lot,
Tuan
--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
211 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
