See comments embedded below...

From: Hdf-forum <[email protected]> on behalf of "Dogrul, Can@DWR" <[email protected]>
Reply-To: HDF Users Discussion List <[email protected]>
Date: Wednesday, February 24, 2016 7:55 AM
To: "[email protected]" <[email protected]>
Subject: [Hdf-forum] Many datasets in an HDF5 file
> Hello,
>
> I am planning to store potentially tens of thousands of 2-D datasets, along with their relevant attributes, in an HDF5 file. Each dataset will have the same number of rows but a different number of columns. I am rather new to HDF5, so I thought I'd ask about any potential pitfalls before diving into coding. Are there any memory or performance issues I should be concerned about due to the large number of datasets being dealt with?

If it's practical, I think you would want to try to distribute the datasets among several groups in a group hierarchy of some modest depth, maybe 2-6 levels depending on dataset count. Putting all datasets in a single group is probably not the best approach, as it leads to a rather large single structure necessary to manage all the members of that group.

> And what about the file size? Currently, the data is stored in native Fortran sequential binary format. The file sizes range from tens of GBytes to over 100 GBytes depending on the application they are generated from.
> Should I expect a file size that is much larger than its Fortran binary counterpart, or about the same size?

That depends. Is this a *lot* of tiny datasets, or a lot of large-ish datasets? I think dataset header overheads are on the order of half a kilobyte. It's worse if you chunk the datasets (i.e. use the chunked storage layout, H5D_CHUNKED, set via H5Pset_chunk when you create the datasets). If your average dataset size is, say, 20x that (i.e. >= 10 KB), then I think the file size difference will NOT be significant.

Hope that helps.

Jon

> Any information would be greatly appreciated.
>
> Thanks,
>
> **************************************************
> Emin C. Dogrul, Ph.D., P.E.
> Water Resources Engineer
> Hydrologic Models Development Unit
> California Department of Water Resources
> Bay-Delta Office
> 1416 9th Street, Rm 252A
> Sacramento, CA 95814
> Phone: (916) 654 7018
> Fax: (916) 653 6077
> e-mail: [email protected]
> **************************************************
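The grouping advice above can be sketched in a few lines. This is a pure-Python illustration of one possible scheme (the `group_path` helper and its fanout/depth parameters are hypothetical, not part of any HDF5 API); actual group and dataset creation would go through the HDF5 library itself, e.g. `h5gcreate_f` from Fortran or `create_group`/`create_dataset` in h5py.

```python
# Sketch: spread many dataset names across a modest group hierarchy instead
# of one flat group. Hypothetical naming scheme; real creation would use the
# HDF5 API (h5gcreate_f in Fortran, or h5py's create_group/create_dataset).

def group_path(index, fanout=64, depth=2):
    """Map a dataset index to a path like '/g00/g57/ds12345'."""
    parts = []
    i = index
    for _ in range(depth):
        parts.append("g%02d" % (i % fanout))
        i //= fanout
    return "/" + "/".join(reversed(parts)) + "/ds%d" % index

# With fanout=64 and depth=2 there are 64*64 = 4096 leaf groups, so even
# 50,000 datasets average only ~12 members per group instead of 50,000
# members in one flat group.
print(group_path(0))      # -> /g00/g00/ds0
print(group_path(12345))  # -> /g00/g57/ds12345
```

The fanout and depth are tunable: deeper or wider hierarchies trade a few extra group lookups for smaller per-group member lists.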
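The "20x the per-dataset overhead" rule of thumb above can be checked with a quick back-of-the-envelope calculation. Note that the ~0.5 KB figure is the estimate from the reply, not an exact HDF5 constant, and this ignores chunked-storage B-tree overhead:

```python
# Back-of-the-envelope metadata overhead, assuming ~0.5 KB of header
# metadata per dataset (the reply's rough estimate, not an exact figure).

HEADER_OVERHEAD = 512  # bytes of metadata per dataset (assumed)

def overhead_fraction(n_datasets, avg_dataset_bytes):
    """Fraction of the total file taken up by per-dataset metadata."""
    meta = n_datasets * HEADER_OVERHEAD
    data = n_datasets * avg_dataset_bytes
    return meta / (meta + data)

# Tens of thousands of tiny (512-byte) datasets: metadata is half the file.
print(f"{overhead_fraction(50_000, 512):.0%}")        # -> 50%
# Average dataset >= 20x the overhead (~10 KB): overhead becomes negligible.
print(f"{overhead_fraction(50_000, 10 * 1024):.1%}")  # -> 4.8%
```

So for the file sizes mentioned (tens of GB across tens of thousands of datasets, i.e. roughly MB-scale datasets on average), the HDF5 file should come out close to the size of the Fortran binary original.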
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
