Hi Jeff, I had a look at your document and have some comments:
(page 2) It looks like you intend for your "schema" to be a human-readable description of how the data are organized and not a formal specification that can be used to validate an HDF5 file via a check tool of some sort (as in an XML schema). Although I'm sure this is informally useful to you, this lack of a machine-verifiable formal specification would be a major weakness of HDFds.

(page 3) HDF5 not specifying how user metadata should be structured is not really a "limitation" of HDF5. Different users will have differing ideas about what metadata is important, so we don't lock people into a particular arrangement.

(page 3) You can store mixed types in an attribute using a compound type.

(page 3) Encoding your metadata as a JSON object is similar to storing parseable strings in database tables: you aren't leveraging the strength of the platform. In the grand scheme of things, it probably isn't a big deal to store your metadata as JSON strings (especially if they are small and infrequently accessed), and maybe that fits well into your code, but the more HDF5-centered way to store that metadata would be as several independent attributes.

(page 4) HDF5 specifies references to nodes as absolute paths. You can use region references to refer to subsets of a dataset. HDF5 also supports external links to other files.

(page 4) The term "settings" is probably too experimentally oriented for general HDF5 use. What does "settings" mean in a file that stores phylogenetic tree data or patient history data?

(page 5) The idea of associating attributes with collections of objects in the HDF5 file is an interesting one, though I'm not sure how to cleanly handle that off the top of my head. Definitely something to keep in mind. I would want to handle that inside the library, though, and not via easily broken parseable string attributes.
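To make the page 3 comments about attributes concrete, here is a minimal sketch, assuming Python with h5py and NumPy. It stores the same metadata three ways: as a JSON string, as independent attributes, and as a single compound-type attribute. The dataset name ("recording"), the attribute names, and the values are hypothetical and are not taken from the HDFds document.

```python
import json

import h5py
import numpy as np


def write_metadata(f):
    """Attach the same (made-up) metadata to a dataset in three ways."""
    dset = f.create_dataset("recording", data=np.arange(10, dtype="f8"))

    # 1. JSON string: HDF5 only sees a string, so every reader must parse it.
    dset.attrs["metadata_json"] = json.dumps({"rate_hz": 30000.0, "channel": 3})

    # 2. Independent attributes: each value is stored with its own HDF5 type.
    dset.attrs["rate_hz"] = 30000.0
    dset.attrs["channel"] = 3

    # 3. Compound type: mixed types held together in one attribute.
    compound = np.dtype([("rate_hz", "f8"), ("channel", "i4")])
    dset.attrs.create("acquisition", np.array((30000.0, 3), dtype=compound))
    return dset


if __name__ == "__main__":
    # In-memory file (core driver, no backing store): nothing is written to disk.
    with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
        dset = write_metadata(f)
        print(json.loads(dset.attrs["metadata_json"]))  # parsed by the reader
        print(dset.attrs["rate_hz"], dset.attrs["channel"])  # typed values
        print(dset.attrs["acquisition"]["channel"])  # field of the compound
```

With the second and third forms, generic HDF5 tools can display and type-check the values without knowing anything about the JSON layout.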
Unfortunately, I can't really weigh in on what sort of similar work has been done in this area (others at THG will have to do that) but there's clearly a need for some sort of formal, verifiable HDF5 schema.

Cheers,
Dana

On Tue, Jan 22, 2013 at 1:42 PM, Jeff Teeters <[email protected]> wrote:
> I'm part of a group which is working to develop standards for sharing
> neuroscience electrophysiology data using HDF5. As part of this effort we
> developed proposed conventions for using HDF5 for data sharing which are
> independent of any domain. A main goal of these conventions is to provide
> a standard way of specifying schemata that describe data and metadata
> within an HDF5 file. We named these proposed conventions, HDFds - (ds for
> data sharing).
>
> I'm concerned that some of what is proposed might overlap with previous
> work or that there might be better ways of achieving the desired
> functionality. I would very much appreciate any feedback about these
> proposed conventions, either to the mailing list or to me directly.
>
> Thanks,
> Jeff
>
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
