HDF5 is a very valuable tool for those working with large data sets.


HDF5 is a unique technology suite that makes possible the management of extremely large and complex data collections. The HDF5 technology suite includes:

* A versatile data model that can represent very complex data objects and a wide variety of metadata.
* A completely portable file format with no limit on the number or size of data objects in the collection.
* A software library that runs on a range of computational platforms, from laptops to massively parallel systems, and implements a high-level API with C, C++, Fortran 90, and Java interfaces.
* A rich set of integrated performance features that allow for access time and storage space optimizations.
* Tools and applications for managing, manipulating, viewing, and analyzing the data in the collection.

The HDF5 data model, file format, API, library, and tools are open and distributed without charge.

[HDF5] lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want.

H5py uses straightforward NumPy and Python metaphors, like dictionary and NumPy array syntax. For example, you can iterate over datasets in a file, or check out the .shape or .dtype attributes of datasets. You don't need to know anything special about HDF5 to get started.
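As a rough illustration of the dictionary and array metaphors described above, here is a minimal h5py sketch. The file name, group name, dataset name, and attribute are all made up for this example, and the data is tiny; the same slicing syntax is what you would use on a much larger dataset.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

with h5py.File(path, "w") as f:
    # Groups behave like dictionaries; datasets behave like NumPy arrays.
    grp = f.create_group("measurements")
    data = np.arange(100, dtype="f8").reshape(10, 10)
    dset = grp.create_dataset("temperature", data=data)
    dset.attrs["units"] = "kelvin"  # free-form tagging via attributes

with h5py.File(path, "r") as f:
    dset = f["measurements/temperature"]  # dictionary-style lookup
    shape, dtype = dset.shape, dset.dtype
    block = dset[2:4, :3]                 # reads only this region from disk
    names = list(f["measurements"])       # iterate over a group like a dict
```

The point of the sketch is that nothing HDF5-specific appears beyond opening the file: lookup, slicing, and iteration all use ordinary Python and NumPy syntax.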

In addition to the easy-to-use high level interface, h5py rests on an object-oriented Cython wrapping of the HDF5 C API. Almost anything you can do from C in HDF5, you can do from h5py.

Best of all, the files you create are in a widely-used standard binary format, which you can exchange with other people, including those who use programs like IDL and MATLAB.

As far as I know there has not really been a complete set of HDF5 bindings for D yet.

Bindings should have three levels:
1. pure C API declaration
2. 'nice' D wrapper around C API (e.g. one that knows about strings, not just char*)
3. idiomatic D interface that uses CTFE/templates

I borrowed Stefan Frijter's work on (1) above to get started. I cannot keep track of things when split over too many source files, so I put everything in one file - hdf5.d.

I have implemented a basic version of (2). It includes throwOnError rather than forcing C-style status checking, but the exception code is not very good or complete (lack of time, and lack of experience with D exceptions).
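To show the idea behind throwing on error instead of checking status codes, here is a small sketch of the pattern in Python rather than D. It is not the code in hdf5.d: `fake_h5_open`, `HDF5Error`, `throw_on_error`, and `open_file` are hypothetical names invented for this illustration. The only fact it relies on is that HDF5 C functions signal failure with a negative return value.

```python
class HDF5Error(Exception):
    """Raised when a wrapped C-style call reports failure (hypothetical type)."""


def fake_h5_open(name):
    # Hypothetical stand-in for a level (1) C declaration: returns a
    # non-negative handle on success and a negative value on failure,
    # which is the convention the HDF5 C API follows.
    return 42 if name.endswith(".h5") else -1


def throw_on_error(status, what):
    # Level (2) helper: turn a negative status into an exception so that
    # callers do not have to check every return value by hand.
    if status < 0:
        raise HDF5Error(f"{what} failed with status {status}")
    return status


def open_file(name):
    # Level (2) wrapper: takes a native string and raises on failure.
    return throw_on_error(fake_h5_open(name), f"open({name!r})")
```

With a wrapper like this, `open_file("data.h5")` returns the handle directly, while `open_file("data.txt")` raises `HDF5Error` instead of handing back a status the caller might forget to check.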

(3) will have to come later.

It's more or less complete, and the examples I have translated so far mostly work. But it is still a work in progress. Any help/suggestions appreciated. [I am doing this for myself, so the project is not as pretty as I would like in an ideal world].
