On Saturday, 10 January 2015 at 20:55:05 UTC, Laeeth Isharc wrote:
Hi Aldanor.
I wrote a slightly longer reply, but mislaid the file somewhere.
I guess your question might relate to wrapping the HDF5 library
- something that I have already done in a basic way, although I
welcome your project, as no doubt we will get to a higher
quality eventual solution that way.
One question about accurately representing the HDF5 object
hierarchy. Are you sure you wish to do this rather than
present a flattened approach oriented to what makes sense to
make things easy for the user in the way that is done by h5py
and pytables?
In terms of the actual garbage generated by this library -
there are lots of small objects. The little ones are things
like a file access attribute, or a schema for a dataset. But
really the total size taken up by the small ones is unlikely to
amount to much for scientific computing or for quant finance if
you have a small number of users and are not building some kind
of public web server. I think it should be satisfactory for
the little objects just to wrap the C functions with a D
wrapper and rely on the object destructor calling the C
function to free memory. On the rare occasions when not, it
will be pretty obvious to the user and he can always call
destroy directly.
For the big ones, maybe reference counting brings enough value
to be useful - I don't know. But mostly you are either passing
data to HDF5 to write, or you are receiving data from it. In
the former case you pass it a pointer to the data, and I don't
think it keeps it around. In the latter, you know how big the
buffer needs to be, and you can just allocate something from
the heap of the right size (and if using reflection, type) and
use destroy on it when done.
So I don't have enough experience yet with either D or HDF5 to
be confident in my view, but my inclination is to think that
one doesn't need to worry about reference counting. Since
objects are small and there are not that many of them, relying
on the destructor to be run (manually if need be) seems likely
to be fine, as I understand it. I may well be wrong on this,
and would like to understand the reasons if so.
Laeeth.
Thanks for the reply. Yes, this concerns my HDF5 wrapper project;
the main concern is not the memory consumption, of course,
but rather explicitly controlling lifetimes of the objects
(especially objects like files -- so you can be sure there
are no zombie handles floating around). Most of the time when
you're doing some operations on an HDF5 file you want all handles
to get closed by the time you're done (i.e. by the time you leave
the scope) which feels natural (e.g. close groups, links etc).
Some operations in HDF5, particularly those related to
linking/unlinking/closing may behave differently if an object has
any child objects with open handles. In addition to that, the C
HDF5 library retains the right to reuse both the memory and id
once the refcount drops to zero so it's best to be precise about
that and keep a registry of weak references to all C ids that D
knows about (sort of the same way as h5py does in Python).
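
A minimal sketch of the scope-bound lifetime idea, assuming a non-copyable
struct whose destructor closes the id. The names fakeOpen, fakeClose, openIds
and Handle are hypothetical stand-ins so that the example runs without
linking against HDF5; they are not the real C API. A real wrapper would call
the matching H5*close function instead, and would need reference counting on
top if handles are to be shared.

```d
// Stand-in for the C hid_t; the real type is defined by the HDF5 headers.
alias hid_t = long;

// Hypothetical stand-in for the library's table of live ids.
bool[hid_t] openIds;
hid_t nextId = 1;

// Hypothetical stand-ins for the C open/close calls.
hid_t fakeOpen()
{
    auto id = nextId++;
    openIds[id] = true;
    return id;
}

void fakeClose(hid_t id)
{
    openIds.remove(id);
}

struct Handle
{
    hid_t id;

    // No copies, so exactly one owner is responsible for closing the id.
    @disable this(this);

    ~this()
    {
        // Handle.init has id == 0 and owns nothing, so skip it.
        if (id > 0)
            fakeClose(id);
    }
}

void main()
{
    {
        auto file = Handle(fakeOpen());
        auto group = Handle(fakeOpen());
        assert(openIds.length == 2);
    } // destructors run here in reverse order: group first, then file

    // No zombie handles survive the scope.
    assert(openIds.length == 0);
}
```

Disabling the postblit is the simplest way to guarantee a single close per
id; it says nothing about the weak-reference registry, which would sit on
top of something like this.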