I don't think you've read h5py source in enough detail :)

You're right - I haven't done more than browse it.

It's based HEAVILY on duck typing.

There is a question here about what to do in D. On the one hand, the flexibility of being able to open a foreign HDF5 file where you don't know the dataset type beforehand is very nice. On the other, the adaptations needed to handle this flexibly get in the way when you are dealing with your own data that has a set format and where recompilation is acceptable if it changes. Judging by how awkward it is to process JSON through a dynamic interface, even using vibe.d, I think one will need to implement both eventually, but perhaps starting with static typing.


In addition, it has way MORE classes than the C++ hierarchy does. E.g., the high-level File object actually has these parents: File : Group; Group : HLObject, MutableMappingWithLock; HLObject : CommonStateObject. Internally, the File also keeps a reference to its file id, which is an instance of FileID, which inherits from GroupID, which inherits from ObjectID. Do I need to continue?

Okay - I guess there is a distinction between the interface to the outside world (where I think the h5py way is superior for most uses) and the implementation. Isn't the main reason h5py has lots of classes that this is how you write good code in Python? In many cases that is not true in D (not that you should ban classes, but often structs plus free functions are more suitable).

PyTables, by contrast, is quite badly written (although it works quite well and there are brilliant folks on the dev team like Francesc Alted) and looks like a dump of C code interwoven with hackish Python code.

Interesting. What do you think is low quality about the design?

In h5py you can do things like file["/dataset"].write(...) --> this just wouldn't work as is in a strictly typed language since the indexing operator generally returns you something of a Location type (or an interface, rather) which can be a group/datatype/dataset which is only known at runtime.

Well, if you don't mind recompiling your code when the dataset type changes (or you encounter a new dataset) then you can do that (which is what I posted a link to earlier).

It depends on your use case. It's hard to think of an application more dynamic than web sites, and yet people seem happy enough with vibe.d's use of compiled Diet templates as the primary implementation. They would like the option of dynamic ones too, and I think the same holds in this domain, since one does look at foreign data on occasion. One could of course use D's quick compilation to regenerate parts of the code when this happens. For some uses that might be okay, but obviously it is no good if you are writing a generic H5 browser/charting tool.

So I think that if you don't allow static dataset typing, the flexibility of dynamic typing gets in the way for some uses (which might be most of them), but you do need to offer dynamic typing as well.
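To make the two routes concrete, here is a minimal sketch in D. Everything in it is hypothetical: StoredDataset, readAs and readDynamic are names made up for illustration, the data lives in memory, and no real HDF5 calls are involved. It only shows the shape of a compile-time typed read next to a runtime typed one.

```d
import std.exception : enforce, assertThrown;
import std.variant : Variant;

// Hypothetical in-memory stand-in for a stored dataset:
// the element type recorded in the file, plus the raw values.
struct StoredDataset
{
    string elementType; // e.g. "int" or "double"
    double[] raw;
}

// Static route: the caller fixes T at compile time.
// A mismatch with the stored type is reported when reading.
T[] readAs(T)(StoredDataset ds)
{
    enforce(ds.elementType == T.stringof,
            "stored element type is " ~ ds.elementType);
    T[] result;
    foreach (v; ds.raw)
        result ~= cast(T) v;
    return result;
}

// Dynamic route: inspect the recorded type at runtime and box each value.
Variant[] readDynamic(StoredDataset ds)
{
    Variant[] result;
    foreach (v; ds.raw)
    {
        if (ds.elementType == "int")
            result ~= Variant(cast(int) v);
        else
            result ~= Variant(v);
    }
    return result;
}

void main()
{
    auto ds = StoredDataset("int", [1.0, 2.0, 3.0]);

    assert(readAs!int(ds) == [1, 2, 3]);     // type known at compile time
    assertThrown(readAs!double(ds));         // wrong guess fails at runtime
    assert(readDynamic(ds)[0].get!int == 1); // type discovered at runtime
}
```

The static route gives you plain typed arrays for your own data; the dynamic route is what a generic browser would need, at the cost of unboxing every value.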

Shall we move this to a different thread and/or email? I am afraid I have hijacked the poor original poster's request.

On the refcounting question, I confess that I do not fully understand your concern, which may well reflect a lack of deep experience with D on my part. Adam Ruppe suggests that it's generally okay to rely on a struct destructor to call C cleanup code. I can appreciate this may not be true with h5 and, if you can spare the time, I would love to understand more precisely why not.
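For reference, the simple case I have in mind looks roughly like the sketch below. It is only an illustration under stated assumptions: fakeClose is a made-up stand-in for a C cleanup call such as H5Fclose, FileHandle is a hypothetical wrapper, and copying is disabled so that cleanup runs exactly once. It does not cover handles that are shared between several owners, which is presumably where refcounting comes in.

```d
// Counts cleanup calls so the behaviour can be checked below.
int closeCalls;

// Hypothetical stand-in for a C cleanup function such as H5Fclose.
void fakeClose(int id)
{
    ++closeCalls;
}

// Hypothetical wrapper that owns one handle and releases it on scope exit.
struct FileHandle
{
    private int id = -1;

    this(int id)
    {
        this.id = id;
    }

    // No copies: otherwise two structs would close the same handle.
    @disable this(this);

    ~this()
    {
        if (id >= 0)
        {
            fakeClose(id);
            id = -1;
        }
    }
}

void main()
{
    {
        auto f = FileHandle(42);
        assert(closeCalls == 0); // still open inside the scope
    } // f goes out of scope here and its destructor calls fakeClose

    assert(closeCalls == 1);
}
```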

Out of all of them, only the dataset supports the write method but you don't know it's going to be a dataset. See the problem?

In this case I didn't quite follow. Where does this fall down?

void h5write(T)(Dataset x, T data)
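Spelled out a little more, and purely as a toy sketch: Kind, Location, lookup and asDataset below are names I have made up, nothing here calls HDF5, and h5write takes a slice and a ref parameter only because this toy keeps the data inside the struct. The idea is that the lookup returns something whose kind is known only at runtime, one explicit check turns it into a Dataset, and from then on the write is an ordinary statically typed free function.

```d
import std.algorithm.searching : startsWith;
import std.exception : enforce, assertThrown;

enum Kind { group, dataset }

// What an index lookup hands back: the kind is known only at runtime.
struct Location
{
    Kind kind;
    string path;
}

struct Dataset
{
    string path;
    double[] storage; // toy storage; a real wrapper would hold an HDF5 handle
}

// Toy lookup: pretend that everything under "/data" is a dataset.
Location lookup(string path)
{
    return Location(path.startsWith("/data") ? Kind.dataset : Kind.group, path);
}

// The single runtime check: convert a Location to a Dataset, or throw.
Dataset asDataset(Location loc)
{
    enforce(loc.kind == Kind.dataset, loc.path ~ " is not a dataset");
    return Dataset(loc.path);
}

// Statically typed write as a free function on Dataset.
void h5write(T)(ref Dataset x, T[] data)
{
    x.storage.length = 0;
    foreach (v; data)
        x.storage ~= cast(double) v;
}

void main()
{
    auto ds = asDataset(lookup("/data/prices"));
    ds.h5write([1, 2, 3]); // UFCS call, T deduced as int
    assert(ds.storage == [1.0, 2.0, 3.0]);

    // Writing to a group is rejected at the conversion step.
    assertThrown(asDataset(lookup("/group")));
}
```

With UFCS the call site reads much like the h5py one, except that the "is this really a dataset?" question is answered once, at asDataset, rather than at the write.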


I have your email somewhere and will drop you a line. Or you can email me at laeeth at laeeth.com. And let's create a new thread.



Laeeth.
