Hello,

A while ago I proposed an extension to PyTables to support files that use Dimension Scales. In the meantime I had to work on different projects, so I didn't have the time to respond. But now I have found time to work on it again, so here is my response to the last mail that we exchanged.
> Now, apart of some small technical details, the thing that I find the most
> 'arguable' is the fact that you have introduced a new dtype (typecode 'r')
> for NumPy (good trick, BTW). Is that strictly necessary? My worries is that
> NumPy would decide in the future to make use of the 'r' typecode, in which
> case, we would have a problem. Would not it be possible to use plain python
> nested lists for this? I find the latter preferable, but perhaps you have
> some use case for wanting a native NumPy typecode.

I thought about that (actually, I once even wrote a version like that, but you mentioned that PyTables now supports record arrays as attributes). The problem is that dimension scales use extremely complicated data structures (a table in an attribute containing references). Doing all that with standard Python data types is certainly possible, but the drawback is that it is really hard to write a generic solution that works for a wide range of use cases. I therefore ended up with very complicated and very specific code that could cope with little more than Dimension Scales. So whenever we want to add support for other new data structures, we would have to add new special code. The NumPy reference solution, on the other hand, is very generic: it will be easy to also introduce tables in datasets containing references, or other stuff.

I don't see NumPy using the typecode 'r' for something else as a big problem: hardly anyone introduces new NumPy datatypes, so we can just ask the NumPy guys to reserve the 'r' for us. Since they don't get many such requests (actually, I haven't found any), I guess they won't object.

> Also, I don't see that you have made a proper implementation of 'Dimension
> Scales' as understood in:
>
> http://ftp.hdfgroup.org/HDF5/Tutor/h5dimscale.html
>
> but only support for HDF5 references (but I suppose that, with a little more
> of work we can be there...)

Well, I wrote that once in an email.
I deliberately didn't support dimension scales directly, but added support for the necessary data structures to PyTables. This has several advantages. Firstly, my code runs without problems with HDF5 1.6, while dimension scales exist only from 1.8 on. Supporting dimension scales directly would also have meant that we create attributes that we cannot interpret, and that show up only as an unknown type. Actually, PyTables at some point wasn't able to open datasets containing dimension scales at all. Once one has support for references and for variable-length lists as attributes, it is trivial to implement dimension scales in Python.

> Finally, I miss some test units, but that should easy to solve.

Yep, I'm working on that. Especially some that create dimension scales... (I have written some already, they're just unreadable to anyone but me...)

> If you agree to work with that, I'd like to open a new public branch in the
> PyTables repository and give you commit permissions there. We can continue
> discussing the issues here so that other people can contribute with
> opinions, test units, docstrings or whatever.

That would be really cool.

Greetings,

Martin

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
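P.S. To make the "trivial in Python" remark a bit more concrete, here is a rough sketch of the bookkeeping that attaching a dimension scale involves. It is only a model: plain Python objects stand in for HDF5 datasets, attributes and object references, and the names Node, make_scale and attach_scale are made up for illustration (they are not PyTables API). The attribute names CLASS, NAME, DIMENSION_LIST and REFERENCE_LIST are the ones used by the HDF5 dimension scale convention.

```python
# Illustrative model only: paths stand in for HDF5 object references and a
# dict stands in for a node's attribute set. Nothing here is PyTables API.

class Node:
    """Stand-in for an HDF5 dataset with `ndim` dimensions."""

    def __init__(self, path, ndim=1):
        self.path = path
        self.ndim = ndim
        self.attrs = {}


def make_scale(scale, name=""):
    """Mark `scale` as a dimension scale via its CLASS and NAME attributes."""
    scale.attrs["CLASS"] = "DIMENSION_SCALE"
    scale.attrs["NAME"] = name


def attach_scale(dataset, scale, dim):
    """Attach `scale` to dimension `dim` of `dataset`, updating both nodes."""
    if not 0 <= dim < dataset.ndim:
        raise IndexError("dimension %d out of range" % dim)

    # DIMENSION_LIST: one variable-length list of references per dimension.
    dim_list = dataset.attrs.setdefault(
        "DIMENSION_LIST", [[] for _ in range(dataset.ndim)])
    if scale.path not in dim_list[dim]:
        dim_list[dim].append(scale.path)

    # REFERENCE_LIST: a table of (dataset reference, dimension index) rows,
    # i.e. the "table in an attribute containing references".
    ref_list = scale.attrs.setdefault("REFERENCE_LIST", [])
    if (dataset.path, dim) not in ref_list:
        ref_list.append((dataset.path, dim))


# Usage: attach a time axis to the first dimension of a 2-D dataset.
temperature = Node("/temperature", ndim=2)
time_axis = Node("/time")
make_scale(time_axis, "time")
attach_scale(temperature, time_axis, 0)
```

With real references and variable-length attributes available, the same two updates would be written against the actual attribute sets instead of dicts; the logic itself should stay about this small.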