> That is of great help! I had tried passing a dtype in createTable (not
> accepted), but I never thought of trying an np.ndarray with that
> dtype. If that works, it will solve my problem (will try at work
> tomorrow).

OK, I have now tried creating a table using a numpy.recarray in the
description field. On a close reading of sec. 4.2.3 of the Library
reference, regarding the createTable method, I see that this is indeed
documented there, as it is in the pydoc you get when you do, e.g.,
h5file.createTable? in an IPython shell.

What misled me was the Tutorial, chapter 3 in the User's Guide, which
I followed to get a good understanding. First, in sec. 3.1.5 "Creating
a new table" an example is shown where you use an IsDescription
instance as the description parameter, and later in the more
complicated example on p. 34-35 in sec. 3.4 "Multidimensional table
cells and automatic sanity checks", it is also shown that you can use
a dictionary with XXXCol() instances as the dictionary values.

From that example I got the impression that if you had a dtype, the
path to creating a table of that type would be to first create such a
dictionary or a Description instance, translating the dtype values to
XXXCol() instances, e.g., '<f8' to Float64Col(), which is why I
started looking for a factory method like
Description.from_dtype().
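
For reference, a minimal sketch of what the dtype already carries, using
only numpy. The field names "time" and "count" are hypothetical and
chosen just for illustration. It shows that each field of a structured
dtype exposes its type name and size, so '<f8' is an 8-byte (64-bit)
float.

```python
import numpy as np

# Hypothetical record layout, used only for illustration.
my_dtype = np.dtype([("time", "<f8"), ("count", "<i4")])

# Map each field name to (numpy type name, size in bytes).
fields = {
    name: (my_dtype[name].name, my_dtype[name].itemsize)
    for name in my_dtype.names
}

for name, (type_name, size) in fields.items():
    print(name, type_name, size)
```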

To avoid leading other numpy users along the same erroneous path (the
Library reference is not the first chapter you read from beginning to
end), I would suggest mentioning the recarray createTable path
somewhere in the tutorial, in sec. 3.1.5 or 3.4, or alternatively
writing a chapter or an appendix specifically targeted at the
PyTables/numpy interplay, as there seem to be many nifty tricks you
can use to get your work done easily and quickly if you are working
with numpy arrays together with PyTables.

I would also suggest extending the implementation of createTable such
that the description parameter can also accept a np.dtype instance.
Personally, I append to a table in chunks of, say, 1 MB to keep memory
usage down (I may have 1 GB in a single table), and I prefer to first
define the table and then append data (it seems like the cleanest
thing to do), so I first use

h5file.createTable(...,description=np.empty(0, dtype=my_dtype), ...)

to define the table based on a dtype, which seems kind of awkward, as
inside this method you are extracting the dtype from the array. Having
the possibility to do

h5file.createTable(...,description=my_dtype, ...)

seems more intuitive to me.
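
A minimal sketch of the define-then-append workflow described above.
Assumptions: it uses the snake_case names of later PyTables releases
(open_file, create_table) in place of the createTable spelling used in
this post, and the dtype, file name, node name and chunk size are
hypothetical.

```python
import os
import tempfile

import numpy as np
import tables

# Hypothetical record layout, used only for illustration.
my_dtype = np.dtype([("time", "<f8"), ("count", "<i4")])

path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with tables.open_file(path, mode="w") as h5file:
    # An empty structured array carries the dtype but no rows,
    # so the table is defined without holding any data.
    table = h5file.create_table(
        "/", "data", description=np.empty(0, dtype=my_dtype)
    )

    # Append in fixed-size chunks to keep memory usage bounded.
    chunk_rows = 1000
    for start in range(0, 3000, chunk_rows):
        chunk = np.zeros(chunk_rows, dtype=my_dtype)
        chunk["time"] = np.arange(start, start + chunk_rows)
        chunk["count"] = 1
        table.append(chunk)
    table.flush()

    nrows = table.nrows
    colnames = list(table.colnames)

print(nrows, colnames)
```

Only one chunk is held in memory at a time, and the table layout is
fixed before any data is written.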

Finally, this should not be considered as a rant, just some
observations from a new user.

My general impression is that the "product" is very well polished and
stable, the user's guide is well written, and the package is very, very
nice (it is an answer to my prayers and the end of a long, frustrating
search for a package that could meet my needs).

As mentioned in another post, there is an issue with installing LZO
for Windows users, plus a few other details, but overall very good.

I am actually thrilled about PyTables, and I have eagerly demonstrated
its capabilities to my co-workers today (first creating an HDF5 file
and then browsing and filtering it using ViTables), and they are
impressed as well. I think it is amazing that such a product is
available under a free license, and I would like to thank the creators
for taking the time to make it.

Cheers,
Kim
