On Monday 01 February 2010 15:15:45 Jon Olav Vik wrote:
> I'd like suggestions on how to work around
> NotImplementedError: record arrays with columns with type description
>  ``([('a', '<f8'), ('b', '<f8')],(3,))`` are not supported yet, sorry
> 
> I have multivariate time-series with a fixed number of time-steps stored as
> numpy structured arrays. For example, two records with a field called "y"
>  that contains an array of shape (3,) with items having fields "a" and "b".
> 
> (¤¤¤ = >>> to get through Gmane's top-posting filter)
> 
> ¤¤¤ x
> rec.array([([(0.0, 1.0), (2.0, 3.0), (4.0, 5.0)],),
>        ([(6.0, 7.0), (8.0, 9.0), (10.0, 11.0)],)],
>       dtype=[('y', [('a', '<f8'), ('b', '<f8')], 3)])
> ¤¤¤ x.y.a
> array([[  0.,   2.,   4.],
>        [  6.,   8.,  10.]])
> 
> However, it seems PyTables cannot deal with a column that contains an array
>  of a structured dtype. Example:
> 
> import numpy as np
> import tables as pt
> dtype = [("y", [("a", float), ("b", float)], 3)]
> x = np.arange(12.0).view(dtype, np.recarray)
> x.y.a[1] # array([  6.,   8.,  10.])
> f = pt.openFile("testnest.h5", "w")
> t = f.createTable(f.root, "t", x)
> Traceback (most recent call last):
> NotImplementedError: record arrays with columns with type description
>  ``([('a', '<f8'), ('b', '<f8')],(3,))`` are not supported yet, sorry
> 
> What is the best workaround for this, given that I have a lot of existing
>  code based on the assumption that each record has a field called y that
>  contains a vector whose items have fields ("a", "b")?

Yes, multidimensional tables are not supported yet (although multidimensional 
columns are).  I plan to implement column-wise tables in the near future, and 
I think that multidimensional tables would fit well within these new beasts.
 
> One option is to store each record as a plain float array, keeping the
>  dtype as an extra attribute and converting every time I put data in or
>  out.
> 
> ¤¤¤ xf = x.view(float).reshape(len(x), -1)
> ¤¤¤ xf
> array([[  0.,   1.,   2.,   3.,   4.,   5.],
>        [  6.,   7.,   8.,   9.,  10.,  11.]])
> 
> Another is to go from a record that has a field that
> ...contains a vector whose items have fields ("a", "b")
> to
> ...has fields ("a", "b") whose items are vectors
> 
> This can be stored in PyTables. However, the data in each record will be
> transposed (all time-points for each field will be contiguous).

Well, I think this is your best bet with PyTables for the moment.  If this is
not enough for you, you may want to try h5py, which, if I recall correctly,
supports multidimensional tables.
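
For reference, here is a minimal sketch of the transposed layout discussed
above, using NumPy only. The helper names to_storable and from_storable and
the dtype variable names are hypothetical, chosen just for illustration. The
idea is that the converted array (fields "a" and "b", each a vector of
length 3) is what would be handed to createTable, and the original nested
view is rebuilt after reading.

```python
import numpy as np

# Nested layout from the original post: field "y" holds 3 items,
# each with fields "a" and "b".
nested_dtype = np.dtype([("y", [("a", float), ("b", float)], 3)])

# Transposed layout: fields "a" and "b", each a vector of length 3.
# (Multidimensional columns of a plain type.)
storable_dtype = np.dtype([("a", float, 3), ("b", float, 3)])


def to_storable(x):
    """Copy a nested array into the transposed layout (hypothetical helper)."""
    out = np.empty(len(x), dtype=storable_dtype)
    for name in storable_dtype.names:
        out[name] = x["y"][name]
    return out


def from_storable(t):
    """Rebuild the nested layout from the transposed one (hypothetical helper)."""
    out = np.empty(len(t), dtype=nested_dtype)
    for name in storable_dtype.names:
        out["y"][name] = t[name]
    return out


x = np.arange(12.0).view(nested_dtype)   # two records, as in the example
t = to_storable(x)                       # this is the array one would store
x_back = from_storable(t).view(np.recarray)
```

After the round trip, x_back.y.a works as in the existing code, at the cost
of one copy in each direction.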

Cheers,

-- 
Francesc Alted
