Hi Abiel, On Fri, 2009-09-25 at 23:07 -0400, Abiel Reinhart wrote: > I am attempting to store a large number of moderately-sized > variable-length numpy arrays in a PyTables database, where each array > can be referred to by a string key. Looking through the mailing list > archives, it seems that one possible solution to this problem is to > simply create a large number of Array objects.
<snip> Another solution is to create a VLArray (variable-length array). Like this: >>> import tables >>> import numpy as np >>> h5f = tables.openFile('test.h5', 'w') >>> h5f.createVLArray('/', 'test', tables.Int32Atom()) /test (VLArray(0,)) '' atom = Int32Atom(shape=(), dflt=0) byteorder = 'little' nrows = 0 flavor = 'numpy' >>> for i in range(10000): ... a1 = np.arange(np.random.randint(1000, 10000)) ... h5f.root.test.append(a1) On my puny Eee PC (Atom 1.6 Ghz variable), linux, python 2.6 it runs in roughly 7 seconds, while the arrays are recreated throughout the loop and have variable size. So already it is faster than your test. You can reference a particular array with h5f.root.test[idx] where idx can of course be textual, in the sense that h5f.root.test[int(idx)] can be used if idx is '1233'. At least, this was suggested by Francesc when I brought up my own problem on this list. Good luck, David ------------------------------------------------------------------------------ Come build with us! The BlackBerry® Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9-12, 2009. Register now! http://p.sf.net/sfu/devconf _______________________________________________ Pytables-users mailing list Pytables-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/pytables-users