Francesc Alted <falted <at> pytables.org> writes:

>
> On Friday 02 May 2008, Glenn wrote:
> > Hello,
> > I would like to use pytables to store the output from a spectrometer.
> > The spectra come in at a rapid rate. I am having trouble
> > understanding how to set up a data structure for the data. The two
> > options that seem reasonable are an EArray and a Table. The example
> > shown for an EArray leaves me wondering how to make an array of
> > numpy 1D array rows that I can dynamically add to.
>
> If all the data you want to save is homogeneous, using an EArray is ok.
> See below an example of use:
>
> N = 10  # your 1D array length
> f = tables.openFile("test.h5", "w")
> e = f.createEArray(f.root, 'earray', tables.FloatAtom(), (0,N), 'test')
> for i in xrange(10):
>     e.append([numpy.random.rand(N)])
> f.close()
>
> > With a Table, I
> > tried setting up an IsDescription subclass but could not figure out
> > how to add a member to again represent a 1D array.
>
> Generally speaking, a Table is best for saving heterogeneous datasets.
> In addition, the I/O is buffered in PyTables space (and not only in
> HDF5) and it is generally faster than using an EArray, so it may be
> more adequate in your case.
>
> Representing a 1D column is as easy as passing a 'shape=(N,)' argument
> to your 1D columns. Look at this example:
>
> N = 10  # your 1D array length
> class TTable(tables.IsDescription):
>     col1 = tables.Int32Col(pos=0)
>     col2 = tables.Float64Col(shape=(N,), pos=1)  # your 1D column
> f = tables.openFile("test.h5", "w")
> t = f.createTable(f.root, 'table', TTable, 'table test')
> for i in xrange(10):
>     t.append([[i, numpy.random.rand(N)]])
> t.flush()
> f.close()
>
> Hope that helps,
>
Thank you for the help, I got it working with a Table now. I have a
couple of new questions:

My table has a column that holds a 1000-element 1D numpy array in each
row. I would like to do the following types of operations, treating this
column as an N x 1000 2D array, call it X:

mean(X, axis=0)
std(X[k].reshape((k, N/k)))

In the mean case, I could imagine doing something like:

m = zeros((1,1000))
for row in X:
    m = m + row
m = m / N

But it seems like this will be slow. I tried just numpy.mean(X) out of
curiosity, but it took forever and finally ran out of memory. I assume
it was forming a copy of the array in memory.

Thanks again for the help!

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
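
One possible way to approach the mean/std question above without loading the whole column into memory is to read the rows in blocks and keep only running sums. The sketch below is not from the thread: the helper `chunked_mean_std` and its `read_block(start, stop)` callable are hypothetical names, and a plain in-memory NumPy array stands in for the table so the example runs on its own. With PyTables, the reader could presumably be built on `Table.read(start, stop, field=...)`, using the `col2` column name from the quoted example.

```python
import numpy as np


def chunked_mean_std(read_block, nrows, chunk=1000):
    """Column-wise mean and std of an (nrows x M) dataset, read in blocks.

    read_block(start, stop) is a hypothetical callable returning rows
    start..stop-1 as a 2D array, so only one block is in memory at a time.
    """
    total = None      # running sum of each column
    total_sq = None   # running sum of squares of each column
    for start in range(0, nrows, chunk):
        stop = min(start + chunk, nrows)
        block = np.asarray(read_block(start, stop), dtype=np.float64)
        if total is None:
            total = np.zeros(block.shape[1])
            total_sq = np.zeros(block.shape[1])
        total += block.sum(axis=0)
        total_sq += (block ** 2).sum(axis=0)
    mean = total / nrows
    # Population variance (ddof=0), the same convention as numpy.std's
    # default; clipped at zero to absorb rounding error.
    var = np.maximum(total_sq / nrows - mean ** 2, 0.0)
    return mean, np.sqrt(var)


# Stand-in data. With PyTables the reader might instead look like
#   lambda a, b: table.read(a, b, field='col2')
# (an assumption based on the column name in the quoted example).
X = np.random.rand(2500, 1000)
mean, std = chunked_mean_std(lambda a, b: X[a:b], len(X), chunk=400)
```

The sum-of-squares formula is simple but can lose precision when the mean is large relative to the spread of the data; a Welford-style update would be a more robust choice in that case. The block size trades memory for the number of reads, so it is worth tuning against the actual row size.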