Francesc Alted <falted <at> pytables.org> writes:
>
> On Friday 02 May 2008, Glenn wrote:
> > Hello,
> > I would like to use pytables to store the output from a spectrometer.
> > The spectra come in at a rapid rate. I am having trouble
> > understanding how to set up a data structure for the data. The two
> > options that seem reasonable are an EArray and a Table. The example
> > shown for an EArray leaves me wondering how to make an array of
> > numpy 1D array rows that I can dynamically add to.
>
> If all the data you want to save is homogeneous, using an EArray is ok.
> See below an example of use:
>
> import numpy
> import tables
>
> N = 10  # your 1D array length
> f = tables.openFile("test.h5", "w")
> e = f.createEArray(f.root, 'earray', tables.FloatAtom(), (0,N), 'test')
> for i in xrange(10):
>     e.append([numpy.random.rand(N)])  # append one row of length N
> f.close()
>
> > With a Table, I
> > tried setting up an IsDescription subclass but could not figure out
> > how to add a member to again represent a 1D array.
>
> Generally speaking, a Table is best for saving heterogeneous datasets.
> In addition, the I/O is buffered in PyTables space (and not only in
> HDF5) and it is generally faster than using an EArray, so it may be
> more suitable in your case.
>
> Declaring a column whose cells each hold a 1D array is as easy as
> passing a 'shape=(N,)' argument to that column. Look at this example:
>
> N = 10  # your 1D array length
> class TTable(tables.IsDescription):
>     col1 = tables.Int32Col(pos=0)
>     col2 = tables.Float64Col(shape=(N,), pos=1)  # your 1D column
> f = tables.openFile("test.h5", "w")
> t = f.createTable(f.root, 'table', TTable, 'table test')
> for i in xrange(10):
>     t.append([[i, numpy.random.rand(N)]])
> t.flush()
> f.close()
>
> Hope that helps,
>
Thank you for the help, I got it working with a Table now.
I have a couple of new questions:
My table has a column with a 1000 element 1d numpy array. I would like to do the
following types of operations where I treat this column as a N x 1000 2d array,
call it X:
mean(X,axis=0)
std(X[k].reshape((k, N/k)))
In the mean case, I could imagine doing something like:
m = zeros((1,1000))
for row in X:
    m = m + row
m/N
But it seems like this will be slow. I tried just numpy.mean(X) out of
curiosity, but it took forever and finally ran out of memory. I assume it was
forming a copy of the array in memory.
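To make the running-sum idea concrete, here is a minimal sketch that accumulates sums over blocks of rows instead of single rows, so only one block is in memory at a time. The function name chunked_mean_std and the chunk_rows parameter are made up for illustration, and the demo uses an in-memory numpy array as a stand-in for the table column. It assumes only that the data source supports len() and slicing with [start:stop]; I believe a PyTables column accessor such as t.cols.col2 behaves that way, but I have not tested it against a real table here.

```python
import numpy as np

def chunked_mean_std(column, chunk_rows=1024):
    """Column-wise mean and std of an (n_rows, width) dataset.

    `column` is any object that supports len() and column[start:stop]
    returning a 2D block (hypothetical stand-in for an on-disk column).
    Only `chunk_rows` rows are held in memory at once.
    """
    n_rows = len(column)
    if n_rows == 0:
        raise ValueError("cannot take the mean of an empty dataset")
    total = None      # running sum of each of the `width` positions
    total_sq = None   # running sum of squares, used for the std
    for start in range(0, n_rows, chunk_rows):
        block = np.asarray(column[start:start + chunk_rows], dtype=np.float64)
        if total is None:
            total = np.zeros(block.shape[1])
            total_sq = np.zeros(block.shape[1])
        total += block.sum(axis=0)
        total_sq += (block ** 2).sum(axis=0)
    mean = total / n_rows
    # Population variance via E[x^2] - E[x]^2; clip tiny negative round-off.
    var = np.maximum(total_sq / n_rows - mean ** 2, 0.0)
    return mean, np.sqrt(var)

# Demo on an in-memory array standing in for the N x 1000 column.
rng = np.random.RandomState(0)
X = rng.rand(2000, 1000)
mean, std = chunked_mean_std(X, chunk_rows=512)
```

The sum-of-squares formula is the simplest thing that works in one pass, but it can lose precision when the mean is large compared to the spread, so a two-pass or Welford-style update may be a safer choice for real spectra.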
Thanks again for the help!
_______________________________________________
Pytables-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pytables-users