2009/10/30 Stephen Simmons <m...@stevesimmons.com>:
> I should clarify what I meant......
>
> Suppose I have a recarray with 50 fields and want to read just one of
> those fields. PyTables/HDF will read in the compressed data for chunks
> of complete rows, decompress the full 50 fields, and then give me back
> the data for just one field.
>
> I'm after a solution where asking for a single field reads in the bytes
> for just that field from disk and decompresses it.
>
> This is similar to the difference between databases storing their data
> as rows or columns. See for example Mike Stonebraker's C-store
> column-oriented database (http://db.lcs.mit.edu/projects/cstore/vldb.pdf).

Is there any reason not to simply store the data as a collection of
separate arrays, one per column? It shouldn't be too hard to write a
wrapper to give this nicer syntax, while implementing it under the hood
with HDF5...

Anne

> Stephen
>
> Francesc Alted wrote:
>> On Friday 30 October 2009 14:18:05 Stephen Simmons wrote:
>>
>>> - PyTables (HDF using chunked storage for recarrays with LZO
>>>   compression and shuffle filter)
>>> - can't extract individual field from a recarray
>>
>> Er... Have you tried the ``cols`` accessor?
>>
>> http://www.pytables.org/docs/manual/ch04.html#ColsClassDescr
>>
>> Cheers,
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
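
To make the one-array-per-column suggestion concrete, here is a minimal sketch of such a wrapper. It is only an illustration under stated assumptions: the ColumnStore class and its method names are hypothetical, and it uses NumPy's compressed .npz archive as a stand-in for HDF5, since np.load on an .npz file reads and decompresses only the member that is asked for. The same layout could be built on HDF5 by storing one compressed array per field.

```python
import os
import tempfile

import numpy as np


class ColumnStore:
    """Hypothetical wrapper: one compressed array per field of a record array."""

    def __init__(self, path):
        self.path = path

    @classmethod
    def save(cls, path, recarray):
        # Split the record array into one contiguous array per field, so that
        # each field is compressed and stored on its own.
        # Assumption: no field is called "file", which would clash with the
        # first argument of np.savez_compressed.
        columns = {name: np.ascontiguousarray(recarray[name])
                   for name in recarray.dtype.names}
        np.savez_compressed(path, **columns)
        return cls(path)

    @property
    def names(self):
        with np.load(self.path) as archive:
            return list(archive.files)

    def __getitem__(self, name):
        # Loading an .npz archive is lazy: only the requested member is read
        # from disk and decompressed here.
        with np.load(self.path) as archive:
            return archive[name]


# Usage: write a three-field record array, then read back a single field.
data = np.zeros(1000, dtype=[("a", "f8"), ("b", "i4"), ("c", "f8")])
data["b"] = np.arange(1000)

path = os.path.join(tempfile.mkdtemp(), "table.npz")
store = ColumnStore.save(path, data)
b = store["b"]
```

The trade-off is the usual row-versus-column one: reading a single field is cheap, but reassembling complete rows means touching every column.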