Hi,
Is anyone working on alternative storage options for numpy arrays, and
specifically recarrays? My main application involves processing series
of large recarrays (say 1000 recarrays, each with 5M rows having 50
fields). Existing options meet some but not all of my requirements.
Dag Sverre Seljebotn:
Stephen Simmons wrote:
P.S. Maybe this will be too much work, and I'd be better off sticking
with Pytables.
I can't judge that, but I want to share some thoughts (rant?):
- Are you ready not only to write the code, but to maintain it for years to
come, and work through nasty bugs, and think
Unless I read your request or the documentation wrong, h5py already
supports pulling specific fields out of compound data types:
http://h5py.alfven.org/docs-1.1/guide/hl.html#id3
For compound data, you can specify multiple field names alongside
the numeric slices:
dset["FieldA"]
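
To make the h5py suggestion concrete, here is a minimal sketch of reading a single field from a compound dataset. The file name, dataset name, and the two-field dtype are hypothetical stand-ins for the 50-field recarrays discussed above, and the in-memory "core" driver is used only so that nothing is written to disk. The sketch shows the indexing syntax only; it says nothing about how much data HDF5 decompresses internally to serve the request.

```python
import h5py
import numpy as np

# Hypothetical two-field record type standing in for a wide recarray.
dtype = np.dtype([("FieldA", "i4"), ("FieldB", "f8")])
data = np.zeros(10, dtype=dtype)
data["FieldA"] = np.arange(10)
data["FieldB"] = np.arange(10) * 0.5

# In-memory HDF5 file: the core driver with backing_store=False
# keeps everything in RAM and discards it on close.
with h5py.File("fields_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("records", data=data, chunks=(5,), compression="gzip")

    # A field name on its own returns that field for every row.
    field_a = dset["FieldA"]

    # A field name can be combined with a numeric slice.
    first_b = dset[0:3, "FieldB"]

print(field_a)
print(first_b)
```

Both results are plain NumPy arrays holding only the requested field.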
On Friday 30 October 2009 14:18:05 Stephen Simmons wrote:
- Pytables (HDF using chunked storage for recarrays with LZO
compression and shuffle filter)
- can't extract an individual field from a recarray
Er... Have you tried the ``cols`` accessor?
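
For readers who have not used it, a minimal sketch of the ``cols`` accessor follows. The table name and the two-field dtype are hypothetical, the modern ``open_file``/``create_table`` spellings are assumed (the 2009 releases used ``openFile``/``createTable``), and zlib replaces LZO here only because LZO is an optional dependency. The file lives in memory via the HDF5 core driver.

```python
import numpy as np
import tables

# Hypothetical two-field record type standing in for a wide recarray.
dtype = np.dtype([("price", "f8"), ("qty", "i4")])
data = np.zeros(6, dtype=dtype)
data["price"] = np.arange(6) * 1.5
data["qty"] = np.arange(6)

# Compression plus shuffle filter, as in the setup described above.
filters = tables.Filters(complevel=5, complib="zlib", shuffle=True)

# In-memory file: H5FD_CORE driver with no backing store on disk.
with tables.open_file("cols_demo.h5", "w", driver="H5FD_CORE",
                      driver_core_backing_store=0) as h5:
    # Passing a structured array as the description creates the table
    # with that dtype and loads the array's rows into it.
    table = h5.create_table("/", "records", description=data, filters=filters)

    # The cols accessor returns a single column as a NumPy array.
    qty_all = table.cols.qty[:]

    # Table.read with field= does the same for a row range.
    qty_some = table.read(start=0, stop=3, field="qty")

print(qty_all)
print(qty_some)
```

This only demonstrates the API; whether the other fields are still decompressed behind the scenes is a separate question.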
2009/10/30 Stephen Simmons m...@stevesimmons.com:
I should clarify what I meant.
Suppose I have a recarray with 50 fields and want to read just one of
those fields. PyTables/HDF will read in the compressed data for chunks
of complete rows, decompress the full 50 fields, and then give me
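
The cost being described can be illustrated with a toy model in pure Python. This is not how PyTables or HDF5 are implemented; the chunk size, the row count, and the use of zlib are all assumptions chosen for illustration. It compares how many bytes must be decompressed to read one field when chunks hold complete rows versus when each field is compressed on its own.

```python
import struct
import zlib

N_ROWS, N_FIELDS, CHUNK_ROWS = 1000, 50, 100  # illustrative sizes only

# rows[r][f] is the value of field f in row r.
rows = [[float(r * N_FIELDS + f) for f in range(N_FIELDS)] for r in range(N_ROWS)]
ROW_FMT = "<%dd" % N_FIELDS   # one row: 50 little-endian doubles
COL_FMT = "<%dd" % N_ROWS     # one column: 1000 little-endian doubles

# Row-oriented layout: each compressed chunk holds complete rows.
row_chunks = []
for start in range(0, N_ROWS, CHUNK_ROWS):
    raw = b"".join(struct.pack(ROW_FMT, *row) for row in rows[start:start + CHUNK_ROWS])
    row_chunks.append(zlib.compress(raw))

# Column-oriented layout: each field is compressed separately.
col_chunks = [
    zlib.compress(struct.pack(COL_FMT, *(row[f] for row in rows)))
    for f in range(N_FIELDS)
]

def read_field_rowwise(f):
    """Return (values, bytes decompressed) for field f from row chunks."""
    values, nbytes = [], 0
    for chunk in row_chunks:
        raw = zlib.decompress(chunk)      # all 50 fields come out together
        nbytes += len(raw)
        values.extend(rec[f] for rec in struct.iter_unpack(ROW_FMT, raw))
    return values, nbytes

def read_field_colwise(f):
    """Return (values, bytes decompressed) for field f from its own chunk."""
    raw = zlib.decompress(col_chunks[f])  # only the requested field
    return list(struct.unpack(COL_FMT, raw)), len(raw)

row_vals, row_bytes = read_field_rowwise(7)
col_vals, col_bytes = read_field_colwise(7)
print(row_bytes, col_bytes)
```

In this model both layouts return the same values, but the row-oriented read decompresses every field of every row to get at one of them.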