Martin Spacek (on 2007-11-30 at 00:47:41 -0800) said::

> [...]
> I find that if I load the file in two pieces into two arrays, say 1GB
> and 0.3GB respectively, I can avoid the memory error.  So it seems that
> it's not that Windows can't allocate the memory, just that it can't
> allocate enough contiguous memory.  I'm OK with this, but for indexing
> convenience, I'd like to be able to treat the two arrays as if they were
> one.  Specifically, this file is movie data, and the array I'd like to
> get out of this is of shape (nframes, height, width).
> [...]
Well, one thing you could do is dump your data into a PyTables_
``CArray`` dataset, which you may afterwards access as if it were a
NumPy array to get slices which are actually NumPy arrays.  PyTables
has no problem working with datasets exceeding memory size.  For
instance::

    import tables

    # Total rows across all partial arrays.
    TOTAL_NROWS = sum(len(array) for array in your_list_of_partial_arrays)

    h5f = tables.openFile('foo.h5', 'w')
    carray = h5f.createCArray(
        '/', 'bar', atom=tables.UInt8Atom(), shape=(TOTAL_NROWS, 3))

    # Copy each partial array into its slice of the on-disk array.
    base = 0
    for array in your_list_of_partial_arrays:
        carray[base:base+len(array)] = array
        base += len(array)
    carray.flush()

    # Now you can access ``carray`` as a NumPy array.
    carray[42]     --> a (3,) uint8 NumPy array
    carray[10:20]  --> a (10, 3) uint8 NumPy array
    carray[42, 2]  --> a NumPy uint8 scalar, "width" for row 42

(You may use an ``EArray`` dataset if you want to enlarge it with new
rows afterwards, or a ``Table`` if you want a different type for each
field; see the ``EArray`` sketch below.)

.. _PyTables: http://www.pytables.org/
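If you don't know the total number of rows in advance (say, frames keep
arriving), an ``EArray`` with an extendable first dimension lets you
grow the dataset with ``append()`` instead of precomputing
``TOTAL_NROWS``.  A minimal sketch, using the same old-style PyTables
API as above; ``height``, ``width`` and ``your_list_of_partial_arrays``
are placeholder names for your own values, not anything defined by
PyTables::

    import tables

    h5f = tables.openFile('movie.h5', 'w')
    # A 0 in the first dimension marks it as the enlargeable one.
    earray = h5f.createEArray(
        '/', 'frames', atom=tables.UInt8Atom(), shape=(0, height, width))

    for array in your_list_of_partial_arrays:
        earray.append(array)  # each chunk has shape (n, height, width)
    earray.flush()

    earray[42]     --> a (height, width) uint8 NumPy array, frame 42
    earray[10:20]  --> a (10, height, width) block of frames

Each ``append()`` writes the chunk straight to disk, so the whole movie
never has to fit in memory at once, and slicing the ``EArray``
afterwards returns ordinary NumPy arrays just as with the ``CArray``.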
HTH,

::

    Ivan Vilata i Balaguer   >qo<   http://www.carabos.com/
           Cárabos Coop. V.  V  V   Enjoy Data
                              ""