Hey Bartosz,

On Wednesday 02 February 2011 09:37:14, Bartosz Telenczuk wrote:
> Hi,
>
> I am trying to implement efficient out-of-memory computations on
> large arrays. I have two questions:
>
> 1) My data is stored in binary files, which I read using
> numpy.memmap. Is there a way to efficiently copy from memmap to
> CArray without reading all data into memory first? I suppose I could
> iterate over chunks, but then I would need to optimize the
> chunksizes.

Yes, just try loading data in chunks. For example, let's say that your
array is bidimensional; I think something like this should work:

    carray = h5file.createCArray(...)
    for i, row in enumerate(your_memmap_array):
        carray[i] = row

Maybe using EArrays would be slightly simpler:

    earray = h5file.createEArray(...)
    for row in your_memmap_array:
        earray.append([row])  # append() wants the enlargeable dimension too

For other dimensions you have to find an appropriate chunk size, but
the example above illustrates the idea.
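Row-by-row copies work, but for big arrays it is usually faster to copy
a block of rows at a time, so only one block lives in memory at once.
A self-contained sketch of that, assuming the PyTables 2.x API
(openFile/createCArray); the file names, dtype, shape and block size
are just placeholders:

    import numpy as np
    import tables

    # Placeholders: adapt file name, dtype and shape to your data.
    shape = (1000000, 64)
    mm = np.memmap('data.bin', dtype=np.float64, mode='r', shape=shape)

    h5file = tables.openFile('data.h5', mode='w')
    carray = h5file.createCArray(h5file.root, 'data',
                                 tables.Float64Atom(), shape)

    blocksize = 10000  # rows per block; worth tuning, but not critical
    for start in range(0, shape[0], blocksize):
        stop = min(start + blocksize, shape[0])
        carray[start:stop] = mm[start:stop]  # one block in memory at a time

    h5file.close()

Note that HDF5 does its own chunked I/O underneath, so the block size
here mostly trades memory use against Python-level loop overhead.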
> 2) In the data I want to find threshold crossings. In numpy I usually
> do it using the nonzero function:
>
> import numpy as np
> a = np.random.randn(100)
> T = 0
> i, = np.nonzero((a[:-1]<T) & (a[1:]>T))
>
> How can I implement it with tables.Expr?

You can't. tables.Expr only supports expressions as in numexpr, and
unfortunately, this does not include indexing variables in the middle
of expressions (as in your example).

Hope this helps!

-- 
Francesc Alted
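PS: If you need those crossings anyway, a chunk-wise scan with plain
NumPy (not tables.Expr) works fine out of core. A rough sketch -- the
node name, threshold and chunk size are placeholders, and note the
one-sample overlap so crossings straddling chunk borders are not
missed:

    import numpy as np
    import tables

    h5file = tables.openFile('data.h5', mode='r')
    a = h5file.root.data   # 1-D CArray/EArray holding the signal
    T = 0                  # threshold
    chunksize = 100000

    crossings = []
    for start in range(0, a.nrows - 1, chunksize):
        # read one extra sample so pairs across the chunk border are seen
        chunk = a[start:start + chunksize + 1]
        idx, = np.nonzero((chunk[:-1] < T) & (chunk[1:] > T))
        crossings.extend(idx + start)

    h5file.close()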