On Thursday 03 February 2011 13:35:55 Bartosz Telenczuk wrote:
> Is there a way to iterate over chunks, or should I just check
> chunkshape and adjust indices appropriately?

Well, I'd say that, in general, the suggested approach should work for 
most arrays.  For example, if the arrays are 3-d instead of 2-d, the 
iterator will return 2-d slices, but that's all.  For 1-d arrays the 
approach above will still work, but too slowly, so you must pick a 
slice size that works efficiently.

Choosing the slice size should not be difficult: just pick something 
that is neither too large nor too small (anything between 1 MB and 
10 MB should do fine).  The only thing to keep in mind is that your 
slices should not exceed your available memory.  PyTables will 
automatically determine an adequate HDF5 chunk size for your on-disk 
datasets.
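
As a rough illustration of picking a slice size by bytes, here is a 
minimal sketch.  The helper name `iter_row_blocks` and the 4 MB default 
are my own choices for the example, not part of PyTables.  It only 
assumes that the array has `shape` and `dtype` attributes and supports 
slicing along its first axis, so a plain NumPy array stands in for the 
on-disk array here.

```python
import numpy as np

def iter_row_blocks(arr, target_bytes=4 * 1024 * 1024):
    """Yield (start, block) pairs covering arr along its first axis.

    Hypothetical helper: each block holds roughly target_bytes of data.
    """
    # Bytes taken by one "row" (one index along the first axis).
    row_bytes = arr.dtype.itemsize * int(np.prod(arr.shape[1:], dtype=np.int64))
    # Always read at least one row, even if a single row exceeds the target.
    rows_per_block = max(1, target_bytes // max(1, row_bytes))
    for start in range(0, arr.shape[0], rows_per_block):
        yield start, arr[start:start + rows_per_block]

# Stand-in data; with PyTables this would be the on-disk array object.
data = np.arange(1000 * 8, dtype=np.float64).reshape(1000, 8)

total = 0.0
for start, block in iter_row_blocks(data, target_bytes=16 * 1024):
    total += block.sum()  # any per-block computation goes here
```

For a 1-d array the same helper reads `target_bytes // itemsize` 
elements at a time instead of one element per iteration.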

> Right. I think that I could avoid indexing by using something like
> this:
> 
> i, = np.nonzero(a>T)
> 
> which returns a list of indices. Since this list should be much
> shorter than "a", I could use fancy indexing in numpy to extract the
> threshold crossings from "i":
> 
> j, = np.nonzero(np.diff(i)>1)
> crossings = i[j]
> 
> However, in order to do that one needs nonzero function in
> tables.Expr. Is it possible to implement it?
> 
> Another, simpler solution combines nonzero, diff function and type
> conversion in tables.Expr expression like this:
> 
> crossing, = np.nonzero(np.diff((a>T)*1)>0)
> 
> Would it be possible to implement it either way?

Uh, no.  tables.Expr only supports simple element-wise operations whose 
output has the same shape as the operands (so `nonzero` is not 
supported).  Also, it cannot carry out operations that use different 
indices in the operands to compute a given element (so `diff` is not 
supported either).  Rather, you need to think of tables.Expr (and 
numexpr in general) as a virtual machine that only accepts vectors 
(matrices) and can perform operations only among elements in the same 
positions (much like a SIMD processor).
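
One possible workaround, sketched here under my own assumptions, is to 
keep only the element-wise comparison `a > T` as the per-element step 
and run `diff` and `nonzero` in NumPy on each in-memory block.  The 
function name `upward_crossings` and the `block_len` default are 
hypothetical, and a NumPy array stands in for the on-disk array; the 
code only needs `len()` and slicing.  The last sample of each block is 
carried over so that crossings at block boundaries are not lost.

```python
import numpy as np

def upward_crossings(a, T, block_len=1_000_000):
    """Indices i where a[i] <= T and a[i + 1] > T, computed block by block."""
    found = []
    prev_above = None  # was the last sample of the previous block above T?
    for start in range(0, len(a), block_len):
        # Element-wise step: output has the same shape as the block.
        above = np.asarray(a[start:start + block_len]) > T
        if prev_above is not None:
            # Prepend the carried sample so a crossing at the block edge is seen.
            above = np.concatenate(([prev_above], above))
            offset = start - 1
        else:
            offset = start
        # diff mixes neighbouring positions and nonzero changes the shape,
        # so both are done in NumPy on the in-memory block.
        idx = np.nonzero(np.diff(above.astype(np.int8)) > 0)[0]
        found.append(idx + offset)
        prev_above = above[-1]
    return np.concatenate(found) if found else np.empty(0, dtype=np.intp)

a = np.array([0.0, 2.0, 0.0, 0.0, 3.0, 3.0, 0.0, 5.0])
crossings = upward_crossings(a, T=1.0, block_len=3)
```

On data that fits in memory this should give the same indices as the 
quoted one-liner `np.nonzero(np.diff((a>T)*1)>0)`.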

-- 
Francesc Alted
