[Pytables-users] Chunk selection for optimized data access

Andreas Hilboll Mon, 03 Jun 2013 05:44:34 -0700

Hi,

I'm storing large datasets (5760 x 2880 x ~150) in a compressed EArray
(the last dimension represents time, and once per month there'll be one
more 5760x2880 array to add to the end).


Now, extracting the time series at a single (i, j) location is slow; e.g., for
four locations, it takes several seconds:

   In [19]: idx = ((5000, 600, 800, 900), (1000, 2000, 500, 1))

   In [20]: %time AA = np.vstack([_a[i,j] for i,j in zip(*idx)])
   CPU times: user 4.31 s, sys: 0.07 s, total: 4.38 s
   Wall time: 7.17 s

I have the feeling that this performance could be improved, but I'm not
sure about how to properly use the `chunkshape` parameter in my case.
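
To make the question more concrete, here is a rough sketch of how I understand
the cost of one such read to depend on the chunk shape. HDF5 reads and
decompresses whole chunks, so reading `_a[i, j]` touches every chunk that the
time axis crosses at that location. The helper `timeseries_read_cost`, the
4-byte item size, and the three chunk shapes are assumptions made up for
illustration; they are not my actual settings or anything PyTables provides.

```python
import math


def timeseries_read_cost(n_time, chunkshape, itemsize=4):
    """Estimate (chunks touched, uncompressed bytes) for reading a[i, j, :].

    Hypothetical helper: assumes a 3-D array chunked as (ci, cj, ct) and
    that every touched chunk is read and decompressed in full.
    """
    ci, cj, ct = chunkshape
    n_chunks = math.ceil(n_time / ct)          # chunks crossed along the time axis
    bytes_per_chunk = ci * cj * ct * itemsize  # a chunk is always handled whole
    return n_chunks, n_chunks * bytes_per_chunk


# Made-up chunk shapes for comparison, with ~150 time steps as in my data.
for chunkshape in [(5760, 2880, 1), (64, 64, 1), (16, 16, 150)]:
    n_chunks, n_bytes = timeseries_read_cost(150, chunkshape)
    print(chunkshape, n_chunks, "chunks,", round(n_bytes / 2**20, 2), "MiB")
```

If this reasoning is right, chunks that are small in the two spatial dimensions
and long in time should help time-series reads, presumably at the price of
slower monthly appends, since each new 5760x2880 slice would then be spread
over many chunks.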

Any help is greatly appreciated :)

Cheers, Andreas.

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
