On 03.06.2013 14:43, Andreas Hilboll wrote:
> Hi,
>
> I'm storing large datasets (5760 x 2880 x ~150) in a compressed EArray
> (the last dimension represents time, and once per month there'll be one
> more 5760x2880 array to add to the end).
>
> Now, extracting timeseries at one index location is slow; e.g., for four
> indices, it takes several seconds:
>
> In [19]: idx = ((5000, 600, 800, 900), (1000, 2000, 500, 1))
>
> In [20]: %time AA = np.vstack([_a[i,j] for i,j in zip(*idx)])
> CPU times: user 4.31 s, sys: 0.07 s, total: 4.38 s
> Wall time: 7.17 s
>
> I have the feeling that this performance could be improved, but I'm not
> sure about how to properly use the `chunkshape` parameter in my case.
>
> Any help is greatly appreciated :)
>
> Cheers, Andreas.
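
To make the `chunkshape` question concrete, here is a rough back-of-the-envelope sketch of why the chunk layout might matter for this access pattern. HDF5 reads and decompresses whole chunks, so the cost of pulling one time series `a[i, j, :]` should scale with the number of chunks crossed along the time axis and with the size of each chunk. The two candidate chunkshapes, the 4-byte item size, and the helper name `timeseries_read_cost` are assumptions for illustration only; they are not the actual settings of the file discussed above.

```python
from math import ceil, prod


def timeseries_read_cost(chunkshape, n_time, itemsize=4):
    """Estimate (chunks_read, bytes_decompressed) for one full time
    series a[i, j, :] from a chunked 3-D array.

    itemsize=4 is an assumption (e.g. float32 data).
    """
    # A single (i, j) location lies in exactly one chunk along each
    # spatial axis, so only the chunk length along time decides how
    # many chunks have to be read.
    chunks_read = ceil(n_time / chunkshape[2])
    # Every chunk touched is read and decompressed in full.
    bytes_decompressed = chunks_read * prod(chunkshape) * itemsize
    return chunks_read, bytes_decompressed


N_TIME = 150  # approximate length of the time axis from the post

# Hypothetical chunkshapes, chosen only to contrast the two extremes.
candidates = {
    "one time step per chunk": (256, 256, 1),
    "long along time": (16, 16, 150),
}

for label, chunkshape in candidates.items():
    n_chunks, n_bytes = timeseries_read_cost(chunkshape, N_TIME)
    print(f"{label:>24}: {n_chunks:4d} chunks, "
          f"{n_bytes / 2**20:6.2f} MiB decompressed")
```

If this estimate is roughly right, a chunkshape that is long along the time axis would favour time-series reads, presumably at the price of slower monthly appends, since each new 5760x2880 slice would then touch every chunk. In PyTables the layout is chosen through the `chunkshape` argument when the EArray is created (`create_earray`, or `createEArray` in the 2.x API).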
PS: If I could get significant performance gains by not using an EArray and therefore re-creating the whole database each month, then this would also be an option.

-- Andreas.

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users