And for the record: yes, it should be much faster than 4 seconds.

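>>> import time
>>> import numpy as np
>>> # in-memory array with the same shape as the dataset (needs roughly 10 GB of RAM)
>>> foo = np.empty([5760, 2880, 150], dtype=np.float32)
>>> idx = ((5000, 600, 800, 900), (1000, 2000, 500, 1))
>>> t0 = time.time(); bar = np.vstack([foo[i, j] for i, j in zip(*idx)]); t1 = time.time(); print(t1 - t0)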
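
A rough way to see why `chunkshape` matters for this access pattern: HDF5 reads and decompresses whole chunks, so the cost of pulling one time series at a fixed (i, j) depends on how many chunks it crosses and how big each one is. The sketch below is only back-of-the-envelope arithmetic, not a benchmark. The helper `timeseries_read_cost` and both chunk shapes are hypothetical and chosen for illustration; the chunkshape actually used in the file is not given in this thread.

```python
from math import ceil

ITEMSIZE = 4  # bytes per float32 value, matching the array above


def timeseries_read_cost(n_time, chunkshape, itemsize=ITEMSIZE):
    """Estimate the cost of reading one full time series at a fixed (i, j).

    Returns (chunks_read, bytes_decompressed), assuming every chunk the
    series crosses has to be read and decompressed in full.
    """
    ci, cj, ct = chunkshape
    chunks_read = ceil(n_time / ct)            # chunks crossed along the time axis
    bytes_per_chunk = ci * cj * ct * itemsize  # uncompressed size of one chunk
    return chunks_read, chunks_read * bytes_per_chunk


# Two hypothetical layouts for a (5760, 2880, 150) dataset.
layouts = [
    ("map-oriented", (360, 360, 1)),     # one time step per chunk
    ("time-oriented", (16, 16, 150)),    # whole time axis in one chunk
]
for label, shape in layouts:
    n_chunks, n_bytes = timeseries_read_cost(150, shape)
    print(label, shape, n_chunks, "chunks,", n_bytes, "bytes decompressed")
```

Under these assumptions a chunk that is long along the time axis and small in the two spatial dimensions touches far less data per point query. The trade-off is that appending a new 5760x2880 slice each month then has to touch many more chunks.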

On Jun 03, 2013, at 10:45 PM, Andreas Hilboll wrote:

On 03.06.2013 14:43, Andreas Hilboll wrote:
> Hi,
> I'm storing large datasets (5760 x 2880 x ~150) in a compressed EArray
> (the last dimension represents time, and once per month there'll be one
> more 5760x2880 array to add to the end).
> Now, extracting timeseries at one index location is slow; e.g., for four
> indices, it takes several seconds:
> In [19]: idx = ((5000, 600, 800, 900), (1000, 2000, 500, 1))
> In [20]: %time AA = np.vstack([_a[i,j] for i,j in zip(*idx)])
> CPU times: user 4.31 s, sys: 0.07 s, total: 4.38 s
> Wall time: 7.17 s
> I have the feeling that this performance could be improved, but I'm not
> sure about how to properly use the `chunkshape` parameter in my case.
> Any help is greatly appreciated :)
> Cheers, Andreas.

PS: If I could get significant performance gains by not using an EArray
and therefore re-creating the whole database each month, then this would
also be an option.

-- Andreas.

Pytables-users mailing list