Re: [Pytables-users] ANN: PyTables 3.0 final

2013-06-03 Thread Seref Arikan
Many thanks for keeping such a great piece of work up and running. I've just seen some features in the release notes that I was going to need in the very near future! Great job! Best regards, Seref Arikan On Sat, Jun 1, 2013 at 12:33 PM, Antonio Valentino

[Pytables-users] Chunk selection for optimized data access

2013-06-03 Thread Andreas Hilboll
Hi, I'm storing large datasets (5760 x 2880 x ~150) in a compressed EArray (the last dimension represents time, and once per month there'll be one more 5760 x 2880 array to add to the end). Now, extracting a timeseries at one index location is slow; e.g., for four indices, it takes several seconds:
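
A minimal sketch of the setup described above, assuming illustrative names ("grid.h5", node "/data") and a float32 atom, none of which are given in the thread:

import numpy as np
import tables as tb

# Illustrative file/node names; float32 is an assumption (the data type is
# asked about later in the thread).
with tb.open_file("grid.h5", mode="w") as h5:
    arr = h5.create_earray(
        h5.root, "data",
        atom=tb.Float32Atom(),
        shape=(5760, 2880, 0),  # time is the extensible last axis
        filters=tb.Filters(complevel=5, complib="zlib"),  # a compressed EArray
    )

    # Once per month, one more 5760 x 2880 field is appended along the time axis.
    month = np.zeros((5760, 2880, 1), dtype=np.float32)
    arr.append(month)

    # Extracting the timeseries at a few (row, col) locations; with an
    # unfavourable chunkshape, this is the access pattern that gets slow.
    points = [(5000, 1000), (600, 2000), (800, 500), (900, 1)]
    ts = np.vstack([arr[i, j, :] for i, j in points])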

Re: [Pytables-users] Chunk selection for optimized data access

2013-06-03 Thread Andreas Hilboll
On 03.06.2013 14:43, Andreas Hilboll wrote: Hi, I'm storing large datasets (5760 x 2880 x ~150) in a compressed EArray (the last dimension represents time, and once per month there'll be one more 5760 x 2880 array to add to the end). Now, extracting a timeseries at one index location is slow;

Re: [Pytables-users] Chunk selection for optimized data access

2013-06-03 Thread Anthony Scopatz
Hi Andreas, First off, nothing should be this bad, but what is the data type of the array? Also, are you selecting the chunksize manually or letting PyTables figure it out? Here are some things that you can try: 1. Query with fancy indexing, once. That is, rather than using a list
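
A sketch of the "query once" idea, reusing the illustrative file from the sketch above; it relies on PyTables' list-based selection (one list per selection, here along the first axis), and whether it actually helps depends on the chunk layout:

import numpy as np
import tables as tb

# Points sorted by row so the on-disk selection stays monotonic.
points = sorted([(5000, 1000), (600, 2000), (800, 500), (900, 1)])
rows = [i for i, _ in points]
cols = [j for _, j in points]

with tb.open_file("grid.h5", mode="r") as h5:  # illustrative file from above
    arr = h5.root.data

    # Many small queries: one read per point.
    ts_loop = np.vstack([arr[i, j, :] for i, j in zip(rows, cols)])

    # One query for the needed rows (a single list selection along the first
    # axis), then finish the per-point pick in memory with NumPy.
    block = arr[rows]                               # (len(rows), 2880, ntime)
    ts_once = block[np.arange(len(rows)), cols, :]

    assert np.array_equal(ts_loop, ts_once)

The single read trades memory (a few full rows held in RAM) for fewer, larger I/O requests.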

[Pytables-users] Anyone want to present at PyData Boston, July 27-28th

2013-06-03 Thread Anthony Scopatz
Hey everyone, Leah Silen (CC'd) of NumFOCUS was wondering if anyone wanted to give a talk or tutorial about PyTables at PyData Boston [1]. I don't think that I'll be able to make it, but I highly encourage others to take her up on this. This sort of thing shouldn't be too hard to put together

Re: [Pytables-users] Chunk selection for optimized data access

2013-06-03 Thread Tim Burgess
My thoughts are: - try it without any compression. Assuming 32-bit floats, your monthly 5760 x 2880 array is only about 65 MB. Uncompressed data may perform well, and at the least it will give you a baseline to work from - and will help if you are investigating IO tuning. - I have found with CArray that the
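
A sketch of that baseline, with the size arithmetic spelled out; the chunkshape below is an illustrative choice (keep the whole time axis of a small spatial tile in one chunk), not something prescribed in the thread:

import tables as tb

# One monthly 5760 x 2880 float32 field is 5760 * 2880 * 4 bytes = 66,355,200
# bytes, the "about 65 MB" mentioned above.
with tb.open_file("baseline.h5", mode="w") as h5:
    arr = h5.create_carray(
        h5.root, "data",
        atom=tb.Float32Atom(),
        shape=(5760, 2880, 150),
        chunkshape=(16, 16, 150),         # full time axis of a 16 x 16 tile per chunk
        filters=tb.Filters(complevel=0),  # no compression: the baseline
    )
    # A single-point timeseries read now touches one ~150 KB chunk
    # (16 * 16 * 150 * 4 bytes = 153,600 bytes).
    ts = arr[1234, 567, :]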

Re: [Pytables-users] Chunk selection for optimized data access

2013-06-03 Thread Anthony Scopatz
Oops! I forgot to mention CArray! On Mon, Jun 3, 2013 at 10:35 PM, Tim Burgess timburg...@mac.com wrote: My thoughts are: - try it without any compression. Assuming 32-bit floats, your monthly 5760 x 2880 array is only about 65 MB. Uncompressed data may perform well and at the least it will give

Re: [Pytables-users] Chunk selection for optimized data access

2013-06-03 Thread Tim Burgess
And for the record... yes, it should be much faster than 4 seconds.

import numpy as np
import time

foo = np.empty([5760, 2880, 150], dtype=np.float32)
idx = ((5000, 600, 800, 900), (1000, 2000, 500, 1))
t0 = time.time()
bar = np.vstack([foo[i, j] for i, j in zip(*idx)])
t1 = time.time()
print(t1 - t0)
0.000144004821777

On Jun 03, 2013,