Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-28 Thread Anthony Scopatz
On Fri, Sep 28, 2012 at 2:46 AM, Francesc Alted wrote:
> On 9/27/12 8:10 PM, Anthony Scopatz wrote:
> >
> > I think I remember seeing there was a performance limit with tables > 255 columns. I can't find a reference to that so it's possible I made
> > it up. However, I was wondering if carr

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-28 Thread Francesc Alted
On 9/27/12 8:10 PM, Anthony Scopatz wrote:
>
> I think I remember seeing there was a performance limit with tables > 255 columns. I can't find a reference to that so it's possible I made
> it up. However, I was wondering if carrays had some limitation like
> that.
>
> Tables are a different

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-27 Thread Anthony Scopatz
On Thu, Sep 27, 2012 at 11:02 AM, Luke Lee wrote:
> Are there any performance issues with relatively large carrays? For
> example, say I have a carray with 300,000 float64s in it. Is there some
> threshold where I could expect performance to degrade or anything?
>
Hello Luke,
The breakdowns h

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-24 Thread Ümit Seren
With CArrays you can only have one specific type for the array (int, float, etc.), whereas with a table each column can have a different type (string, float, etc.). If you want to replicate this with carrays, you would need a separate carray for each type. I think for storing numerical data whe
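The one-type-per-CArray point can be sketched roughly as follows. This is only an illustration, not code from the thread: it assumes the PyTables 3.x snake_case API (`open_file`, `create_group`, `create_carray`), and the group name `columns`, the column names, and both helper functions are hypothetical.

```python
import os
import tempfile

import numpy as np
import tables


def write_columns(path, columns):
    """Store each named 1-D array as its own CArray under /columns."""
    with tables.open_file(path, mode="w") as h5:
        group = h5.create_group("/", "columns")
        for name, data in columns.items():
            # Each CArray takes its single atom type from the array's dtype.
            h5.create_carray(group, name, obj=np.asarray(data))


def read_column(path, name):
    """Read one whole column back as a NumPy array."""
    with tables.open_file(path, mode="r") as h5:
        return h5.get_node("/columns", name).read()


# Hypothetical data: one integer column and one float column.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "columns.h5")
    write_columns(path, {
        "depth": np.arange(5, dtype=np.int64),
        "pressure": np.linspace(0.0, 1.0, 5),
    })
    depth = read_column(path, "depth")
    pressure = read_column(path, "pressure")
```

Each column keeps its own dtype because it lives in its own array, which is the trade-off described above: mixed types are possible, but only by managing several arrays instead of one table.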

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-21 Thread Anthony Scopatz
On Fri, Sep 21, 2012 at 4:55 PM, Francesc Alted wrote:
> On 9/21/12 10:07 PM, Anthony Scopatz wrote:
> > On Fri, Sep 21, 2012 at 10:49 AM, Luke Lee wrote:
> >
> > Hi again,
> >
> > I haven't been getting the updates via email so I'm attempting to
> >

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-21 Thread Francesc Alted
On 9/21/12 10:07 PM, Anthony Scopatz wrote:
> On Fri, Sep 21, 2012 at 10:49 AM, Luke Lee wrote:
>
> Hi again,
>
> I haven't been getting the updates via email so I'm attempting to
> post again to respond.
>
> Thanks everyone for the suggestions. I ha

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-21 Thread Anthony Scopatz
On Fri, Sep 21, 2012 at 10:49 AM, Luke Lee wrote:
> Hi again,
>
> I haven't been getting the updates via email so I'm attempting to post
> again to respond.
>
> Thanks everyone for the suggestions. I have a few questions:
>
> 1. What is the benefit of using the stand-alone carray project (
> ht

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-21 Thread Luke Lee
Hi again,

I haven't been getting the updates via email so I'm attempting to post again to respond.

Thanks everyone for the suggestions. I have a few questions:

1. What is the benefit of using the stand-alone carray project (https://github.com/FrancescAlted/carray) vs Pytables.carray?
2. I re

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-21 Thread Alvaro Tejero Cantero
Hi!

You may want to have a look | reuse | combine your approach with that implemented in pandas (pandas.io.pytables.HDFStore):
https://github.com/pydata/pandas/blob/master/pandas/io/pytables.py (see _write_array method)

A certain liberality in Pandas with dtypes (partly induced by the missing da

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-20 Thread Anthony Scopatz
Luke, I'd also like to mention, that if you don't want to wait for us to implement this we will gladly take contributions ;). If you need help getting started or throughout the process we are also happy to provide that too. Please sign up for PyTables Dev (pytables-...@googlegroups.com) so we mo

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-20 Thread Josh Ayers
Depending on your use case, you may be able to get around this by storing each column in its own table. That will effectively store the data in column-first order. Instead of creating a table, you would create a group, which then contains a separate table for each column. If you want, you can wr
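A rough sketch of the group-of-single-column-tables layout described above might look like this. It is not code from the thread: it assumes the PyTables 3.x API (`open_file`, `create_group`, `create_table` with `obj=`, `Table.col`), and the group name `dataset`, the field name `value`, the column names, and both helpers are hypothetical.

```python
import os
import tempfile

import numpy as np
import tables


def write_column_tables(path, columns):
    """Create one single-column Table per named column under /dataset."""
    with tables.open_file(path, mode="w") as h5:
        group = h5.create_group("/", "dataset")
        for name, data in columns.items():
            values = np.asarray(data)
            # A structured array with one field becomes a one-column Table.
            record = np.empty(len(values), dtype=[("value", values.dtype)])
            record["value"] = values
            h5.create_table(group, name, obj=record)


def read_column_table(path, name):
    """Read an entire column by opening only that column's table."""
    with tables.open_file(path, mode="r") as h5:
        return h5.get_node("/dataset", name).col("value")


# Hypothetical data standing in for two of the columns.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.h5")
    write_column_tables(path, {
        "time": np.arange(4, dtype=np.float64),
        "count": np.array([10, 20, 30, 40], dtype=np.int32),
    })
    time_col = read_column_table(path, "time")
    count_col = read_column_table(path, "count")
```

Reading a column this way touches only that column's table, so the other columns' data never has to be read from disk. The cost is that selecting a full row means reading from every table in the group.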

Re: [Pytables-users] Optimizing pytables for reading entire columns at a time

2012-09-19 Thread Francesc Alted
On 9/19/12 3:37 PM, Luke Lee wrote:
> Hi all,
>
> I'm attempting to optimize my HDF5/pytables application for reading
> entire columns at a time. I was wondering what the best way to go
> about this is.
>
> My HDF5 has the following properties:
>
> - 400,000+ rows
> - 25 columns
> - 147 MB in to