Re: [Pytables-users] Table.where and conditions across tables

Francesc Alted Mon, 26 Mar 2012 15:57:50 -0700

Hi Alvaro,

On 3/26/12 12:43 PM, Alvaro Tejero Cantero wrote:
> Would it be an option to have
>
> * raw data on one table
> * all imaginable columns used for query conditions in another table


Yes, that sounds like a good solution to me.

> (but how to grow it in columns without deleting & recreating?)

You can't (at least not cheaply).  You may want to create 
additional tables instead, grouping them by the columns you are 
going to need for your queries.

>
> and fetch indexes for the first based on .getWhereList(condition) of the second?

Exactly.
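
A minimal sketch of that two-table pattern, under some assumptions: it uses the PyTables 3.x method names (get_where_list and read_coordinates; in the 2.x series these were getWhereList and readCoordinates), and the file name, the table names "raw" and "cond", and the toy data are made up for illustration. Only the condition string and the V01 column name come from the example quoted further down.

```python
import os
import tempfile

import numpy as np
import tables

n = 100

# Hypothetical raw data: one row per sample.
raw_rows = np.zeros(n, dtype=[("V01", "f8")])
raw_rows["V01"] = np.arange(n) * 0.5

# Hypothetical condition columns, row-aligned with the raw table.
cond_rows = np.zeros(n, dtype=[("a", "i4"), ("b", "i4")])
cond_rows["a"] = np.arange(n) % 10
cond_rows["b"] = np.arange(n) % 25

path = os.path.join(tempfile.mkdtemp(), "aligned.h5")
with tables.open_file(path, "w") as h5:
    raw = h5.create_table("/", "raw", description=raw_rows.dtype)
    cond = h5.create_table("/", "cond", description=cond_rows.dtype)
    raw.append(raw_rows)
    cond.append(cond_rows)

    # In-kernel query on the conditions table: returns row numbers only.
    coords = cond.get_where_list("(b > 18) & (a < 4)", sort=True)

    # Use those row numbers to pull the matching rows from the raw table.
    values = raw.read_coordinates(coords, field="V01")
```

This only works as long as both tables keep the same number of rows in the same order; nothing in PyTables enforces that alignment, so it has to be maintained by the code that appends to them.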

> Are there alternatives?

Yes.  The alternative would be to have column-wise tables, which would 
allow you to add and remove columns at almost zero cost.  This 
idea of column-wise tables is quite flexible, and would let you have 
even variable-length columns, as well as computed columns (that is, data 
that is generated from other columns).  These 
will have a lot of applications, IMO.  I would like to add this proposal 
to our next round of applications for projects to improve PyTables.  
Let's see how it goes.
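
Until something like that exists, a column-wise layout can be approximated by hand. The sketch below is not the proposed feature, just an emulation under stated assumptions: each column is stored as its own Array node inside a group, and tables.Expr evaluates the condition across them. The group name "columns", the derived column "V01_squared" and the toy data are hypothetical; the method names are those of PyTables 3.x.

```python
import os
import tempfile

import numpy as np
import tables

n = 100
path = os.path.join(tempfile.mkdtemp(), "columnwise.h5")

with tables.open_file(path, "w") as h5:
    # One group acts as the "table"; every child array is one column.
    cols = h5.create_group("/", "columns")
    h5.create_array(cols, "V01", np.arange(n) % 10)
    h5.create_array(cols, "V02", np.arange(n) % 25)

    # Adding a computed column is just creating another node ...
    h5.create_array(cols, "V01_squared", cols.V01[:] ** 2)
    # ... and dropping it again is just removing that node.
    h5.remove_node(cols, "V01_squared")

    # Evaluate a condition that mixes several of the aligned columns.
    expr = tables.Expr("(b > 18) & (a < 4)",
                       uservars={"a": cols.V01, "b": cols.V02})
    mask = expr.eval()              # boolean NumPy array, one entry per row
    coords = np.flatnonzero(mask)   # row numbers that satisfy the condition

    remaining = sorted(cols._v_children)
```

The result mask is held in memory here, so for very long columns it may be preferable to direct the output to an on-disk array instead. As with separate tables, keeping the columns the same length is up to the caller.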

Francesc
>
> -á.
>
>
>
> On Mon, Mar 26, 2012 at 18:29, Alvaro Tejero Cantero <alv...@minin.es> wrote:
>> Hi there,
>>
>> I am following advice by Anthony and giving a go at representing
>> different sensors in my dataset as columns in a Table, or in several
>> Tables. This is about in-kernel queries.
>>
>> The documentation of condvars in Table.where [1] says "condvars should
>> consist of identifier-like strings pointing to Column (see The Column
>> class) instances of this table, or to other values (which will be
>> converted to arrays)".
>>
>> Conversion to arrays will likely exhaust the memory and be slow.
>> Furthermore, when I tried with a toy example (naively extrapolating
>> the behaviour of indexing in numpy), I obtained
>>
>> In [109]: valuesext = [x['V01'] for x in tet1.where("""(b>18)&
>> (a<4)""", condvars={'a':tet1.cols.V01,'b':tet2.cols.V02})]
>>
>> (... elided output)
>> ValueError: variable ``b`` refers to a column which is not part of
>> table ``/tetrode1
>>
>> I am interested in the scenario where an in-kernel query is applied to
>> a table based on columns *from other tables* that are still aligned
>> with the current table (same number of elements). These conditions may
>> be sophisticated and mix columns from the local table as well.
>>
>> One obvious solution would be to put all aligned columns on the same
>> table. But adding columns to a table is cumbersome, and I cannot think
>> beforehand of the many precomputed columns that I would like to use as
>> query conditions.
>>
>> What do you recommend in this scenario?
>>
>> -á.
>>
>> [1] 
>> http://pytables.github.com/usersguide/libref.html?highlight=vlstring#tables.Table.where
> ------------------------------------------------------------------------------
> This SF email is sponsored by:
> Try Windows Azure free for 90 days Click Here
> http://p.sf.net/sfu/sfd2d-msazure
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users


-- 
Francesc Alted


