Re: [Pytables-users] Table.where and conditions across tables

Alvaro Tejero Cantero Tue, 27 Mar 2012 00:21:56 -0700

>> (but how to grow it in columns without deleting & recreating?)
>
> You can't (at least not in a cheap way).  You may want to create
> additional tables and group them in terms of the columns you are
> going to need for your queries.


Sorry, it is not clear to me: create new tables and group them (as in
HDF5 Groups?) in terms of the columns?
As far as I understand, only columns in the same table (regardless of
the table's group) can be queried together with the in-kernel engine?

>> Are there alternatives?
>
> Yes.  The alternative would be to have column-wise tables, that would
> allow you to add and remove columns at a cost of almost zero.  This
> idea of column-wise tables is quite flexible, and would let you have
> even variable-length columns, as well as computed columns (that is, data
> that is generated from other columns).  These
> will have a lot of applications, IMO.  I would like to add this proposal
> to our next round of applications for projects to improve PyTables.
> Let's see how it goes.
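
To make the column-wise idea concrete, here is a minimal pure-Python sketch. It is not PyTables code: the `ColumnStore` class, its methods, and the sample values are all hypothetical and only illustrate why adding, removing, or deriving a column is cheap when each column is stored independently.

```python
class ColumnStore:
    """Toy column-wise table (hypothetical, not a PyTables class)."""

    def __init__(self):
        self.columns = {}  # column name -> list of values

    def __len__(self):
        # Row count is the length of any column (all are kept equal).
        return len(next(iter(self.columns.values()), []))

    def add_column(self, name, values):
        # Adding a column never rewrites the existing ones.
        values = list(values)
        if self.columns and len(values) != len(self):
            raise ValueError("column %r has %d rows, expected %d"
                             % (name, len(values), len(self)))
        self.columns[name] = values

    def remove_column(self, name):
        # Removing a column only drops that one entry.
        del self.columns[name]

    def add_computed(self, name, func, *sources):
        # A computed column is derived row by row from other columns.
        cols = [self.columns[s] for s in sources]
        self.add_column(name, [func(*row) for row in zip(*cols)])


store = ColumnStore()
store.add_column("V01", [1, 5, 3, 2])
store.add_column("V02", [20, 10, 30, 19])
store.add_computed("total", lambda a, b: a + b, "V01", "V02")
store.remove_column("V02")
```

In a row-wise table the same operations would require rewriting every row, which is the cost the quoted reply refers to.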

This sounds very interesting. But I also see value in PyTables being
able to query columns from different tables in-kernel, because it
removes one big constraint on data layout (and this is in turn
important because attr dictionaries can only be attached to whole
tables AFAIK). The solution I suggest is that whenever columns from
other tables are involved, the in-kernel engine loops over the zip of
the columns. It could pre-check the column lengths before starting.
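
A minimal pure-Python sketch of this suggestion follows. It is not how the in-kernel engine works and uses no PyTables API: `where_across` and the sample data are hypothetical, and plain lists stand in for aligned columns from two tables.

```python
def where_across(condition, **columns):
    """Yield row indices where condition(**values) is true.

    `columns` maps variable names to equal-length sequences, which may
    come from different tables as long as they are aligned.
    """
    # Pre-check: all columns must have the same number of elements.
    lengths = {name: len(col) for name, col in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError("columns are not aligned: %r" % lengths)

    names = list(columns)
    # Loop over the zip of the columns, one row at a time.
    for i, values in enumerate(zip(*columns.values())):
        if condition(**dict(zip(names, values))):
            yield i


# Stand-ins for tet1.cols.V01 and tet2.cols.V02 (made-up values).
tet1_V01 = [1, 5, 3, 2]
tet2_V02 = [20, 10, 30, 19]

hits = list(where_across(lambda a, b: (b > 18) and (a < 4),
                         a=tet1_V01, b=tet2_V02))
```

Because `where_across` is a generator, the length check runs when iteration starts. A real implementation would presumably work on chunks of rows, not single rows, to stay fast.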

This would be quite a useful enhancement for me.

> Francesc
>>
>> -á.
>>
>>
>>
>> On Mon, Mar 26, 2012 at 18:29, Alvaro Tejero Cantero<alv...@minin.es>  wrote:
>>> Hi there,
>>>
>>> I am following advice by Anthony and giving a go at representing
>>> different sensors in my dataset as columns in a Table, or in several
>>> Tables. This is about in-kernel queries.
>>>
>>> The documentation of condvars in Table.where [1] says "condvars should
>>> consist of identifier-like strings pointing to Column (see The Column
>>> class) instances of this table, or to other values (which will be
>>> converted to arrays)".
>>>
>>> Conversion to arrays will likely exhaust the memory and be slow.
>>> Furthermore, when I tried with a toy example (naively extrapolating
>>> the behaviour of indexing in numpy), I obtained
>>>
>>> In [109]: valuesext = [x['V01'] for x in tet1.where("""(b>18)&
>>> (a<4)""", condvars={'a':tet1.cols.V01,'b':tet2.cols.V02})]
>>>
>>> (... elided output)
>>> ValueError: variable ``b`` refers to a column which is not part of
>>> table ``/tetrode1
>>>
>>> I am interested in the scenario where an in-kernel query is applied to
>>> a table based on columns *from other tables* that are still aligned
>>> with the current table (same number of elements). These conditions may
>>> be sophisticated and mix columns from the local table as well.
>>>
>>> One obvious solution would be to put all aligned columns in the same
>>> table. But adding columns to a table is cumbersome, and I cannot
>>> anticipate the many precomputed columns that I would like to use as
>>> query conditions.
>>>
>>> What do you recommend in this scenario?
>>>
>>> -á.
>>>
>>> [1] http://pytables.github.com/usersguide/libref.html?highlight=vlstring#tables.Table.where
>> ------------------------------------------------------------------------------
>> This SF email is sponsosred by:
>> Try Windows Azure free for 90 days Click Here
>> http://p.sf.net/sfu/sfd2d-msazure
>> _______________________________________________
>> Pytables-users mailing list
>> Pytables-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/pytables-users
>
>
> --
> Francesc Alted
