On Nov 17, 2011, at 10:35 PM, Alan Marchiori wrote:
> Hello,
Hi Alan,
> I am attempting to use PyTables (v2.3.1) to store timestamped data and
> things were going well until I added a column index. While the column
> is indexed, no data is returned from a table.where call!
>
> This behavior is demonstrated with the following test code:
> ---begin test.py---
> import tables
> import random
>
> class Descr(tables.IsDescription):
>     when = tables.Time64Col(pos = 1)
>     value = tables.Float32Col(pos = 2)
>
> h5f = tables.openFile('/tmp/tmp.h5', 'w')
> tbl = h5f.createTable('/', 'test', Descr)
>
> tbl.cols.when.createIndex(_verbose = True)
>
> t = 1321031471.0 # 11/11/11 11:11:11
> tbl.append([(t + i, random.random()) for i in range(1000)])
> tbl.flush()
>
> def query(s):
>     print 'is_index =', tbl.cols.when.is_indexed
>     print [(row['when'], row['value']) for row in tbl.where(s)]
>     print tbl.readWhere(s)
>
> wherestr = '(when >= %d) & (when < %d)'%(t, t+5)
> query(wherestr)
> tbl.cols.when.removeIndex()
> query(wherestr)
>
> h5f.close()
> ---end test.py---
>
> This creates the table for storing time/value pairs, inserts some
> synthetic data, and then checks to see if there is data in the table.
> When the table is created, an index is added to the 'when' column.
> The first query returns no data (which is incorrect). Then the
> column index is removed (via tbl.cols.when.removeIndex) and the
> query is repeated. This time 5 results are returned as expected.
> The data is clearly there; however, the index is somehow breaking
> the where logic. Here is the output I get:
>
> ---begin output---
> is_index = True
> []
> []
> is_index = False
> [(1321031471.0, 0.6449417471885681), (1321031472.0,
> 0.7889317274093628), (1321031473.0, 0.609708845615387), (1321031474.0,
> 0.9120397567749023), (1321031475.0, 0.2386845201253891)]
> [(1321031471.0, 0.6449417471885681) (1321031472.0, 0.7889317274093628)
> (1321031473.0, 0.609708845615387) (1321031474.0, 0.9120397567749023)
> (1321031475.0, 0.2386845201253891)]
> ---end output---
>
> Creating the index after the data has been inserted produces the same
> behavior (no data is returned while the index exists). Any
> suggestions would be greatly appreciated.
I've reproduced this with a number of different index configurations. If I
change the column type from Time64 to Float64, the index works as expected.
The BEFORE run below uses the Time64 column; the AFTER run uses Float64.
BEFORE:
step                        has_index  use_index             where  readWhere
Initial index: verbose      True       frozenset(['Awhen'])  0      0
remove index                False      frozenset([])         5      5
re-add index (non-verbose)  True       frozenset(['Awhen'])  0      0
remove again                False      frozenset([])         5      5
re-add index (with flush)   True       frozenset(['Awhen'])  0      0
re-add index (full)         True       frozenset(['Awhen'])  0      0
re-add index (ultralight)   True       frozenset(['Awhen'])  0      0
re-add index (o=0)          True       frozenset(['Awhen'])  0      0
re-add index (o=9)          True       frozenset(['Awhen'])  0      0
re-index                    True       frozenset(['Awhen'])  0      0
also index value            True       frozenset(['Awhen'])  0      0
AFTER:
step                        has_index  use_index             where  readWhere
Initial index: verbose      True       frozenset(['Awhen'])  5      5
remove index                False      frozenset([])         5      5
re-add index (non-verbose)  True       frozenset(['Awhen'])  5      5
remove again                False      frozenset([])         5      5
re-add index (with flush)   True       frozenset(['Awhen'])  5      5
re-add index (full)         True       frozenset(['Awhen'])  5      5
re-add index (ultralight)   True       frozenset(['Awhen'])  5      5
re-add index (o=0)          True       frozenset(['Awhen'])  5      5
re-add index (o=9)          True       frozenset(['Awhen'])  5      5
re-index                    True       frozenset(['Awhen'])  5      5
also index value            True       frozenset(['Awhen'])  5      5
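For anyone who wants to try the workaround, here is a minimal sketch of the
Float64 variant. It is not the script that produced the results above: it
assumes the PyTables 3.x snake_case API (open_file, create_table,
create_index, read_where) rather than the 2.3.1 camelCase names in Alan's
test, and the function name, the condition variables (lo, hi) and the
temp-file path are illustrative only.

```python
import os
import random
import tempfile

import tables


class Descr(tables.IsDescription):
    # Workaround: store the timestamp as a plain 64-bit float
    # instead of Time64Col, which is the type the index misbehaved on.
    when = tables.Float64Col(pos=1)
    value = tables.Float32Col(pos=2)


def indexed_query_count(path, t=1321031471.0, n=1000):
    """Hypothetical helper: build the table, index 'when', run the query.

    Returns (uses_index, number_of_matching_rows).
    """
    with tables.open_file(path, "w") as h5f:
        tbl = h5f.create_table("/", "test", Descr)
        tbl.append([(t + i, random.random()) for i in range(n)])
        tbl.flush()
        tbl.cols.when.create_index()

        # Bounds are passed as condition variables rather than being
        # formatted into the query string.
        cond = "(when >= lo) & (when < hi)"
        bounds = {"lo": t, "hi": t + 5}
        uses_index = bool(tbl.will_query_use_indexing(cond, bounds))
        return uses_index, len(tbl.read_where(cond, bounds))


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "tmp.h5")
    print(indexed_query_count(path))
```

Passing the bounds through condition variables also sidesteps the '%d'
formatting of float timestamps used in the original query string.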
Cheers,
~Josh.
> Alan
_______________________________________________
Pytables-users mailing list
https://lists.sourceforge.net/lists/listinfo/pytables-users