Re: [Pytables-users] What is the result of calling createIndex() on multiple columns?

Anthony Scopatz Tue, 26 Jun 2012 14:30:57 -0700

On Tue, Jun 26, 2012 at 4:19 PM, Aquil H. Abdullah <[email protected]> wrote:


>  Hello All,
>
> In my newbie state, I called createIndex on two columns in one of my
> tables:
>
> import tables
> table_desc = {'timestamp': tables.Time32Col(),
>               'symbol': tables.StringCol(8),
>               'observation': tables.Float32Col()}
> h5f = tables.openFile('test.h5', mode='w')
> group = h5f.createGroup('/', 'data')
> table = h5f.createTable(group, 'test', table_desc, 'Test Table')
> table.cols.timestamp.createIndex()
> table.cols.symbol.createIndex()
> …
>
> Now, from what I've been able to find on the internet, an index is only
> associated with one column:
>
> class tables.Index
>     Represents the index of a column in a table.
>
>     This class is used to keep the indexing information for columns in a
>     Table dataset (see The Table class). It is actually the descendant of
>     the Group class (see The Group class), with some added functionality.
>     An Index is always associated with one and only one column in a table.
>
> - PyTables 2.3.1 User's Guide - Library Reference/The Index Class
> http://pytables.github.com/usersguide/libref.html#indexclassdescr
> - Efficient way to verify that records are unique in Python/PyTables
> http://stackoverflow.com/questions/1315129/efficient-way-to-verify-that-records-are-unique-in-python-pytables
> - Hints For SQL Users (Creating an index)
> http://www.pytables.org/moin/HintsForSQLUsers#Creatinganindex
>
> So how does PyTables interpret a table with multiple column indices?  The
> best solution that I've found is to create a hash from the two fields that
> I am interested in indexing and then index the table on that hash.
>
> The other solution would be to shard my data by symbol and then index each
> symbol table by timestamp.
>
> Can anyone explain what effect indexing two columns has in PyTables?
> Also, can anyone tell me if they've come up with a better solution for
> dealing with tables that require multiple indices than the ones I've
> mentioned?
>
>

I don't have a lot of time right now, but you could try creating a nested
column, or a column with a compound data type that is just a tuple of the
two fields you are interested in, and then indexing against that super
column.  Storing a hash in another column is probably not the greatest way
to do this...
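
To make the "tuple of the two fields" idea concrete, here is a minimal
sketch in plain Python. It is not PyTables code and does not show how
PyTables builds or uses its indexes. It only illustrates why a composite
(symbol, timestamp) key can answer "one symbol over a time range" with a
single sorted lookup, which a hash of the two fields cannot do for ranges.
The sample rows and the lookup() helper are made up for illustration.

```python
import bisect

# Hypothetical sample rows: (timestamp, symbol, observation).
rows = [
    (1340700000, "IBM", 195.1),
    (1340700000, "AAPL", 572.0),
    (1340700060, "AAPL", 572.4),
    (1340700060, "IBM", 195.3),
    (1340700120, "AAPL", 571.9),
]

# Composite index: entries sorted by (symbol, timestamp), each one
# remembering the position of its row in the original table.
index = sorted((sym, ts, pos) for pos, (ts, sym, _) in enumerate(rows))
keys = [(sym, ts) for sym, ts, _ in index]


def lookup(symbol, t_start, t_stop):
    """Return rows for `symbol` with t_start <= timestamp < t_stop."""
    lo = bisect.bisect_left(keys, (symbol, t_start))
    hi = bisect.bisect_left(keys, (symbol, t_stop))
    return [rows[index[i][2]] for i in range(lo, hi)]


result = lookup("AAPL", 1340700000, 1340700120)
```

Because tuples compare field by field, all entries for one symbol sit next
to each other and are ordered by time, so both bounds are found by binary
search.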

Hopefully someone else can jump in and answer this one.


>
> Regards,
>
> --
> Aquil H. Abdullah
>
>
>

_______________________________________________
Pytables-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/pytables-users
