Why not read in just the date and ID columns to start with, then do a
numpy.unique() or Python set() on these, then query based on the unique
values? Seems like it might be faster...
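
A minimal sketch of that idea, using only the Python standard library.
The helper name find_duplicate_keys and the sample values are made up
for illustration; with a real table the two columns would be read first
(for example with tbl.cols.date[:] and tbl.cols.userID[:]), which
assumes both columns fit in memory.

```python
from collections import Counter


def find_duplicate_keys(dates, user_ids):
    """Return a dict mapping each repeated (date, userID) pair to its count."""
    counts = Counter(zip(dates, user_ids))
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical stand-ins for the two columns. In the real case they would
# come from the table, e.g.:
#   dates = tbl.cols.date[:]
#   user_ids = tbl.cols.userID[:]
dates = [20120701, 20120701, 20120702, 20120701]
user_ids = ["alice", "bob", "alice", "alice"]

duplicates = find_duplicate_keys(dates, user_ids)
for (date, uid), n in sorted(duplicates.items()):
    print("date=%d userID=%s appears %d times" % (date, uid, n))
```

Only the keys that turn up in duplicates would then need a readWhere
query, instead of one query per row of the table.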
Be Well
Anthony
On Mon, Jul 2, 2012 at 5:16 PM, Aquil H. Abdullah
<aquil.abdul...@gmail.com> wrote:
> Hello All,
>
> I have a table that is indexed by two keys, and I would like to search for
> duplicate keys. So here is my naive slow implementation: (code I posted on
> stackoverflow)
>
> import tables
>
> h5f = tables.openFile('filename.h5')
> # assumes group 'data' and table 'data_table'
> tbl = h5f.getNode('/data', 'data_table')
>
> counter = 0
> for row in tbl:
>     ts = row['date']  # timestamp (ts) or date
>     uid = row['userID']
>     # one query per row: this is what makes the approach slow
>     query = '(date == %d) & (userID == "%s")' % (ts, uid)
>     result = tbl.readWhere(query)
>     if len(result) > 1:
>         # Do something here
>         pass
>     counter += 1
>     if counter % 1000 == 0:
>         print('%d rows processed' % counter)
>
> --
> Aquil H. Abdullah
> aquil.abdul...@gmail.com
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users
>
>