Hello All,
I've recently started using PyTables and I am very excited about its speed
and ease of use for large datasets. However, I have a problem with
user-defined table attributes that I have not been able to solve.
I have a table that contains observations of entities that can be
classified as different types. The timestamp of the last observation may
differ from one entity type to the next. When processing this table, I would
like to be able to determine the timestamp of the last observation for each
entity type. The problem is easy as long as I know the entity types.
For example:
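
    import tables

    # Open the existing file for reading and updating.
    h5file = tables.openFile('data.h5', mode='r+')
    tbl = h5file.getNode('/series', 'data1')
    # Latest timestamp among the rows of one known entity type.
    last_obs = max(x['timestamp'] for x in tbl.where("""entity_type=='e1'"""))
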
However, my problem is that as I read from my source I may not always know
the entity type beforehand. I was going to add a last_observation
attribute to my table, but I found
https://github.com/PyTables/PyTables/issues/145, which says that attributes
aren't persistent. So I have two questions:
1. Are there any user-defined attributes that are persistent?
2. Does anyone have any other suggestions, besides separating the entities
into separate tables where I could then just take the max of the timestamp
field/column?
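
To make the second question concrete: the only fallback I can think of that
does not need the entity types up front is a single pass over every row,
keeping the largest timestamp seen per type. Below is a minimal sketch of
that idea. The function name last_observations and the sample rows are made
up for illustration, and plain dicts stand in for table rows; I am assuming
that the rows of the real table can be indexed the same way, as in
x['timestamp'] above.

```python
def last_observations(rows):
    """Return {entity_type: latest timestamp} from one pass over rows."""
    latest = {}
    for row in rows:
        etype = row['entity_type']
        ts = row['timestamp']
        # Keep the largest timestamp seen so far for this entity type.
        if etype not in latest or ts > latest[etype]:
            latest[etype] = ts
    return latest


# Hypothetical sample data standing in for the rows of the table.
sample_rows = [
    {'entity_type': 'e1', 'timestamp': 10},
    {'entity_type': 'e2', 'timestamp': 7},
    {'entity_type': 'e1', 'timestamp': 12},
    {'entity_type': 'e2', 'timestamp': 3},
]

print(last_observations(sample_rows))
```

This works, but it has to scan the whole table every time, which is exactly
what I was hoping a stored last_observation value would let me avoid.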
--
Aquil H. Abdullah
aquil.abdul...@gmail.com
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users