Hi everyone,
 
I've been toying around with the idea of creating a maskedEArray class which 
would behave the same as EArray except it would handle masked arrays (and 
hence, missing data points). I can't quite think of an efficient way to do this 
at the moment though.
 
Subclassing EArray is not a problem in and of itself. I was able to create 
subclasses of File and EArray and read and write my new subclass with no 
problems. Custom attributes that I add to my class are handled as well. That 
is the good news.
 
The tricky part comes when I try to figure out a way to represent the missing 
values. A really simple way to do this would be to store a separate boolean 
EArray corresponding to the mask for each masked array. However, I'd prefer to 
keep the information for one masked array contained within a single node from 
an organizational point of view. Similarly, I could store a boolean array as an 
attribute for each EArray, but as far as I know you can't read just a portion 
of an array attribute. It would be an all-or-nothing proposition, which means 
that if I wanted to read just a single data point I'd have to read the entire 
mask too.
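 
To make the first option concrete, here is a minimal sketch, not a finished 
design. It keeps the data and the mask in two parallel EArrays under one group, 
so the pair at least lives at a single path in the file and a slice reads only 
the matching part of the mask. The class name MaskedEArrayPair, the child names 
"data" and "mask", and the fill value of 0 are all made up for illustration. 
The sketch assumes the PEP 8 method names of PyTables 3.x (open_file, 
create_earray); older releases spell them openFile and createEArray.

```python
import os
import tempfile

import numpy as np
import tables


class MaskedEArrayPair:
    """Hypothetical helper: one group holding parallel 'data' and 'mask' EArrays."""

    def __init__(self, h5file, where, name, atom):
        # The group is the single node that represents the masked array.
        self.group = h5file.create_group(where, name)
        self.data = h5file.create_earray(self.group, "data", atom=atom, shape=(0,))
        self.mask = h5file.create_earray(
            self.group, "mask", atom=tables.BoolAtom(), shape=(0,)
        )

    def append(self, marr):
        marr = np.ma.asarray(marr)
        # Masked slots get a placeholder value (0 here, an arbitrary choice);
        # the mask array records which slots are really missing.
        self.data.append(marr.filled(0))
        self.mask.append(np.ma.getmaskarray(marr))

    def __getitem__(self, key):
        # Only the requested portion of data and mask is read from disk.
        return np.ma.masked_array(self.data[key], mask=self.mask[key])


path = os.path.join(tempfile.mkdtemp(), "masked_demo.h5")
with tables.open_file(path, mode="w") as h5:
    series = MaskedEArrayPair(h5, "/", "prices", tables.Float64Atom())
    series.append(
        np.ma.masked_array([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])
    )
    window = series[1:3]  # reads two data values and two mask values only
```

This does not give a true single-leaf maskedEArray, and the two arrays have to 
be kept the same length by hand, but partial reads stay cheap.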
 
Does anyone have any ideas or suggestions on how to implement something like 
this? The only thing preventing me from using PyTables is the lack of support 
for missing observations, which is a showstopper for me because I work with 
financial/economic data. Is anyone else interested in this capability?
 
Thanks,
 
- Matt Knox
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
