Hi everyone,
I've been toying with the idea of creating a maskedEArray class that would
behave the same as EArray, except that it would handle masked arrays (and
hence missing data points). I can't quite think of an efficient way to do this
at the moment, though.
Subclassing EArray is not a problem in and of itself. I was able to create
subclasses of File and EArray and to read and write my new subclass with no
problems. It is also easy to add custom attributes to my class and have them
handled. That is the good news.
The tricky part comes when I try to figure out a way to represent the missing
values. A really simple way to do this would be to store a separate boolean
EArray corresponding to the mask for each masked array. However, I'd prefer to
keep the information for one masked array contained within a single node from
an organizational point of view. Similarly, I could store a boolean array as an
attribute of each EArray, but as far as I know you can't read just a portion
of an array attribute. It would be an all-or-nothing proposition, which means
that if I wanted to read just a single data point I'd have to read the entire
mask too.
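To make the first option concrete, here is a rough sketch of the "separate
boolean EArray" approach. It is only an illustration, not a proposed design.
It assumes the snake_case PyTables names (open_file, create_group,
create_earray) and numpy.ma; the helpers append_masked and read_masked, the
"series" group and the file name are hypothetical names made up for the
example. Putting both arrays in one group is just one way to keep them
together, and it is still two nodes rather than one.

```python
import os
import tempfile

import numpy as np
import numpy.ma as ma
import tables


def append_masked(group, values):
    """Append a masked array to the paired 'data' and 'mask' EArrays (hypothetical helper)."""
    values = ma.asarray(values, dtype=np.float64)
    # Masked slots are written as NaN placeholders; the mask array is authoritative.
    group.data.append(values.filled(np.nan))
    group.mask.append(ma.getmaskarray(values))


def read_masked(group, start, stop):
    """Read rows [start, stop) from both arrays and rebuild a masked array (hypothetical helper)."""
    return ma.array(group.data[start:stop], mask=group.mask[start:stop])


# Temporary file so the sketch does not touch any real data.
path = os.path.join(tempfile.mkdtemp(), "masked_demo.h5")
with tables.open_file(path, mode="w") as h5:
    # One group per masked series, holding the values and their mask side by side.
    series = h5.create_group("/", "series")
    h5.create_earray(series, "data", atom=tables.Float64Atom(), shape=(0,))
    h5.create_earray(series, "mask", atom=tables.BoolAtom(), shape=(0,))

    append_masked(series, ma.array([1.0, 2.0, 3.0, 4.0],
                                   mask=[False, True, False, False]))

    # Only the requested slice of the data and of the mask is read from disk.
    window = read_masked(series, 1, 3)

print(window)
```

The point of the sketch is that a partial read touches only the matching
slice of the mask, which is what the attribute-based option cannot do. The
cost is that the two arrays have to be kept in sync by hand.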
Does anyone have any ideas or suggestions on how to implement
something like this? The only thing preventing me from using PyTables is the
lack of support for missing observations (which is a showstopper for me, since
I work with financial/economic data). Is anyone else interested in this
capability?
Thanks,
- Matt Knox
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users