The netCDF library gives me a masked array so I have to explicitly transform that into a regular numpy array.
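
For readers following along, here is a minimal sketch of that conversion using only numpy. The sample values and the NaN fill value are assumptions made for illustration; they are not taken from the original data.

```python
import numpy as np

# Stand-in for what a netCDF read might return: a masked array
# with one masked (missing) element. Values are made up.
masked = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])

# filled() returns a regular ndarray, replacing masked entries
# with the given fill value (NaN is an assumed choice here).
plain = masked.filled(np.nan)

print(type(plain).__name__)
print(plain)
```

Choosing the fill value is the main design decision: NaN only works for floating-point data, so integer arrays need a sentinel value instead.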

Ahh interesting.  Depending on the netCDF version the file was made with, you should be able to read the file directly from PyTables.  You could thus directly get a normal numpy array.  This *should* be possible, but I have never tried it ;)
 
I think the netCDF3 functionality has been taken out or at least deprecated (https://github.com/PyTables/PyTables/issues/68). Using the python-netCDF4 module allows me to read pretty much any netCDF file, and the built-in masking is sometimes very useful when the dataset is small enough that I can live with the lower performance of masked arrays.


 
I've looked under the covers and have seen that the numpy.ma masked-array implementation is all pure Python, so there is a performance drawback. I'm not up to speed yet on where the numpy.na masking implementation stands (I just started a new job here).

I tried to do an implementation in memory (except for the final write) and found that I have about 2GB of indices when I extract the quality indices. Just using those indices makes memory usage grow to over 64GB, and I eventually run out of memory and start churning away in swap.

For the moment, I have pulled down the latest git master and am using the new in-memory HDF feature. This seems to give me better performance and is code-wise pretty simple so for the moment, it's good enough.
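
As a rough sketch of what using the in-memory HDF feature can look like, the snippet below opens a file with the HDF5 CORE driver through PyTables. It assumes PyTables 3.0 or later (where the call is spelled `open_file`); the file name `scratch.h5`, the node name `quality`, and the sample data are hypothetical and are not the code used in this thread.

```python
import numpy as np
import tables

# Open an HDF5 file that lives entirely in RAM via the CORE driver.
# driver_core_backing_store=0 means nothing is written to disk on close.
# "scratch.h5" is a hypothetical name; no file with that name is created.
h5 = tables.open_file(
    "scratch.h5",
    mode="w",
    driver="H5FD_CORE",
    driver_core_backing_store=0,
)

# Store a small array (made-up data) and read a slice back.
arr = h5.create_array(h5.root, "quality", np.arange(10, dtype=np.int32))
selected = arr[2:5]

h5.close()
print(selected)
```

Setting `driver_core_backing_store=1` instead should flush the in-memory image to the named file when it is closed, which would match a workflow that stays in memory except for the final write.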

Awesome! I am glad that this is working for you.
 
Yes - appears to work great!

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
