Hi all, I have a sparse 3.4M x 3.4M adjacency matrix with nnz = 23M and wanted to see whether CArray is an appropriate solution for storing it. Right now I store the data in coordinate format using the NumPy binary format and load the matrix with SciPy's sparse coo_matrix class. As far as I understand, with CArray the matrix would be written in full (zeros included), but a) since it is chunked, accessing it does not require holding the whole matrix in memory, and b) with compression enabled it would be possible to keep the size of the file reasonable.
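To put rough numbers on the "written in full" assumption, here is a back-of-the-envelope sketch. It assumes 8-byte float values and 4-byte integer indices; both are assumptions for illustration, and the actual dtypes in my files may differ. The function name is hypothetical too.

```python
# Rough storage estimate for the matrix described above.
# Assumptions (not taken from the actual data): float64 values (8 bytes)
# and int32 row/column indices (4 bytes each).

def storage_estimate(n, nnz, value_bytes=8, index_bytes=4):
    """Return (dense_elements, dense_bytes, coo_bytes, density)."""
    dense_elements = n * n                    # every cell, zeros included
    dense_bytes = dense_elements * value_bytes
    # COO stores one row index, one column index and one value per nonzero.
    coo_bytes = nnz * (2 * index_bytes + value_bytes)
    density = nnz / dense_elements
    return dense_elements, dense_bytes, coo_bytes, density


n = 3_400_000      # matrix is n x n
nnz = 23_000_000   # number of nonzero entries

dense_elements, dense_bytes, coo_bytes, density = storage_estimate(n, nnz)

print(f"dense elements:        {dense_elements:,}")
print(f"dense, uncompressed:   {dense_bytes / 1e12:.2f} TB")
print(f"COO, uncompressed:     {coo_bytes / 1e6:.0f} MB")
print(f"fraction of nonzeros:  {density:.2e}")
```

Under these assumptions the dense layout has about 1.156e13 cells (roughly 92 TB before compression), against roughly 368 MB for the coordinate format, so the CArray approach depends entirely on the compressor collapsing the all-zero chunks.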
If my assumptions are correct, then here is my problem: I am running into performance problems when writing the CArray to disk. I adapted the example from the documentation [1], and when I run the code on a 6000x6000 matrix with nnz = 17K I get a decent speed of roughly 4100 elements/s. However, when I try it on the full matrix, the writing speed drops to 4 elements/s. Am I doing something wrong? Any feedback would be greatly appreciated!

Code: https://gist.github.com/junkieDolphin/5843064

Cheers,
Giovanni

[1] http://pytables.github.io/usersguide/libref/homogenous_storage.html#the-carray-class

--
Giovanni Luca Ciampaglia
☞ http://www.inf.usi.ch/phd/ciampaglia/
✆ (812) 287-3471
✉ glciamp...@gmail.com