I have only glanced at the problem, so I may have missed something, but my
approach to a large matrix would be to realise it as a flat file and
mmap it. Your program would then treat it as a memory-resident
structure, and the OS's virtual memory system would page as necessary
to keep a working set of the matrix in real memory.
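A minimal sketch of that idea in Python, using numpy.memmap as one way to get
mmap semantics on a numeric array; the file name, dtype, and shape below are
assumptions for illustration, not from the original post:

    import numpy as np

    # Map a large on-disk matrix into the address space without reading
    # it all into RAM; the OS pages blocks in and out on demand.
    # Assumes 'matrix.dat' holds raw float64 values in row-major order
    # with the shape given below (hypothetical values).
    rows, cols = 500_000, 2_000
    m = np.memmap('matrix.dat', dtype=np.float64, mode='r',
                  shape=(rows, cols))

    # Slicing only touches the pages that back the requested rows.
    row_means = m[:1000].mean(axis=1)
    print(row_means[:5])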
Andreas wrote:
Hello Benilton,
some years ago I came across PyTables ( http://www.pytables.org ). It's
a wrapper for the HDF5 format. PyTables claims to handle high
data throughput very well, and it supports the matrix/array formats
typically used in scientific projects. PyTables does not provide any
form of relational model, but it sounds to me like that is probably not
what you need in the first place. Maybe you can boost the performance
of your calculations once you can load/store arrays/matrices in one piece.
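A minimal sketch of storing and reloading a compressed array this way, using
the current PyTables API; the file name, shape, and compression settings are
assumptions for illustration:

    import numpy as np
    import tables

    # Store a matrix as a compressed, chunked HDF5 array (CArray).
    # 'features.h5' and the shape are hypothetical; zlib level 5 is
    # just one reasonable Filters setting.
    data = np.random.rand(1000, 300)
    filters = tables.Filters(complevel=5, complib='zlib')

    with tables.open_file('features.h5', mode='w') as h5:
        arr = h5.create_carray(h5.root, 'matrix',
                               obj=data, filters=filters)
        # Arbitrary metadata can be attached to any node as attributes.
        arr.attrs.description = 'document-term matrix'

    # Load it back in one piece, or slice it without reading everything.
    with tables.open_file('features.h5', mode='r') as h5:
        matrix = h5.root.matrix[:]        # full array
        first_rows = h5.root.matrix[:10]  # partial read from disk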
I used it for document clustering and was very happy being able to
store compressed arrays generated with the Numeric/NumArray packages. The
performance on ~1000 documents inside the cluster was fine, though my
needs were not as critical as yours. I appreciated the ease of use and
the chance to easily add metadata to the dataset. A Java GUI is also available.
I don't know if your data sets/data types fit into this scenario, but
maybe you want to take a look at the FAQ
[ http://www.pytables.org/moin/FAQ#head-b32537aba805dac2a1bf9cd6606c4fddcd964f96 ].
good luck, andreas
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------