What you are asking for is called an "out of core" matrix-multiplication 
algorithm.  If you google for it, you'll find several papers on this.  The 
key, as is also true for in-memory matrix operations, is to maximize 
memory locality.

However, there doesn't seem to be much free code available for this 
problem.  My suspicion is that it is not so popular because:

1) If your problem doesn't fit in memory, these days the first choice is 
usually to go to a distributed-memory cluster rather than hitting the disk.
2) The cubic scaling of matrix-matrix multiplication means that you can't 
increase the problem size very much anyway.  If you give me 1000x more 
processing power, I can only handle a 10x larger matrix dimension 
(assuming dense matrices).
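To spell out the arithmetic behind that claim: a dense n-by-n product costs on the order of n^3 operations, so multiplying the available compute by k only multiplies the feasible dimension by the cube root of k.

```julia
# Dense n×n matmul costs ~2n^3 flops, so k× more compute buys
# only a cbrt(k)× larger dimension.
k = 1000
growth = cbrt(k)   # ≈ 10
```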

PS. HDF5 lets you specify how an array is "chunked" for storage, and lets 
you read back subsets of an array at a time.  All of this functionality is 
exposed in HDF5.jl, I believe.  Alternatively, it might be more efficient 
to memory-map the file and let the OS deal with paging it in and out of 
memory.  Note that Julia matrices are stored in column-major format 
(contiguous columns stored one after another), so you'll want to access the 
matrix column-by-column in your matrix-vector products.
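Here is a minimal sketch of that access pattern using the Mmap standard library; the temporary file and the tiny dimensions are made up purely for illustration.  Each pass over column j touches one contiguous stretch of the file, which is exactly the locality you want for an out-of-core matrix-vector product:

```julia
using Mmap

# Hypothetical file and dimensions, for illustration only.
fname = tempname()
m, n = 4, 3
open(fname, "w") do io
    # Julia writes the matrix in its native column-major layout.
    write(io, Float64.(reshape(1:m*n, m, n)))
end

io = open(fname, "r")
A = Mmap.mmap(io, Matrix{Float64}, (m, n))  # OS pages columns in on demand

# Matrix-vector product accumulated column by column: the inner index i
# walks down one contiguous column, so disk reads are sequential.
x = ones(n)
y = zeros(m)
for j in 1:n, i in 1:m
    y[i] += A[i, j] * x[j]
end
close(io)
# y == [15.0, 18.0, 21.0, 24.0]
```

For a real out-of-core problem you would of course use much larger column blocks rather than single elements, but the principle is the same: iterate in the order the data is laid out on disk.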
