Do you expect to have very large integer values, or only values over a limited range?
If your integer values will fit into the 16-bit range (or even the 32-bit range), you can cut your memory usage considerably: the default dtype for numpy.loadtxt is float64, which takes 8 bytes per value, so int32 halves the memory and int16 quarters it. I.e. something like:

    data = numpy.loadtxt(filename, dtype=numpy.int16)

Alternately, if you're already planning on using a (scipy) sparse array anyway, it's easy to do something like this:

    import numpy as np
    import scipy.sparse

    # Row indices, column indices, and values of the nonzero entries
    I, J, V = [], [], []
    with open('infile.txt') as infile:
        for i, line in enumerate(infile):
            # Parse one row of whitespace-separated integers
            line = np.array(line.strip().split(), dtype=int)
            nonzeros, = line.nonzero()
            I.extend([i] * nonzeros.size)
            J.extend(nonzeros)
            V.extend(line[nonzeros])

    # The shape is taken from the row count and the width of the last row
    data = scipy.sparse.coo_matrix((V, (I, J)), dtype=int,
                                   shape=(i + 1, line.size))

This will be much slower than numpy.loadtxt(...), but if you're only going to convert the output of loadtxt to a sparse array anyway, it avoids the memory usage problems, because only the nonzero entries are ever held in memory (assuming the array is mostly sparse, of course).

Hope that helps,
-Joe

On Fri, Feb 25, 2011 at 9:37 AM, Jaidev Deshpande <deshpande.jai...@gmail.com> wrote:

> Hi
>
> Is it possible to load a text file 664 MB large with integer values and
> about 98% sparse? numpy.loadtxt() shows a memory error.
>
> If it's not possible, what alternatives could I have?
>
> The usable RAM on my machine running Windows 7 is 3.24 GB.
>
> Thanks.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
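As a footnote to the dtype suggestion at the top of this reply, here is a minimal sketch of the memory arithmetic. The helper names dense_megabytes and smallest_int_dtype are made up for illustration and are not part of NumPy, and the 10000 x 10000 shape is an arbitrary assumption, since the thread does not give the file's dimensions.

```python
import numpy as np


def dense_megabytes(n_rows, n_cols, dtype):
    """Size in MiB of a dense n_rows x n_cols array of the given dtype."""
    return n_rows * n_cols * np.dtype(dtype).itemsize / 2**20


def smallest_int_dtype(lo, hi):
    """Return the narrowest signed integer dtype that can hold [lo, hi]."""
    for dtype in (np.int8, np.int16, np.int32, np.int64):
        info = np.iinfo(dtype)
        if info.min <= lo and hi <= info.max:
            return dtype
    raise OverflowError("value range does not fit in a 64-bit signed integer")


if __name__ == "__main__":
    # Hypothetical shape, chosen only to illustrate the scaling
    rows, cols = 10000, 10000
    for dtype in (np.float64, np.int32, np.int16):
        size = dense_megabytes(rows, cols, dtype)
        print(np.dtype(dtype).name, round(size, 1), "MiB")

    # Hypothetical value range; substitute the real min and max of the data
    print(smallest_int_dtype(-5, 300).__name__)
```

The result of smallest_int_dtype can be passed straight to the dtype argument of numpy.loadtxt. This only shrinks the final dense array; for data that is about 98% sparse, the row-by-row sparse approach is likely the larger saving.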