This is all very good info, especially the byteswap; I'll be testing it momentarily. As for a detailed explanation of the problem...

In essence, I am performing sparse matrix multiplication. The matrix I am dealing with here is n x n, and generally only 1-20% of its entries are nonzero. I use it in spatial data analysis, where the matrix W represents the spatial association between n observations. The operations I perform on it generally relate to the spatial lag of a variable, i.e. Wy, where y is an n x k matrix (usually k = 1). As k is generally small, the y vector and the result vector are represented by numpy arrays. I can usually hold n*k*2 pieces of info in memory; what I can't hold is n**2. So I store each row of W in a file as a record consisting of three parts:

1) row, nn (the number of neighbors in that row)
2) nhs: an (nn x 1) vector of integers giving the columns j for which W[i, j] != 0
3) weights: an (nn x 1) vector of floats corresponding, index for index, to the previous part

The first two parts of the record are known as a GAL, or Geographic Algorithm Library, format. Since a lot of my W matrices have distance metrics associated with them, I added the third part; someone else might term this an enhanced GAL. At any rate, it allows me to perform this operation on large datasets without running out of memory.
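To make the format concrete, here is a rough sketch of how the record I/O and the Wy product could fit together using explicit little-endian dtypes, per the suggestion below. The helper names and the field widths (int32 header and indices, float64 weights) are my own assumptions for illustration, not a fixed format:

    import numpy as N

    # Assumed record layout (my choice, for illustration): each row of W is
    #   row (int32), nn (int32), nhs (nn int32s), weights (nn float64s),
    # every field little-endian so the file reads the same on Windows
    # and Solaris.
    IDX = N.dtype("<i4")
    WGT = N.dtype("<f8")

    def write_record(f, row, nhs, weights):
        # f is a file opened in binary mode ("wb" or "ab")
        N.array([row, len(nhs)], dtype=IDX).tofile(f)
        N.asarray(nhs, dtype=IDX).tofile(f)
        N.asarray(weights, dtype=WGT).tofile(f)

    def read_record(f):
        # returns (row, nhs, weights), or None at end of file
        hdr = N.fromfile(f, dtype=IDX, count=2)
        if hdr.size < 2:
            return None
        row, nn = int(hdr[0]), int(hdr[1])
        nhs = N.fromfile(f, dtype=IDX, count=nn)
        weights = N.fromfile(f, dtype=WGT, count=nn)
        return row, nhs, weights

    def spatial_lag(filename, y):
        # computes Wy one record at a time, so only one row of W is
        # ever in memory; y is (n,) or (n, k)
        wy = N.zeros(y.shape, dtype=float)
        f = open(filename, "rb")
        while True:
            rec = read_record(f)
            if rec is None:
                break
            row, nhs, weights = rec
            wy[row] = N.dot(weights, y[nhs])   # sum over j of w_ij * y[j]
        f.close()
        return wy

Since the byte order is pinned in the dtypes, nothing needs byteswap() on either platform; the same file should read back identically everywhere.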
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christopher Barker
Sent: Tuesday, February 13, 2007 4:07 PM
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] fromstring, tostring slow?

Mark Janikas wrote:
> I don't think I can do that because I have heterogeneous rows of
> data.... I.e. the columns in each row are different in length.

Like I said, show us your whole problem... But you don't have to
write/read all the data at once with from/tofile() anyway. Each of your
"rows" has to be in a separate array, as numpy arrays don't support
"ragged" arrays, but each row can be written with tofile().

> Furthermore, when reading it back in, I want to read only bytes of the
> info at a time so I can save memory. In this case, I only want to have
> one record in mem at once.

You can make multiple calls to fromfile(), though you'll have to know
how long each record is.

> Another issue has arisen from taking this routine cross-platform....
> namely, if I write the file on Windows I can't read it on Solaris. I
> assume the big-little endian is at hand here.

Yup.

> I know using the struct
> module that I can pack using either one.

So can numpy. See the "byteswap" method, and you can specify a
particular endianness with a datatype when you read with fromfile():

a = N.fromfile(DataFile, dtype=N.dtype("<d"), count=20)

reads 20 little-endian doubles from DataFile, regardless of the native
endianness of the machine you're on.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

[EMAIL PROTECTED]

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion