Hello, I have data stored in binary files. Some of these files are huge, 2 GB or more. They consist of 32-bit float complex numbers, where the first 32 bits of the file are the real component of the first number, the second 32 bits are its imaginary component, the third 32 bits are the real component of the second number, etc.
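To make the layout concrete, here is a small sketch of how such interleaved values look once they are in a NumPy array. The sample values are made up for illustration, and the sketch assumes the file uses the machine's native byte order.

```python
import numpy as np

# Three made-up complex samples, (1+2j), (3+4j), (5+6j), stored the way
# the file stores them: real, imaginary, real, imaginary, ...
raw = np.array([1, 2, 3, 4, 5, 6], dtype=np.float32)

# Strided slices pick out every other 32-bit value without copying.
real = raw[0::2]   # values at even positions: the real components
imag = raw[1::2]   # values at odd positions: the imaginary components

# The same bytes reinterpreted as complex64 yield the same pairs.
as_complex = raw.view(np.complex64)
```

Both slices are views onto `raw`, so they only become independent arrays once they are copied or assigned into separately allocated storage.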
I'd like to be able to read in just the real components and load them into a numpy.ndarray, then read the imaginary components and load them into a second numpy.ndarray. I need the real and imaginary components stored in separate arrays; they cannot be in a single array of complex numbers except temporarily. I'm trying to avoid temporary storage, though, because of the size of the files.

I'm currently reading the file scanline by scanline to extract rows of complex numbers, which I then loop over and load into the real/imaginary arrays as follows:

    import array
    import numpy

    self._realData = numpy.empty((Rows, Columns), dtype = numpy.float32)
    self._imaginaryData = numpy.empty((Rows, Columns), dtype = numpy.float32)

    for CurrentRow in range(Rows):
        # Start each row with an empty buffer: fromfile() appends to the
        # array, so reusing one buffer would keep returning the first row.
        floatData = array.array('f')
        floatData.fromfile(DataFH, (Columns*2))

        position = 0
        for CurrentColumn in range(Columns):
            self._realData[CurrentRow, CurrentColumn] = floatData[position]
            self._imaginaryData[CurrentRow, CurrentColumn] = floatData[position+1]
            position = position + 2

The above code works but is much too slow. If I comment out the body of the "for CurrentColumn in range(Columns)" loop, the performance is perfectly adequate, i.e. the function call overhead associated with the "fromfile(...)" call is not very bad at all. What seems to be most time-consuming are the simple assignment statements in the "CurrentColumn" for-loop.

Does anyone see any way of speeding this up? Reading everything into a complex64 ndarray in one fell swoop would certainly be easier and faster, but at some point I'll need to split this array into two parts (real/imaginary). I'd like to have that done initially to keep the memory usage down, since the files are so ginormous.

Psyco is out because I need 64 bits, and I didn't see anything on the forums regarding a method that reads every other 32-bit chunk from a file into an array. I'm not sure what else to try.

Thanks in advance.

L
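One direction that might remove the per-element loop without needing a full-size temporary is sketched below: read a block of rows at a time with numpy.fromfile and let strided slice assignment do the de-interleaving. This is only a sketch under assumptions, not tested on the real data. The names `read_split_complex` and `rows_per_chunk` are hypothetical, and it assumes native byte order and a real file object opened in binary mode.

```python
import numpy as np

def read_split_complex(fh, rows, columns, rows_per_chunk=256):
    """Read interleaved float32 (real, imag) pairs into two 2-D arrays.

    Hypothetical helper: only rows_per_chunk rows are held in a
    temporary buffer at any one time.
    """
    real = np.empty((rows, columns), dtype=np.float32)
    imag = np.empty((rows, columns), dtype=np.float32)

    row = 0
    while row < rows:
        n = min(rows_per_chunk, rows - row)
        expected = n * columns * 2  # two float32 values per complex sample
        chunk = np.fromfile(fh, dtype=np.float32, count=expected)
        if chunk.size != expected:
            raise EOFError("file ended before all rows were read")

        # One row of the chunk is real, imag, real, imag, ...
        chunk = chunk.reshape(n, columns * 2)
        real[row:row + n] = chunk[:, 0::2]  # even positions: real parts
        imag[row:row + n] = chunk[:, 1::2]  # odd positions: imaginary parts
        row += n

    return real, imag
```

With this shape the temporary storage is bounded by `rows_per_chunk * Columns * 8` bytes, so the chunk size can be tuned to trade memory against the number of read calls.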