Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-14 Thread Christopher Barker
One other potential downside of using python lists to accumulate numbers is that you are storing python objects (python ints or floats, or...) rather than raw numbers, which has got to incur some memory overhead. How does array.array perform in this context? It has an append() method, and one
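The memory question raised above can be checked directly. The following is a minimal sketch, not code from the thread: the helper names `accumulate_list` and `accumulate_array` and the sample tokens are hypothetical, and the byte counts depend on the CPython build. It accumulates the same parsed numbers in a Python list and in an `array.array("d")`, then compares their approximate memory footprint.

```python
import sys
from array import array


def accumulate_list(tokens):
    """Accumulate parsed numbers as Python float objects in a list."""
    out = []
    for tok in tokens:
        out.append(float(tok))
    return out


def accumulate_array(tokens):
    """Accumulate parsed numbers as raw C doubles in an array.array."""
    out = array("d")
    for tok in tokens:
        out.append(float(tok))
    return out


# Hypothetical sample input standing in for fields read from a text file.
tokens = ["1.5", "2.0", "3.25"] * 1000

as_list = accumulate_list(tokens)
as_array = accumulate_array(tokens)

# The list stores pointers; each float object is counted separately.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)
# The array stores the doubles inline, so one getsizeof call covers it.
array_bytes = sys.getsizeof(as_array)

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```

An `array.array` also exposes the buffer interface, so it can usually be handed to NumPy without an element-by-element conversion; that step is left out here to keep the sketch free of third-party imports.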

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-14 Thread Robert Kern
On Thu, Aug 14, 2008 at 11:51, Christopher Barker wrote: One other potential downside of using python lists to accumulate numbers is that you are storing python objects (python ints or floats, or...) rather than raw numbers, which has got to incur some memory overhead. How

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-14 Thread Dan Lenski
On Thu, 14 Aug 2008 04:40:16, Daniel Lenski wrote: I assume that list-of-arrays is more memory-efficient since array elements don't have the overhead of full-blown Python objects. But list-of-lists is probably more time-efficient since I think it's faster to convert the whole array at

[Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Dan Lenski
Hi all, I'm using NumPy to read and process data from ASCII UCD files. This is a file format for describing unstructured finite-element meshes. Most of the file consists of rectangular, numerical text matrices, easily and efficiently read with loadtxt(). But there is one particularly nasty

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Zachary Pincus
Hi Dan, Your approach generates numerous large temporary arrays and lists. If the files are large, the slowdown could be because all that memory allocation is causing some VM thrashing. I've run into that at times parsing large text files. Perhaps better would be to iterate through the
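One way to avoid large temporaries when reading rows of unequal length is to stream the file line by line into a single flat buffer. The sketch below is an assumption about what such an approach could look like, not the code proposed in this message (which is cut off above). The function name `read_ragged`, the flat-values-plus-offsets layout, and the sample text are all hypothetical.

```python
import io
from array import array


def read_ragged(lines):
    """Read whitespace-separated integer rows of varying length.

    Returns (values, offsets): row i is values[offsets[i]:offsets[i + 1]].
    """
    values = array("q")        # all numbers, stored back to back
    offsets = array("q", [0])  # start index of each row, plus a final end
    for line in lines:
        fields = line.split()
        if not fields:
            continue           # skip blank lines
        values.extend(int(f) for f in fields)
        offsets.append(len(values))
    return values, offsets


# Hypothetical sample standing in for the inhomogeneous part of a file.
sample = io.StringIO("1 2 3\n4 5\n\n6 7 8 9\n")
values, offsets = read_ragged(sample)

for i in range(len(offsets) - 1):
    print(list(values[offsets[i]:offsets[i + 1]]))
```

Only one line of text and one short list of fields are alive at any time, so peak memory stays close to the size of the final data.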

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Daniel Lenski
On Wed, 13 Aug 2008 16:57:32 -0400, Zachary Pincus wrote: Your approach generates numerous large temporary arrays and lists. If the files are large, the slowdown could be because all that memory allocation is causing some VM thrashing. I've run into that at times parsing large text files.

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread robert.kern
On 2008-08-13, Daniel Lenski wrote: On Wed, 13 Aug 2008 16:57:32 -0400, Zachary Pincus wrote: Your approach generates numerous large temporary arrays and lists. If the files are large, the slowdown could be because all that memory allocation is causing some VM thrashing. I've

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Daniel Lenski
On Wed, 13 Aug 2008 20:55:02 -0500, robert.kern wrote: This is similar to what I tried originally! Unfortunately, repeatedly appending to a list seems to be very slow... I guess Python keeps reallocating and copying the list as it grows. (It would be nice to be able to tune the increments by

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Zachary Pincus
This is similar to what I tried originally! Unfortunately, repeatedly appending to a list seems to be very slow... I guess Python keeps reallocating and copying the list as it grows. (It would be nice to be able to tune the increments by which the list size increases.) Robert's right, as

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Robert Kern
On Wed, Aug 13, 2008 at 21:07, Daniel Lenski wrote: On Wed, 13 Aug 2008 20:55:02 -0500, robert.kern wrote: This is similar to what I tried originally! Unfortunately, repeatedly appending to a list seems to be very slow... I guess Python keeps reallocating and copying the

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Daniel Lenski
On Wed, 13 Aug 2008 21:42:51 -0500, Robert Kern wrote: Here is the appropriate snippet in Objects/listobject.c: /* This over-allocates proportional to the list size, making room * for additional growth. The over-allocation is mild, but is * enough to give
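The effect of the over-allocation described in that comment can be observed from Python. This is a small sketch, assuming CPython (where `sys.getsizeof` reflects the list's allocated capacity); the helper name `count_resizes` is hypothetical. It counts how many times the list's allocation actually changes during a long run of appends.

```python
import sys


def count_resizes(n):
    """Append n items and count how often the list's allocation changes."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The allocated capacity grew, so a reallocation happened here.
            resizes += 1
            last_size = size
    return resizes


n = 100_000
resizes = count_resizes(n)
print(f"{n} appends triggered {resizes} reallocations")
```

Because capacity grows in proportion to the current size, reallocations become rarer as the list gets longer, which is why appending is cheap on average.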

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Daniel Lenski
On Wed, 13 Aug 2008 22:11:07 -0400, Zachary Pincus wrote: Try profiling the code just to make sure that it is the list append that's slow, and not something else happening on that line, e.g.. From what you and others have pointed out, I'm pretty sure I must have been doing something else wrong
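A quick way to follow the profiling advice is the standard-library `cProfile` module. The sketch below is illustrative only: `parse_lines` and the sample data are hypothetical stand-ins for the real parsing loop, not the code discussed in the thread.

```python
import cProfile
import io
import pstats


def parse_lines(lines):
    """Parse each line into a list of floats and collect the rows."""
    rows = []
    for line in lines:
        row = [float(tok) for tok in line.split()]
        rows.append(row)
    return rows


# Hypothetical sample input with rows of different lengths.
lines = ["1.0 2.0 3.0", "4.0 5.0", "6.0 7.0 8.0 9.0"] * 2000

profiler = cProfile.Profile()
profiler.enable()
rows = parse_lines(lines)
profiler.disable()

# Collect the report in a string, sorted by cumulative time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats()
report = buf.getvalue()
print(report)
```

The report lists `split` and `append` as separate entries, which makes it easier to see whether the append itself or the work around it dominates the run time.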

Re: [Numpy-discussion] reading *big* inhomogenous text matrices *fast*?

2008-08-13 Thread Sebastien Binet
Hi, Raymond Hettinger had a good talk at PyCon this year about the details of the Python containers. Here are the slides from the EuroPython version (I assume). http://www.pycon.it/static/pycon2/slides/containers.ppt Thanks! Looks like the only caveat is that the whole thing may