On Saturday 19 September 2009 20:35:56 David Fokkema wrote:
> > Again, I don't know why sys.getsizeof returns 40 MB for a list of 10
> > million *long* integers.  Look at this:
> >
> > In [40]: sys.getsizeof(1)
> > Out[40]: 24
>
> 12
>
> > In [41]: sys.getsizeof(1L)
> > Out[41]: 32
>
> 16
>
> > In [42]: sys.getsizeof(2**62)
> > Out[42]: 24
>
> 24
>
> > In [43]: sys.getsizeof(2**63)
> > Out[43]: 40
>
> 24 (yes, even 2**64)
>
> I'm on 32-bit, you're probably on 64-bit? Wow, this is really platform
> specific. More so than I'd expect from my C experience.

Yes, my platform is 64-bit, so that surely explains the discrepancy.
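For reference, the stepwise growth is easy to probe directly (a Python 3 sketch; in modern Python the int/long split is gone and every int is arbitrary-precision, and exact byte counts still vary by platform and version):

```python
import sys

# CPython stores ints as a variable number of fixed-width "digits";
# sys.getsizeof therefore grows stepwise with the bit length.
for n in (1, 2**30, 2**62, 2**63, 2**100):
    print(n.bit_length(), "bits ->", sys.getsizeof(n), "bytes")
```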

> > So, it is not clear to me how many bytes a long integer takes, but
> > it clearly depends on the number of significant digits.  At any rate, I
> > find 40 MB for 10 million longs to be too small a figure (?).
>
> Well... See:
> >>> a = range(2**70-10, 2**70+10)
> >>> sys.getsizeof(a[0])
>
> 24
>
> >>> sys.getsizeof(a[0:1])
>
> 36
>
> >>> sys.getsizeof(a[0:2])
>
> 40
>
> >>> sys.getsizeof(a[0:3])
>
> 44
>
> >>> sys.getsizeof(a[0:4])
>
> 48
>
> which shows that if sys.getsizeof is correct (which it should be,
> according to the official docs) a long only takes up about 4 bytes
> (probably indeed some nice power-of-two representation). There is some
> overhead for a list, there is some overhead for a single long, but
> that's about it: 4 bytes per value. That makes a list of 10 million
> longs about 40 million bytes, I guess.

Mmh, sorry, but 4 bytes per long integer definitely seems too little to me.  
IMO, the most probable explanation is that what sys.getsizeof returns for 
lists is just the *overhead* of the list structure itself.  But then you still 
have to add the size of the objects it contains.
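That interpretation matches what sys.getsizeof is documented to do for containers: it counts only the container itself, never the objects it references. A quick sketch of the distinction (Python 3; byte counts differ per platform):

```python
import sys

xs = [2**70 + i for i in range(10)]   # ten distinct multi-digit ints

# The list's own size: object header plus its array of pointers only.
shallow = sys.getsizeof(xs)
# Adding the pointed-to int objects gives the real footprint.
deep = shallow + sum(sys.getsizeof(x) for x in xs)

print(shallow, deep)
```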

> > The correct check for a leak would be to put the list creation in a loop
> > and see if memory consumption grows or it stabilizes at the size of the
> > list.  For example, memory usage in:
> >
> > create_tables()
> > for i in range(N):
> >     r = test_query()
> >
> > is always around 1 GB (on my 64-bit machine), no matter whether N is 5, 10
> > or 20.
> > This is a *clear* evidence that a memory leak is not developing here.
>
> Indeed, with N sufficiently large (> 2) it stabilizes. So, at least
> there's no malloc without free in the query loop. Furthermore, if I do
> t = r to 'copy' the result list and then redo the query, memory size
> will balloon again. Since I have only 1 GB of memory, I redid the tests
> with only 1 million rows, and saw that sys.getsizeof reports 4 MB, while
> keeping the list and redoing the query added 24 MB to python memory
> usage, while del-ing all copies only returned a fraction of that to the
> system.

Supposing that sys.getsizeof is correctly reporting the size of longs on your 
machine (that is, about 16-24 bytes each), plus 4 MB of list overhead, the 
24 MB figure for a 1-million-element list is reasonable.
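That back-of-the-envelope arithmetic can be reproduced directly (a sketch; per-object sizes vary with platform and Python version, so the exact megabyte count will differ from the thread's Python 2 numbers):

```python
import sys

n = 1_000_000
lst = [2**63 + i for i in range(n)]   # a million distinct multi-digit ints

list_overhead = sys.getsizeof(lst)               # header + pointer array
elements = sum(sys.getsizeof(x) for x in lst)    # the int objects themselves

print("list overhead: %.1f MB" % (list_overhead / 1e6))
print("total:         %.1f MB" % ((list_overhead + elements) / 1e6))
```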

>
> Another strange thing is that:
> >>> a = []
> >>> for i in xrange(long(10e6)):
>
> ...     a.append(long(100e6))
> ...
>
> >>> b = []
> >>> for i in xrange(long(10e6)):
>
> ...     b.append(100000000)
> ...
>
> >>> a == b
>
> True
>
> >>> del a
> >>> del b
>
> Resident memory usage: after python start: 3M, after 'a': 276M, after 'b':
> 314M. After 'del a': 41M, after 'del b': 3M. Sys.getsizeof returns about
> 40M for both 'a' and 'b', consistent with the size of 'b'.
>
> I should probably move this to the python list...

It should help, yes.  Please report back about your findings; I'm curious 
about what is going on, but I'm almost sure that sys.getsizeof is not 
reporting the *actual* memory consumption of an object (at least for complex 
objects like lists, dicts...).
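One plausible explanation for the a/b asymmetry above (my reading of CPython, not verified against the 2.x internals): in the `b` loop the literal 100000000 is a constant of the compiled loop body, so every append stores a reference to the *same* int object, while `long(100e6)` manufactures a fresh object on each iteration. sys.getsizeof of either list only measures the pointer array, so it cannot see that difference. A Python 3 sketch of the effect, with a hypothetical `deep_size` helper that counts each distinct element once:

```python
import sys

def deep_size(lst):
    """Shallow list size plus each *distinct* element object, counted once."""
    seen = set()
    total = sys.getsizeof(lst)
    for x in lst:
        if id(x) not in seen:
            seen.add(id(x))
            total += sys.getsizeof(x)
    return total

n = 100_000
one = 10**8
shared = [one] * n                      # n references to a single int object
distinct = [one + 0 for _ in range(n)]  # a fresh but equal int per element

print(deep_size(shared), deep_size(distinct))
```

The two lists compare equal, yet the `distinct` one is far larger in memory. That del-ing only returns part of the memory to the OS is, I believe, ordinary allocator behavior (pymalloc keeps freed arenas around for reuse) rather than a leak.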

-- 
Francesc Alted

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
