On 08/07/2010 02:45 AM, dmtr wrote:
> I'm running into some performance / memory bottlenecks on large lists.
> Is there any easy way to minimize/optimize memory usage?
>
> Simple str() and unicode() objects [Python 2.6.4/Linux/x86]:
> >>> sys.getsizeof('')      24 bytes
> >>> sys.getsizeof('0')     25 bytes
> >>> sys.getsizeof(u'')     28 bytes
> >>> sys.getsizeof(u'0')    32 bytes
>
> Lists of str() and unicode() objects (see ref. code below):
> >>> [str(i) for i in xrange(0, 10000000)]      370 MB (37 bytes/item)
> >>> [unicode(i) for i in xrange(0, 10000000)]  613 MB (63 bytes/item)
>
> Well... 63 bytes per item for very short unicode strings... Is there
> any way to do better than that? Perhaps some compact unicode objects?
There is a certain price you pay for having full-featured Python
objects. What are you trying to accomplish anyway? Maybe the array
module can be of some help. Or numpy? (Rough sketch after your quoted
code below.)

>
> --
> Regards, Dmitry
>
> ----
> import os, time, re
> start = time.time()
> l = [unicode(i) for i in xrange(0, 10000000)]
> dt = time.time() - start
> vm = re.findall("(VmPeak.*|VmSize.*)",
>                 open('/proc/%d/status' % os.getpid()).read())
> print "%d keys, %s, %f seconds, %f keys per second" % (len(l), vm, dt,
>                                                        len(l) / dt)
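A minimal sketch (Python 2, to match the interpreter in the thread) of
what the array/numpy idea could look like for your benchmark. The sizes
in the comments are rough expectations for a typical CPython build, not
measurements:

import array
import numpy

N = 10000000

# array.array stores raw machine integers contiguously: 4 bytes per
# item with the 'I' (unsigned int) typecode, versus ~63 bytes per item
# for a list of short unicode objects.
compact = array.array('I', xrange(0, N))

# numpy holds the same data just as compactly and adds vectorized
# operations on top.
compact_np = numpy.arange(N, dtype=numpy.uint32)

# Build a unicode string only at the moment one is actually needed:
print unicode(compact[12345])     # u'12345'
print unicode(compact_np[12345])  # u'12345'

This only helps if the values really are integers underneath, as in
your benchmark; arbitrary short strings would need a different layout.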