I'm running into performance and memory bottlenecks with large lists. Is there an easy way to minimize or optimize their memory usage?
Simple str() and unicode() objects [Python 2.6.4/Linux/x86]:

>>> sys.getsizeof('')
24 bytes
>>> sys.getsizeof('0')
25 bytes
>>> sys.getsizeof(u'')
28 bytes
>>> sys.getsizeof(u'0')
32 bytes

Lists of str() and unicode() objects (see reference code below):

>>> [str(i) for i in xrange(0, 10000000)]
370 MB (37 bytes/item)
>>> [unicode(i) for i in xrange(0, 10000000)]
613 MB (63 bytes/item)

Well... 63 bytes per item for very short unicode strings... Is there any way to do better than that? Perhaps some compact unicode objects?

--
Regards,
Dmitry

----

import os, time, re

# Build the list and time it.
start = time.time()
l = [unicode(i) for i in xrange(0, 10000000)]
dt = time.time() - start

# Read peak/current memory for this process from /proc.
vm = re.findall("(VmPeak.*|VmSize.*)",
                open('/proc/%d/status' % os.getpid()).read())
print "%d keys, %s, %f seconds, %f keys per second" % (
    len(l), vm, dt, len(l) / dt)
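For anyone hitting the same wall, here is a minimal sketch of one workaround, under the assumption that the list is built once and then mostly read: pack all the strings into a single UTF-8 byte buffer plus an array of offsets, and only materialize a unicode object on access. The CompactUnicodeList name and its read-only design are illustrative assumptions, not an existing library API.

from array import array

class CompactUnicodeList(object):
    """Read-only sequence of unicode strings packed into one buffer."""
    def __init__(self, iterable):
        chunks = []
        self._offsets = array('L', [0])  # cumulative byte offsets
        pos = 0
        for u in iterable:
            b = u.encode('utf-8')
            chunks.append(b)
            pos += len(b)
            self._offsets.append(pos)
        # One big str: no per-item object header, just the raw bytes.
        self._buf = ''.join(chunks)

    def __len__(self):
        return len(self._offsets) - 1

    def __getitem__(self, i):
        if i < 0:
            i += len(self)
        start, end = self._offsets[i], self._offsets[i + 1]
        return self._buf[start:end].decode('utf-8')

# Usage: memory is roughly the raw UTF-8 bytes plus 4 (or 8) bytes of
# offset per item, instead of ~63 bytes per unicode object.
compact = CompactUnicodeList(unicode(i) for i in xrange(0, 10000000))
print compact[12345]  # -> 12345

The trade-off is access cost: every __getitem__ pays for a slice and a decode and allocates a fresh unicode object, so this only helps when items are stored far more often than they are read.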