Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
On Aug 6, 10:56 pm, Michael Torrie torr...@gmail.com wrote: On 08/06/2010 07:56 PM, dmtr wrote: Ultimately a dict that can store ~20,000,000 entries: (u'short string' : (int, int, int, int, int, int, int)). I think you really need a real database engine.  With the proper indexes, MySQL

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread garabik-news-2005-05
dmtr dchich...@gmail.com wrote: What I'm really looking for is a dict() that maps short unicode strings into tuples with integers. But just having a *compact* list container for unicode strings would help a lot (because I could add a __dict__ and go from it). At this point, I'd suggest to
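
For scale, a minimal sketch of the structure under discussion and a rough per-entry accounting with sys.getsizeof (figures come from a Python 2.6-era CPython and vary with 32/64-bit and UCS-2/UCS-4 builds; dict hash-table overhead is not counted):

    import sys

    d = {}
    d[u'short string'] = (1, 2, 3, 4, 5, 6, 7)

    key = u'short string'
    value = d[key]
    per_entry = (sys.getsizeof(key) + sys.getsizeof(value)
                 + sum(sys.getsizeof(i) for i in value))
    # Payload objects only; dict hash-table slots come on top, and CPython
    # caches small ints, so repeated small values cost less than this sum.
    print per_entry, 'bytes per entry (rough upper bound for the payload)'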

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread Peter Otten
dmtr wrote: Well... 63 bytes per item for very short unicode strings... Is there any way to do better than that? Perhaps some compact unicode objects? There is a certain price you pay for having full-feature Python objects. Are there any *compact* Python objects? Optimized for

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
On Aug 6, 11:50 pm, Peter Otten __pete...@web.de wrote: I don't know to what extent it still applies but switching off cyclic garbage collection with import gc gc.disable() Haven't tried it on the real dataset. On the synthetic test it (and sys.setcheckinterval(10)) gave ~2% speedup and
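
A minimal sketch of the tweak being measured, disabling cyclic GC and adjusting the check interval around a bulk load (the interval value below is illustrative; the ~2% figure is dmtr's, not reproduced here):

    import gc
    import sys
    import time

    gc.disable()                  # no cyclic-GC passes during the bulk load
    sys.setcheckinterval(1000)    # illustrative; fewer interpreter checkpoints

    d = {}
    start = time.time()
    for i in xrange(1000000):
        d[unicode(i)] = (i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6)
    print '%.2f seconds' % (time.time() - start)

    gc.enable()                   # restore normal behaviour afterwards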

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
Correction. I've copy-pasted it wrong! array.array('i', (i, i+1, i+2, i+3, i+4, i+5, i+6)) was the best. for i in xrange(0, 1000000): d[unicode(i)] = (i, i+1, i+2, i+3, i+4, i+5, i+6) 1000000 keys, ['VmPeak:\t 224704 kB', 'VmSize:\t 224704 kB'], 4.079240 seconds, 245143.698209 keys per
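
A reconstruction of what that benchmark loop looks like with array.array('i', ...) values (the vm() helper reading /proc/self/status is an assumption about how the VmPeak/VmSize figures were obtained, and is Linux-only):

    import array
    import time

    def vm():
        # Linux-only: report the VmPeak/VmSize lines from /proc/self/status
        return [line.strip() for line in open('/proc/self/status')
                if line.startswith(('VmPeak', 'VmSize'))]

    d = {}
    start = time.time()
    for i in xrange(0, 1000000):
        d[unicode(i)] = array.array('i', (i, i+1, i+2, i+3, i+4, i+5, i+6))
    elapsed = time.time() - start
    print len(d), 'keys,', vm(), '%f seconds, %f keys per second' % (
        elapsed, len(d) / elapsed)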

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread Peter Otten
dmtr wrote: On Aug 6, 11:50 pm, Peter Otten __pete...@web.de wrote: I don't know to what extent it still applies but switching off cyclic garbage collection with import gc gc.disable() Haven't tried it on the real dataset. On the synthetic test it (and sys.setcheckinterval(10))

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
Looking at your benchmark, random.choice(letters) has probably less overhead than letters[random.randint(...)]. You might even try to inline it as Right... random.choice()... I'm a bit new to python, always something to learn. But anyway in that benchmark (from http://bugs.python.org/issue9520
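
For reference, the two key-generation styles being compared, sketched with timeit (the letters pool and key length are illustrative, not taken from the benchmark in the bug report):

    import random
    import string
    import timeit

    letters = string.ascii_lowercase    # illustrative alphabet

    def key_via_choice(n=7):
        return u''.join(random.choice(letters) for _ in xrange(n))

    def key_via_index(n=7):
        return u''.join(letters[random.randint(0, len(letters) - 1)]
                        for _ in xrange(n))

    # timeit accepts callables on Python 2.6+
    print timeit.timeit(key_via_choice, number=100000)
    print timeit.timeit(key_via_index, number=100000)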

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread dmtr
I guess with the actual dataset I'll be able to improve the memory usage a bit, with BioPython::trie. That would probably be enough optimization to continue working with some comfort. On this test code BioPython::trie gives a bit of improvement in terms of memory. Not much though... d = dict()
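
A hedged sketch of the Biopython trie approach, assuming the Bio.trie module that ships with Biopython; its trie() object is used like a dict below, and keys are encoded to byte strings since the installed version may not accept unicode keys directly:

    from Bio import trie    # Biopython; check the API against your version

    d = trie.trie()
    for i in xrange(1000000):
        key = unicode(i).encode('utf-8')   # byte-string keys, to be safe
        d[key] = (i, i + 1, i + 2, i + 3, i + 4, i + 5, i + 6)
    print len(d.keys()), 'keys stored'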

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-07 Thread Nobody
On Fri, 06 Aug 2010 18:39:27 -0700, dmtr wrote: Steven, thank you for answering. See my comments inline. Perhaps I should have formulated my question a bit differently: Are there any *compact* high performance containers for unicode()/str() objects in Python? By *compact* I don't mean

Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread dmtr
I'm running into some performance / memory bottlenecks on large lists. Is there any easy way to minimize/optimize memory usage? Simple str() and unicode() objects [Python 2.6.4/Linux/x86]: sys.getsizeof('') 24 bytes, sys.getsizeof('0') 25 bytes, sys.getsizeof(u'') 28 bytes
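
The measurement itself takes only a few lines; exact byte counts depend on the interpreter build (32- vs 64-bit, UCS-2 vs UCS-4):

    import sys

    for obj in ('', '0', u'', u'0'):
        print repr(obj), sys.getsizeof(obj), 'bytes'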

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread Steven D'Aprano
On Fri, 06 Aug 2010 17:45:31 -0700, dmtr wrote: I'm running into some performance / memory bottlenecks on large lists. Is there any easy way to minimize/optimize memory usage? Yes, lots of ways. For example, do you *need* large lists? Often a better design is to use generators and iterators
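
A sketch of the kind of rewrite being suggested, streaming with a generator instead of materialising a list (the file name and the parsing are illustrative, not from the thread):

    # Instead of materialising everything:
    #   records = [line.split('\t') for line in open('data.txt')]
    # stream one record at a time:
    def records(path):
        for line in open(path):
            yield line.rstrip('\n').split('\t')

    total = sum(1 for rec in records('data.txt'))   # O(1) extra memory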

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread Thomas Jollans
On 08/07/2010 02:45 AM, dmtr wrote: I'm running into some performance / memory bottlenecks on large lists. Is there any easy way to minimize/optimize memory usage? Simple str() and unicode() objects [Python 2.6.4/Linux/x86]: sys.getsizeof('') 24 bytes, sys.getsizeof('0') 25 bytes

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread dmtr
Steven, thank you for answering. See my comments inline. Perhaps I should have formulated my question a bit differently: Are there any *compact* high performance containers for unicode()/str() objects in Python? By *compact* I don't mean compression. Just optimized for memory usage, rather than

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread dmtr
Well...  63 bytes per item for very short unicode strings... Is there any way to do better than that? Perhaps some compact unicode objects? There is a certain price you pay for having full-feature Python objects. Are there any *compact* Python objects? Optimized for compactness? What are

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread Chris Rebert
On Fri, Aug 6, 2010 at 6:39 PM, dmtr dchich...@gmail.com wrote: snip Well...  63 bytes per item for very short unicode strings... Is there any way to do better than that? Perhaps some compact unicode objects? If you think that unicode objects are going to be *smaller* than byte strings, I
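
A quick way to see the point: on Python 2 a unicode object is never smaller than the corresponding byte string, since each character costs 2 or 4 bytes (UCS-2 or UCS-4 build) versus 1 for str:

    import sys

    s = 'short string'
    u = u'short string'
    print sys.getsizeof(s), sys.getsizeof(u)    # str is the smaller of the two
    print sys.getsizeof(u.encode('utf-8'))      # encoding back to bytes shrinks it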

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread Christian Heimes
I'm running into some performance / memory bottlenecks on large lists. Is there any easy way to minimize/optimize memory usage? Simple str() and unicode() objects [Python 2.6.4/Linux/x86]: sys.getsizeof('') 24 bytes, sys.getsizeof('0') 25 bytes, sys.getsizeof(u'') 28 bytes

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread Neil Hodgson
dmtr: What I'm really looking for is a dict() that maps short unicode strings into tuples with integers. But just having a *compact* list container for unicode strings would help a lot (because I could add a __dict__ and go from it). Add them all into one string or array and use indexes
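
A minimal sketch of that packing idea: all key text in one unicode blob, offsets and values in integer arrays, lookup by binary search over sorted keys (all names here are illustrative, not from the thread):

    import array

    blob_parts = []
    offsets = array.array('l', [0])   # key k lives at blob[offsets[k]:offsets[k+1]]
    values = array.array('i')         # 7 ints per key, stored flat

    def add_sorted(key, vals):
        # keys must be appended in sorted order for the lookup below to work
        blob_parts.append(key)
        offsets.append(offsets[-1] + len(key))
        values.extend(vals)

    def lookup(blob, key):
        # binary search over the packed, sorted keys
        lo, hi = 0, len(offsets) - 2
        while lo <= hi:
            mid = (lo + hi) // 2
            k = blob[offsets[mid]:offsets[mid + 1]]
            if k < key:
                lo = mid + 1
            elif k > key:
                hi = mid - 1
            else:
                return tuple(values[mid * 7:mid * 7 + 7])
        return None

    # usage sketch:
    #   for key in sorted_keys: add_sorted(key, seven_ints_for(key))
    #   blob = u''.join(blob_parts)
    #   lookup(blob, u'short string')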

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread Carl Banks
On Aug 6, 6:56 pm, dmtr dchich...@gmail.com wrote: Well...  63 bytes per item for very short unicode strings... Is there any way to do better than that? Perhaps some compact unicode objects? There is a certain price you pay for having full-feature Python objects. Are there any *compact*

Re: Is there any way to minimize str()/unicode() objects memory usage [Python 2.6.4] ?

2010-08-06 Thread Michael Torrie
On 08/06/2010 07:56 PM, dmtr wrote: Ultimately a dict that can store ~20,000,000 entries: (u'short string' : (int, int, int, int, int, int, int)). I think you really need a real database engine. With the proper indexes, MySQL could be very fast storing and retrieving this information for you.
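
The thread suggests MySQL; purely to show the shape of the database approach, here is a hedged sketch using the standard-library sqlite3 module (no server needed; the file and table names are illustrative, and a real MySQL setup would go through a driver such as MySQLdb instead):

    import sqlite3

    conn = sqlite3.connect('strings.db')
    conn.execute('CREATE TABLE IF NOT EXISTS items ('
                 ' key TEXT PRIMARY KEY,'
                 ' v0 INTEGER, v1 INTEGER, v2 INTEGER, v3 INTEGER,'
                 ' v4 INTEGER, v5 INTEGER, v6 INTEGER)')

    rows = ((unicode(i), i, i+1, i+2, i+3, i+4, i+5, i+6)
            for i in xrange(1000000))
    conn.executemany('INSERT OR REPLACE INTO items VALUES (?,?,?,?,?,?,?,?)',
                     rows)
    conn.commit()

    # PRIMARY KEY gives an index, so point lookups stay fast
    print conn.execute('SELECT * FROM items WHERE key = ?',
                       (u'42',)).fetchone()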