On Aug 6, 10:56 pm, Michael Torrie torr...@gmail.com wrote:
> On 08/06/2010 07:56 PM, dmtr wrote:
>> Ultimately a dict that can store ~20,000,000 entries: (u'short
>> string' : (int, int, int, int, int, int, int)).
> I think you really need a real database engine. With the proper
> indexes, MySQL could be very fast storing and retrieving this
> information for you.
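A minimal sketch of that approach, using the stdlib sqlite3 module in place of MySQL (the table name and column layout here are illustrative assumptions, not anything from the thread):

import sqlite3

# Sketch: push the ~20M-entry mapping into an on-disk database instead
# of an in-memory dict; schema and names are made up for illustration.
conn = sqlite3.connect('entries.db')
conn.execute("""CREATE TABLE IF NOT EXISTS entries (
                  key TEXT PRIMARY KEY,
                  a INTEGER, b INTEGER, c INTEGER, d INTEGER,
                  e INTEGER, f INTEGER, g INTEGER)""")

def put(key, seven_ints):
    conn.execute("INSERT OR REPLACE INTO entries VALUES (?,?,?,?,?,?,?,?)",
                 (key,) + tuple(seven_ints))

def get(key):
    return conn.execute("SELECT a,b,c,d,e,f,g FROM entries WHERE key = ?",
                        (key,)).fetchone()   # None if the key is absent

put(u'short string', (1, 2, 3, 4, 5, 6, 7))
print get(u'short string')
conn.commit()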
dmtr dchich...@gmail.com wrote:
> What I'm really looking for is a dict() that maps short unicode
> strings into tuples of integers. But just having a *compact* list
> container for unicode strings would help a lot (because I could add a
> __dict__ and go from there).
At this point, I'd suggest to [...]
dmtr wrote:
>> Well... 63 bytes per item for very short unicode strings... Is there
>> any way to do better than that? Perhaps some compact unicode objects?
> There is a certain price you pay for having full-featured Python objects.
Are there any *compact* Python objects? Optimized for compactness? What are [...]
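Two generic CPython space-saving tricks that bear on this question (general techniques, not something proposed in the thread): keep short strings as encoded byte strings and decode on access, and give record objects __slots__ so they carry no per-instance __dict__. A rough Python 2 sketch:

import sys

# Byte strings cost less per object than unicode on this build.
u = u'short string'
b = u.encode('utf-8')
print sys.getsizeof(u), sys.getsizeof(b)

class Slotted(object):
    __slots__ = ('a', 'b')        # no per-instance __dict__
    def __init__(self, a, b):
        self.a = a
        self.b = b

class Plain(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

# Note: getsizeof() does not count Plain's separate __dict__, which is
# where most of its extra cost actually lives.
print sys.getsizeof(Slotted(1, 2)), sys.getsizeof(Plain(1, 2))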
On Aug 6, 11:50 pm, Peter Otten __pete...@web.de wrote:
> I don't know to what extent it still applies, but switching off cyclic
> garbage collection with
>
>   import gc
>   gc.disable()
Haven't tried it on the real dataset. On the synthetic test it (and
sys.setcheckinterval(10)) gave ~2% speedup and [...]

Correction. I've copy-pasted it wrong! array.array('i', (i, i+1, i+2,
i+3, i+4, i+5, i+6)) was the best.

for i in xrange(0, 1000000):
    d[unicode(i)] = (i, i+1, i+2, i+3, i+4, i+5, i+6)

1000000 keys, ['VmPeak:\t 224704 kB', 'VmSize:\t 224704 kB'],
4.079240 seconds, 245143.698209 keys per second
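(The figures are self-consistent: 4.079240 s x 245143.698209 keys/s is one million keys, which is why the loop above is shown with xrange(0, 1000000).) For reference, a self-contained sketch of a benchmark with that shape; reading VmPeak/VmSize from /proc/self/status is Linux-specific, and the helper name is my own:

import gc
import time

def memory_usage():
    # Linux-only: pull the VmPeak/VmSize lines out of /proc/self/status.
    return [line.strip() for line in open('/proc/self/status')
            if line.startswith(('VmPeak', 'VmSize'))]

gc.disable()                      # per Peter Otten's suggestion
d = dict()
start = time.time()
for i in xrange(0, 1000000):
    d[unicode(i)] = (i, i+1, i+2, i+3, i+4, i+5, i+6)
elapsed = time.time() - start
print '%d keys, %s, %f seconds, %f keys per second' % (
    len(d), memory_usage(), elapsed, len(d) / elapsed)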
dmtr wrote:
> On Aug 6, 11:50 pm, Peter Otten __pete...@web.de wrote:
>> I don't know to what extent it still applies, but switching off cyclic
>> garbage collection with
>>
>>   import gc
>>   gc.disable()
> Haven't tried it on the real dataset. On the synthetic test it (and
> sys.setcheckinterval(10)) [...]
Looking at your benchmark, random.choice(letters) has probably less
overhead than letters[random.randint(...)]. You might even try to
inline it as [...]
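Peter's claim is easy to test with timeit (an illustrative micro-benchmark, not code from the thread):

import timeit

# Compare the two ways of picking a random letter.
setup = "import random, string; letters = string.ascii_lowercase"
print timeit.timeit("letters[random.randint(0, len(letters) - 1)]",
                    setup=setup, number=1000000)
print timeit.timeit("random.choice(letters)", setup=setup, number=1000000)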
Right... random.choice()... I'm a bit new to Python, always something
to learn. But anyway, in that benchmark (from
http://bugs.python.org/issue9520) [...] I guess with the actual dataset
I'll be able to improve the memory usage a bit, with BioPython::trie.
That would probably be enough optimization to continue working with
some comfort. On this test code BioPython::trie gives a bit of
improvement in terms of memory. Not much though...

d = dict() [...]
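For comparison, a sketch of what the trie variant might have looked like. This assumes the old Bio.trie module (a dict-like C trie that shipped with Biopython in that era; it has since been removed from the library), so treat the API as an assumption:

from Bio import trie   # old Biopython module; gone in current releases

t = trie.trie()        # dict-like interface, byte-string keys
for i in xrange(0, 1000000):
    t[str(i)] = (i, i+1, i+2, i+3, i+4, i+5, i+6)
print t['42']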
On Fri, 06 Aug 2010 18:39:27 -0700, dmtr wrote:
> Steven, thank you for answering. See my comments inline. Perhaps I
> should have formulated my question a bit differently: Are there any
> *compact* high-performance containers for unicode()/str() objects in
> Python? By *compact* I don't mean compression. Just optimized for
> memory usage, rather than performance.
>> I'm running into some performance / memory bottlenecks on large lists.
>> Is there any easy way to minimize/optimize memory usage?
>> Simple str() and unicode() objects [Python 2.6.4/Linux/x86]:
>>   sys.getsizeof('')   24 bytes
>>   sys.getsizeof('0')  25 bytes
>>   sys.getsizeof(u'')  28 bytes
On Fri, 06 Aug 2010 17:45:31 -0700, dmtr wrote:
> I'm running into some performance / memory bottlenecks on large lists.
> Is there any easy way to minimize/optimize memory usage?
Yes, lots of ways. For example, do you *need* large lists? Often a better
design is to use generators and iterators [...]
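A generic illustration of Steven's point (not his code): a generator keeps one item alive at a time where a list would hold all of them:

def numbers():
    # Lazily yields values; nothing is retained after each is consumed.
    for i in xrange(10000000):
        yield unicode(i)

total = 0
for s in numbers():    # no 10,000,000-element list is ever built
    total += len(s)
print total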
On Fri, Aug 6, 2010 at 6:39 PM, dmtr dchich...@gmail.com wrote:
[snip]
> Well... 63 bytes per item for very short unicode strings... Is there
> any way to do better than that? Perhaps some compact unicode objects?
If you think that unicode objects are going to be *smaller* than byte
strings, I [...]
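The per-object sizes are easy to check; the 24/25/28-byte figures quoted in this thread are from a 32-bit Python 2.6 build, and exact values vary by build:

import sys

# Unicode objects carry more per-object overhead than byte strings.
for obj in ['', '0', u'', u'0']:
    print repr(obj), sys.getsizeof(obj), 'bytes'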
I'm running into some performance / memory bottlenecks on large lists.
Is there any easy way to minimize/optimize memory usage?
Simple str() and unicode() objects [Python 2.6.4/Linux/x86]:
  sys.getsizeof('')   24 bytes
  sys.getsizeof('0')  25 bytes
  sys.getsizeof(u'')  28 bytes
dmtr:
> What I'm really looking for is a dict() that maps short unicode
> strings into tuples of integers. But just having a *compact* list
> container for unicode strings would help a lot (because I could add a
> __dict__ and go from there).
Add them all into one string or array and use indexes [...]
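A sketch of bearophile's idea (the names and layout are mine): keep all seven-int records in one flat array.array and map each key to a slot index, so each value costs seven machine ints instead of a tuple of boxed Python ints:

from array import array

values = array('i')    # all records concatenated, 7 ints per slot
index = {}             # key -> slot number

def put(key, seven_ints):
    # Assumes each key is inserted once; re-inserting would leak a slot.
    index[key] = len(values) // 7
    values.extend(seven_ints)

def get(key):
    i = index[key] * 7
    return tuple(values[i:i+7])

put(u'short string', (1, 2, 3, 4, 5, 6, 7))
print get(u'short string')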