#15367: Empty lists while creating parents
---------------------------+-------------------------
Reporter: roed | Owner:
Type: defect | Status: new
Priority: major | Milestone: sage-5.13
Component: memleak | Resolution:
Keywords: | Merged in:
Authors: | Reviewers:
Report Upstream: N/A | Work issues:
Branch: | Commit:
Dependencies: | Stopgaps:
---------------------------+-------------------------
Comment (by nbruin):
Memory footprint:
{{{
sage: from sage.structure.coerce_dict import MonoDict
sage: memus= get_memory_usage();
sage: L=[MonoDict(23) for i in xrange(2*10^5)]
sage: print get_memory_usage(memus);
430.5078125
sage: memus= get_memory_usage();
sage: L2=[MonoDict(53) for i in xrange(2*10^5)]
sage: print get_memory_usage(memus);
908.0078125
sage: memus= get_memory_usage();
sage: L3=[sage.structure.coerce_dict.TripleDict(23) for i in
xrange(2*10^5)]
sage: print get_memory_usage(memus);
430.5546875
sage: L4=[sage.structure.coerce_dict.TripleDict(11) for i in
xrange(2*10^5)]
sage: print get_memory_usage(memus);
248.15234375
}}}
So we see that if we assume that `1MB=10^6B`, then the memory footprint of
empty !TripleDict and !MonoDict is about 112 to 93 to 86 bytes per
initialized bucket (for empty dicts they should have about the same memory
footprint. This would not be true for an open-coded hash table). So
scaling back all these dictionaries to starting size 11 is probably quite
reasonable: That allows 6 entries in them before resizing would be
triggered.
It also makes a difference in initialization time:
{{{
sage: from sage.structure.coerce_dict import MonoDict, TripleDict
sage: %time _=[sage.structure.coerce_dict.TripleDict(11) for i in
xrange(2*10^5)]
CPU times: user 1.30 s, sys: 0.07 s, total: 1.37 s
Wall time: 1.37 s
sage: %time _=[sage.structure.coerce_dict.TripleDict(23) for i in
xrange(2*10^5)]
CPU times: user 2.14 s, sys: 0.11 s, total: 2.25 s
Wall time: 2.26 s
sage: %time _=[sage.structure.coerce_dict.TripleDict(53) for i in
xrange(2*10^5)]
CPU times: user 3.33 s, sys: 0.16 s, total: 3.49 s
Wall time: 3.50 s
}}}
As you can see, allocating the empty lists is a pretty significant cost as
well. For an open-coded hash table we would expect on a 64-bit machine: 16
bytes per entry in !MonoDict and 32 bytes per entry in !TripleDict
(compared to 24 bytes for a python dict)
Furthermore, allocation time would be fairly independent of size: the
store would simply be a slab of memory that can be initialized to null by
a straight memory-zeroing call.
So I'm pretty sure we could make !MonoDict and !TripleDict quite a bit
faster and more memory-efficient by basically just copying the open hash-
table design for python's dict. We'd be working with table sizes that are
a power of two, so we'd probably want to shift the addresses that we use
as keys (given that the bottom 3 bits probably carry no information)
It's quite a bit of work to get right, so we should probably first
establish that this is actually a bottleneck somewhere. I don't think the
runtime characteristics of the implementations would be hugely different
(but I cannot exclude something like a factor of 2 in runtime)
--
Ticket URL: <http://trac.sagemath.org/ticket/15367#comment:4>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/groups/opt_out.