#11115: Rewrite cached_method in Cython
---------------------------+------------------------------------------------
Reporter: SimonKing | Owner: jason
Type: enhancement | Status: needs_review
Priority: major | Milestone: sage-4.7
Component: misc | Keywords: category cython cache
Work_issues: | Upstream: N/A
Reviewer: | Author: Simon King
Merged: | Dependencies:
---------------------------+------------------------------------------------
Comment(by nbruin):
Responding to http://groups.google.com/group/sage-devel/msg/0aa5aca883603f64
Most machines have a limited resource (the CPU cache), which means that, in
general, applications with a smaller memory footprint tend to run a bit
faster. Given that, I have a slight preference for option (3), i.e.,
keeping elements small.
These things are notoriously difficult to quantify and are very
processor-dependent: good implementations of linear algebra libraries have
to be tuned to be cache-friendly on each architecture.
The general principle is the following: general memory access is
relatively slow, so CPUs have a cache (nowadays several levels) of fast
memory, so that repeated access to the same memory locations doesn't have
to go all the way to main memory and can be served faster.
This works because memory usage tends to be temporally and spatially
local (temporal: the same memory locations tend to be accessed repeatedly
before the program moves on; spatial: programs tend to access memory
locations that are close together).
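To make the locality point concrete, here is a minimal pure-Python sketch
(the sizes and names are my own illustrative assumptions, not from this
ticket) that sums the same list once sequentially and once with a large
stride. Both walks touch every element exactly once, but the strided walk
jumps around in memory; on most machines it comes out somewhat slower,
although CPython's interpreter overhead mutes the effect considerably
compared to C or Cython.

```python
import timeit

# Sketch: the same work done with good vs. poor spatial locality.
# Sizes are illustrative assumptions, not measurements from the ticket.
N = 1 << 20                      # ~8 MB of pointers, larger than typical caches
STRIDE = 4096                    # jump far between consecutive accesses
data = [float(i) for i in range(N)]

def sequential():
    # Walk the list front to back: consecutive accesses hit the
    # same or adjacent cache lines.
    total = 0.0
    for i in range(N):
        total += data[i]
    return total

def strided():
    # Visit exactly the same N elements, but in an order that jumps
    # STRIDE entries at a time, defeating spatial locality.
    total = 0.0
    for start in range(STRIDE):
        for i in range(start, N, STRIDE):
            total += data[i]
    return total

assert sequential() == strided()  # identical work, different access order
t_seq = timeit.timeit(sequential, number=3)
t_str = timeit.timeit(strided, number=3)
print(f"sequential: {t_seq:.3f}s  strided: {t_str:.3f}s")
```

The point is only the access order; a compiled version of the same loops
would show the gap much more clearly.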
Making elements bigger hurts on both counts. If a CPU has enough cache to
store N words and an element occupies s words, then a program's inner loop
runs quickly as long as it refers to at most N/s elements. As soon as it
refers to more, the cache starts thrashing and you'll notice poorer
performance. As you see, small s is better; your proposal would change s
to s+1 ...
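A back-of-envelope calculation makes the N/s argument tangible (the cache
size and element sizes here are assumptions for illustration; the ticket
does not name any):

```python
# Illustrative numbers only: assume a 32 KiB L1 data cache and 8-byte words.
CACHE_BYTES = 32 * 1024
WORD_BYTES = 8
N = CACHE_BYTES // WORD_BYTES    # N = 4096 words of cache

for s in (4, 5):                 # element size before/after adding one word
    print(f"s = {s} words: roughly N/s = {N // s} elements fit in cache")
# s = 4 -> 1024 elements; s = 5 -> 819 elements, a ~20% smaller hot set.
```

So adding even one word per element shrinks the number of elements an
inner loop can keep hot by a fifth, under these assumed numbers.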
The spatial component comes in because caching rarely happens at the
resolution of single words (because of the overhead of tracking which
memory locations are cached); whole cache lines are fetched at once.
That's why it's good to have related data stored in consecutive memory.
Increasing s dilutes that benefit too (apart from the fact that Python
does not give us control over where allocations land anyway, but we can
hope Python does this sanely).
Again, this is notoriously hard to quantify, but if you fix the problem
and the architecture, one can usually observe drops in performance as the
data size increases, due to the different levels of cache starting to
thrash. So, in general, small elements are better, both for memory use and
for performance.
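Those size-dependent drops can be probed with a sketch like the following
(pure Python, so interpreter overhead dominates and the steps are far less
crisp than they would be in C or Cython): do a fixed number of reads over
working sets of increasing size and compare the times. The sizes and the
walk formula are my own illustrative choices.

```python
import timeit
from array import array

TOTAL_READS = 1 << 20            # same amount of work for every working set

def probe(size):
    # Pseudo-random walk within a power-of-two working set of doubles.
    buf = array("d", range(size))
    mask = size - 1
    def walk():
        total, i = 0.0, 0
        for _ in range(TOTAL_READS):
            total += buf[i]
            i = (i * 16807 + 1) & mask   # stays inside the working set
        return total
    return timeit.timeit(walk, number=1)

# 32 KB, 512 KB, and 8 MB of doubles: roughly L1-, L2-, and RAM-sized.
times = {size: probe(size) for size in (1 << 12, 1 << 16, 1 << 20)}
for size, t in times.items():
    print(f"{size * 8 // 1024:6d} KB working set: {t:.3f}s")
```

On a compiled version one typically sees the time per read step up as the
working set outgrows each cache level; in CPython the trend is there but
noisy.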
--
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/11115#comment:34>
Sage <http://www.sagemath.org>