#6484: sage.combinat.ranker improvements
-------------------------------------+-------------------------------------
Reporter: nthiery | Owner: nthiery
Type: enhancement | Status: needs_review
Priority: major | Milestone: sage-6.4
Component: combinatorics | Resolution:
Keywords: rank, unrank | Merged in:
Authors: Nicolas M. ThiƩry | Reviewers:
Report Upstream: N/A | Work issues:
Branch: | Commit:
u/nthiery/sage_combinat_ranker_improvements|
220cf7c58941afb5e775842da73e2291a800e0d9
Dependencies: | Stopgaps:
-------------------------------------+-------------------------------------
Comment (by nthiery):
Ah, good point; I thought our cached functions had a special case for
one parameter functions. Other than that, using a cached function was
mostly a (failed) attempt at having something picklable as well.
So, let's see about speed with various implementations:
{{{
sage: l = range(100);
sage: d = { x:i for i,x in enumerate(l) }
sage: @cached_function
....: def rank_cached(): pass
sage: for i,x in enumerate(l):
....: rank_cached.set_cache(i, x)
sage: def rank_dict(x):
....: return d[x]
sage: rank_dict_getitem = d.__getitem__
sage: cython("""
l = range(100);
d = { x:i for i,x in enumerate(l) }
cpdef int rank_dict_cython(x):
return d[x]
""")
sage: cython("""
l = range(100);
d = { x:i for i,x in enumerate(l) }
cpdef int rank_dict_cython_exception(x):
try:
return d[x]
except KeyError:
raise ValueError("%s is not blah blah blah"%x)
""")
sage: cython("""
cdef class RankFromList(dict):
def __call__(self, x):
try:
return self[x]
except KeyError:
raise ValueError("%s is not blah blah blah"%x)
""")
sage: rank_dict_cython_class = RankFromList((x,i) for i,x in enumerate(l))
}}}
Here are the timings, in decreasing order:
{{{
sage: sage: %timeit rank_cached(50)
The slowest run took 41.96 times longer than the fastest. This could mean
that an intermediate result is being cached
1000000 loops, best of 3: 381 ns per loop
sage: sage: %timeit rank_dict(50)
The slowest run took 15.56 times longer than the fastest. This could mean
that an intermediate result is being cached
1000000 loops, best of 3: 245 ns per loop
sage: sage: %timeit rank_dict_cython_class(50)
The slowest run took 22.12 times longer than the fastest. This could mean
that an intermediate result is being cached
1000000 loops, best of 3: 226 ns per loop
sage: sage: %timeit rank_dict_cython_exception(50)
The slowest run took 36.63 times longer than the fastest. This could mean
that an intermediate result is being cached
1000000 loops, best of 3: 195 ns per loop
sage: sage: %timeit rank_dict_cython(50)
The slowest run took 31.18 times longer than the fastest. This could mean
that an intermediate result is being cached
1000000 loops, best of 3: 191 ns per loop
sage: sage: %timeit rank_dict_getitem(50)
The slowest run took 17.96 times longer than the fastest. This could mean
that an intermediate result is being cached
1000000 loops, best of 3: 173 ns per loop
}}}
Those timing seem to be consistent from one call to the other. Most of
the time this feature is used with non trivial objects where the
bottleneck is the computation of hash/equality of the objects. So we
don't necessarily need to optimize to the last bit.
In view of the above, the fastest is to return the `__getitem__`
method of the dictionary. The main caveat is that we don't have
control on the exception (a `KeyError` instead of a
`ValueError`). Maybe it is not so bad since a `KeyError` is a special
kind of `ValueError`.
The most flexible is to use a class. In particular, this would make
the ranker picklable (this can be useful: I have had cases where an
object would not pickle properly because one of it's attribute was a
ranker). This also opens the door for reintroducing laziness later
on. It's 30% slower than `getitem` which could be acceptable (I was in
fact expecting a smaller overhead; maybe I missed something in the
cythonization).
What do you think?
Cheers,
Nicolas
--
Ticket URL: <http://trac.sagemath.org/ticket/6484#comment:17>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.