#6484: sage.combinat.ranker improvements
-------------------------------------+-------------------------------------
       Reporter:  nthiery            |        Owner:  nthiery
           Type:  enhancement        |       Status:  needs_review
       Priority:  major              |    Milestone:  sage-6.4
      Component:  combinatorics      |   Resolution:
       Keywords:  rank, unrank       |    Merged in:
        Authors:  Nicolas M. ThiƩry  |    Reviewers:
Report Upstream:  N/A                |  Work issues:
         Branch:                     |       Commit:
  u/nthiery/sage_combinat_ranker_improvements|  
220cf7c58941afb5e775842da73e2291a800e0d9
   Dependencies:                     |     Stopgaps:
-------------------------------------+-------------------------------------

Comment (by nthiery):

 Ah, good point; I thought our cached functions had a special case for
 one parameter functions. Other than that, using a cached function was
 mostly a (failed) attempt at having something picklable as well.

 So, let's see about speed with various implementations:
 {{{
 sage: l = range(100);
 sage: d = { x:i for i,x in enumerate(l) }
 sage: @cached_function
 ....: def rank_cached(): pass
 sage: for i,x in enumerate(l):
 ....:     rank_cached.set_cache(i, x)
 sage: def rank_dict(x):
 ....:     return d[x]
 sage: rank_dict_getitem = d.__getitem__

 sage: cython("""
 l = range(100);
 d = { x:i for i,x in enumerate(l) }
 cpdef int rank_dict_cython(x):
     return d[x]
 """)
 sage: cython("""
 l = range(100);
 d = { x:i for i,x in enumerate(l) }
 cpdef int rank_dict_cython_exception(x):
     try:
         return d[x]
     except KeyError:
         raise ValueError("%s is not blah blah blah"%x)
 """)

 sage: cython("""
 cdef class RankFromList(dict):
     def __call__(self, x):
         try:
             return self[x]
         except KeyError:
             raise ValueError("%s is not blah blah blah"%x)
 """)
 sage: rank_dict_cython_class = RankFromList((x,i) for i,x in enumerate(l))
 }}}

 Here are the timings, in decreasing order:
 {{{
 sage: sage: %timeit rank_cached(50)
 The slowest run took 41.96 times longer than the fastest. This could mean
 that an intermediate result is being cached
 1000000 loops, best of 3: 381 ns per loop
 sage: sage: %timeit rank_dict(50)
 The slowest run took 15.56 times longer than the fastest. This could mean
 that an intermediate result is being cached
 1000000 loops, best of 3: 245 ns per loop
 sage: sage: %timeit rank_dict_cython_class(50)
 The slowest run took 22.12 times longer than the fastest. This could mean
 that an intermediate result is being cached
 1000000 loops, best of 3: 226 ns per loop
 sage: sage: %timeit rank_dict_cython_exception(50)
 The slowest run took 36.63 times longer than the fastest. This could mean
 that an intermediate result is being cached
 1000000 loops, best of 3: 195 ns per loop
 sage: sage: %timeit rank_dict_cython(50)
 The slowest run took 31.18 times longer than the fastest. This could mean
 that an intermediate result is being cached
 1000000 loops, best of 3: 191 ns per loop
 sage: sage: %timeit rank_dict_getitem(50)
 The slowest run took 17.96 times longer than the fastest. This could mean
 that an intermediate result is being cached
 1000000 loops, best of 3: 173 ns per loop
 }}}

 Those timing seem to be consistent from one call to the other. Most of
 the time this feature is used with non trivial objects where the
 bottleneck is the computation of hash/equality of the objects. So we
 don't necessarily need to optimize to the last bit.

 In view of the above, the fastest is to return the `__getitem__`
 method of the dictionary. The main caveat is that we don't have
 control on the exception (a `KeyError` instead of a
 `ValueError`). Maybe it is not so bad since a `KeyError` is a special
 kind of `ValueError`.

 The most flexible is to use a class. In particular, this would make
 the ranker picklable (this can be useful: I have had cases where an
 object would not pickle properly because one of it's attribute was a
 ranker). This also opens the door for reintroducing laziness later
 on. It's 30% slower than `getitem` which could be acceptable (I was in
 fact expecting a smaller overhead; maybe I missed something in the
 cythonization).

 What do you think?

 Cheers,
                          Nicolas

--
Ticket URL: <http://trac.sagemath.org/ticket/6484#comment:17>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.

Reply via email to