#8187: improve equality tests for words
-----------------------------+----------------------------------------------
   Reporter:  slabbe         |       Owner:  sage-combinat
       Type:  enhancement    |      Status:  new          
   Priority:  major          |   Milestone:  sage-4.3.2   
  Component:  combinatorics  |    Keywords:               
     Author:                 |    Upstream:  N/A          
   Reviewer:                 |      Merged:               
Work_issues:                 |  
-----------------------------+----------------------------------------------
 More often, when we compare two words, we test their equality and not that
 one is less than the other. So why not implement the equatlity test!? This
 ticket does this for datatypes and for math objects. It also improve the
 comparison function (by removing it!).

 1. This patch adds equality test for `WordDatatype_str`,
 `WordDatatype_list` and `WordDatatype_tuple`  (via `__richcmp__`) usefull
 when comparing two words represented by the same kind of data. It is now
 as fast as the python object :

 {{{
 BEFORE:

     sage: w = Word(range(10000))
     sage: z = Word(range(10000))
     sage: %timeit w == z
     125 loops, best of 3: 4.08 ms per loop

 AFTER:

     sage: w = Word(range(10000))
     sage: z = Word(range(10000))
     sage: %timeit w == z
     625 loops, best of 3: 422 µs per loop

 PYTHON OBJECT:

     sage: w = range(10000)
     sage: z = range(10000)
     sage: %timeit w == z
     625 loops, best of 3: 442 µs per loop
 }}}

 {{{
 BEFORE:

     sage: w = Word(tuple(range(10000)))
     sage: z = Word(tuple(range(10000)))
     sage: %timeit w == z
     125 loops, best of 3: 3.97 ms per loop

 AFTER:

     sage: w = Word(tuple(range(10000)))
     sage: z = Word(tuple(range(10000)))
     sage: %timeit w == z
     625 loops, best of 3: 419 µs per loop

 PYTHON OBJECT:

     sage: w = tuple(range(10000))
     sage: z = tuple(range(10000))
     sage: %timeit w == z
     625 loops, best of 3: 420 µs per loop
 }}}

 {{{
 BEFORE:

     sage: w = Word('a'*10000)
     sage: z = Word('a'*10000)
     sage: %timeit w == z
     125 loops, best of 3: 3.9 ms per loop

 AFTER:

     sage: w = Word('a'*10000)
     sage: z = Word('a'*10000)
     sage: %timeit w == z
     625 loops, best of 3: 2.36 µs per loop

 PYTHON OBJECT:

     sage: w = 'a'*10000
     sage: z = 'a'*10000
     sage: %timeit w == z
     625 loops, best of 3: 2.03 µs per loop

 }}}

 2. Remove the `__cmp__` from `FiniteWord_class` since the same function in
 `Word_class` does the job anyway and in a cleaner way : it doesn't use the
 (useless?) coerce function. Surprinsingly, removing it makes it faster :

 {{{
 BEFORE:

     sage: w = Word([0]*10000)
     sage: z = Word([0]*10000, alphabet=[0,1])
     sage: type(w)
     <class 'sage.combinat.words.word.FiniteWord_list'>
     sage: type(z)
     <class 'sage.combinat.words.word.FiniteWord_list'>
     sage: %timeit w.__cmp__(w)
     125 loops, best of 3: 3.79 ms per loop
     sage: %timeit w.__cmp__(z)
     25 loops, best of 3: 13.3 ms per loop
     sage: %timeit z.__cmp__(w)
     5 loops, best of 3: 50.1 ms per loop
     sage: %timeit z.__cmp__(z)
     25 loops, best of 3: 35.7 ms per loop


 AFTER:

     sage: w = Word([0]*10000)
     sage: z = Word([0]*10000, alphabet=[0,1])
     sage: type(w)
     <class 'sage.combinat.words.word.FiniteWord_list'>
     sage: type(z)
     <class 'sage.combinat.words.word.FiniteWord_list'>
     sage: %timeit w.__cmp__(w)
     125 loops, best of 3: 3.89 ms per loop
     sage: %timeit w.__cmp__(z)
     125 loops, best of 3: 5.4 ms per loop
     sage: %timeit z.__cmp__(w)
     25 loops, best of 3: 35.9 ms per loop
     sage: %timeit z.__cmp__(z)
     25 loops, best of 3: 35.7 ms per loop
 }}}

 NOTE : The difference between w and z above is that the parent of w is the
 alphabet of all python objects which uses the cmp of python to compare the
 letters whereas z compares its letters relatively to the order of the
 letters defined by the parent (here 0 < 1 but one could also say 1 < 0)
 which is slower.


 3. Add the `__eq__` and `__ne__` for `Word_class` because it is faster
 than using `__cmp__` (especially when parent alphabet are defined):

 no parents :

 {{{
 BEFORE:

     sage: L = range(10000)
     sage: t = tuple(L)
     sage: w = Word(L)
     sage: z = Word(t)
     sage: type(w)
     <class 'sage.combinat.words.word.FiniteWord_list'>
     sage: type(z)
     <class 'sage.combinat.words.word.FiniteWord_tuple'>
     sage: %timeit w == z
     125 loops, best of 3: 3.69 ms per loop


 AFTER:

     sage: L = range(10000)
     sage: t = tuple(L)
     sage: w = Word(L)
     sage: z = Word(t)
     sage: type(w)
     <class 'sage.combinat.words.word.FiniteWord_list'>
     sage: type(z)
     <class 'sage.combinat.words.word.FiniteWord_tuple'>
     sage: %timeit w == z
     625 loops, best of 3: 1.44 ms per loop
 }}}

 with parents (!!):

 {{{
 BEFORE:

     sage: W = Words([0,1,2])
     sage: w = W([0, 1, 1, 2]*4000)
     sage: z = W([0, 1, 1, 2]*4000)
     sage: %timeit w == z
     5 loops, best of 3: 63 ms per loop

 AFTER:

     sage: W = Words([0,1,2])
     sage: w = W([0, 1, 1, 2]*4000)
     sage: z = W([0, 1, 1, 2]*4000)
     sage: %timeit w == z
     125 loops, best of 3: 2.57 ms per loop
 }}}

-- 
Ticket URL: <http://trac.sagemath.org/sage_trac/ticket/8187>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/sage-trac?hl=en.

Reply via email to