Re: [sage-trac] #16935: Faster palindromes function for the Words library

sage-trac Wed, 10 Sep 2014 13:36:18 -0700

#16935: Faster palindromes function for the Words library
-------------------------------------+-------------------------------------
       Reporter:  nadialafreniere    |        Owner:
           Type:  enhancement        |       Status:  new
       Priority:  major              |    Milestone:  sage-6.4
      Component:  combinatorics      |   Resolution:
       Keywords:  words, finite      |    Merged in:
  words, palindromes                 |    Reviewers:
        Authors:  Nadia Lafrenière   |  Work issues:
Report Upstream:  N/A                |       Commit:
         Branch:                     |  ca9a46b527309d80a785b8d0090f66b7691e1c9d
  u/nadialafreniere/16935            |     Stopgaps:
   Dependencies:                     |
-------------------------------------+-------------------------------------
Changes (by nadialafreniere):


 * commit:  a86cf9334a58fe016e06577e915a5ccf33cce7c7 =>
     ca9a46b527309d80a785b8d0090f66b7691e1c9d


Old description:

> I've written a faster algorithm to get the set of all the distinct
> palindromic factors of a given word. There's also a new function called
> long_lpc (for long method to get the longest palindome centered at a
> given position) that is added.
>
> The principle of the algorithm is to calculate, for every letter and
> every space between two letters, the longest palindrome centered in that
> position. Once a longer palindrome is found, the information leftwise to
> the center is copied to the right because the proper subpalindromes are
> present at least twice.
>
> At the same time, there is a list created recording the length of the
> longest palindromic suffix of each prefix of the word. Since the number
> of prefixes is bounded by one plus the length of the word, the set of
> palindromes to be compared is not too big (the comparison is necessary to
> get the unicity).
>
> There is still a problem with the palindromes() function already in SAGE
> that has not been fixed with the new algorithm : For `f`-palindromes (or
> pseudo-palindromes), the word morphism `f` cannot apply to words
> generated with the WordGenerator class on the default alphabet (0,1). The
> main problem is that the type of the letters of these words are `int`
> (not words, nor any type that can be changed into words). I did not find
> any way to fix it, except to change the alphabet when generating these
> words.
>
> {{{
> sage: fib=words.FibonacciWord()[:1000]
> sage: print "type de fib : " + str(type(fib))
> type de fib : <class
> 'sage.combinat.words.word.FiniteWord_iter_with_caching'>
>
> sage: print "type de fib[5] : " + str(type(fib[5]))
> type de fib[5] : <type 'int'>
>
> sage: print "type de words.KolakoskiWord()[2] : " +
> str(type(words.KolakoskiWord()[2]))
> type de words.KolakoskiWord()[2] : <type 'int'>
>
> sage: print "type de words.ThueMorseWord()[2] : " +
> str(type(words.ThueMorseWord()[2]))
> type de words.ThueMorseWord()[2] : <type 'int'>
> }}}

New description:

 I've written a faster algorithm to get the set of all the distinct
 palindromic factors of a given word. There's also a new function called
 long_lpc (for long method to get the longest palindome centered at a given
 position) that is added.

 You can see here that the algorithm is factor, especially for long words.

 Here is the old version :
 {{{
 sage: f=WordMorphism({'a':['a','b'], 'b':['a']}).fixed_point('a')[:1000]
 sage: %time M=f.palindromes()
 CPU times: user 1.83 s, sys: 22.8 ms, total: 1.85 s
 Wall time: 1.83 s

 sage: t=WordMorphism({'a':['a','b'],
 'b':['b','a']}).fixed_point('a')[:1000]
 sage: %time N=t.palindromes()
 CPU times: user 1.02 s, sys: 8.4 ms, total: 1.03 s
 Wall time: 1.02 s

 sage: tm=WordMorphism({'a':['a','b'],
 'b':['b','a']}).fixed_point('a')[:5000]
 sage: %time N=tm.palindromes()
 CPU times: user 28.5 s, sys: 117 ms, total: 28.7 s
 Wall time: 28.6 s
 }}}

 And the new version :
 {{{
 sage: f=WordMorphism({'a':['a','b'], 'b':['a']}).fixed_point('a')[:1000]
 sage: %time M=f.palindromes()
 CPU times: user 436 ms, sys: 20 ms, total: 456 ms
 Wall time: 450 ms

 sage: t=WordMorphism({'a':['a','b'],
 'b':['b','a']}).fixed_point('a')[:1000]
 sage: %time N=t.palindromes()
 CPU times: user 362 ms, sys: 19.7 ms, total: 382 ms
 Wall time: 371 ms

 sage: tm=WordMorphism({'a':['a','b'],
 'b':['b','a']}).fixed_point('a')[:5000]
 sage: %time N=tm.palindromes()
 CPU times: user 2.94 s, sys: 45.3 ms, total: 2.99 s
 Wall time: 2.96 s

 }}}


 The principle of the algorithm is to calculate, for every letter and every
 space between two letters, the longest palindrome centered in that
 position. Once a longer palindrome is found, the information leftwise to
 the center is copied to the right because the proper subpalindromes are
 present at least twice.

 At the same time, there is a list created recording the length of the
 longest palindromic suffix of each prefix of the word. Since the number of
 prefixes is bounded by one plus the length of the word, the set of
 palindromes to be compared is not too big (the comparison is necessary to
 get the unicity).

 There is still a problem with the palindromes() function already in SAGE
 that has not been fixed with the new algorithm : For `f`-palindromes (or
 pseudo-palindromes), the word morphism `f` cannot apply to words generated
 with the WordGenerator class on the default alphabet (0,1). The main
 problem is that the type of the letters of these words are `int` (not
 words, nor any type that can be changed into words). I did not find any
 way to fix it, except to change the alphabet when generating these words.

 {{{
 sage: fib=words.FibonacciWord()[:1000]
 sage: type(fib)
 <class 'sage.combinat.words.word.FiniteWord_iter_with_caching'>

 sage: type(fib[5])
 <type 'int'>

 sage: type(words.KolakoskiWord()[2])
 <type 'int'>

 sage: type(words.ThueMorseWord()[2])
 <type 'int'>
 }}}

--

--
Ticket URL: <http://trac.sagemath.org/ticket/16935#comment:4>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica, 
and MATLAB

-- 
You received this message because you are subscribed to the Google Groups 
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.

Re: [sage-trac] #16935: Faster palindromes function for the Words library

Reply via email to