#16935: Faster palindromes function for the Words library
-------------------------------------+-------------------------------------
Reporter: nadialafreniere | Owner:
Type: enhancement | Status: new
Priority: major | Milestone: sage-6.4
Component: combinatorics | Resolution:
Keywords: words, finite | Merged in:
words, palindromes | Reviewers:
Authors: Nadia Lafrenière | Work issues:
Report Upstream: N/A | Commit:
Branch: | ca9a46b527309d80a785b8d0090f66b7691e1c9d
u/nadialafreniere/16935 | Stopgaps:
Dependencies: |
-------------------------------------+-------------------------------------
Changes (by nadialafreniere):
* commit: a86cf9334a58fe016e06577e915a5ccf33cce7c7 =>
ca9a46b527309d80a785b8d0090f66b7691e1c9d
Old description:
> I've written a faster algorithm to get the set of all the distinct
> palindromic factors of a given word. There's also a new function called
> long_lpc (for long method to get the longest palindome centered at a
> given position) that is added.
>
> The principle of the algorithm is to calculate, for every letter and
> every space between two letters, the longest palindrome centered in that
> position. Once a longer palindrome is found, the information leftwise to
> the center is copied to the right because the proper subpalindromes are
> present at least twice.
>
> At the same time, there is a list created recording the length of the
> longest palindromic suffix of each prefix of the word. Since the number
> of prefixes is bounded by one plus the length of the word, the set of
> palindromes to be compared is not too big (the comparison is necessary to
> get the unicity).
>
> There is still a problem with the palindromes() function already in SAGE
> that has not been fixed with the new algorithm : For `f`-palindromes (or
> pseudo-palindromes), the word morphism `f` cannot apply to words
> generated with the WordGenerator class on the default alphabet (0,1). The
> main problem is that the type of the letters of these words are `int`
> (not words, nor any type that can be changed into words). I did not find
> any way to fix it, except to change the alphabet when generating these
> words.
>
> {{{
> sage: fib=words.FibonacciWord()[:1000]
> sage: print "type de fib : " + str(type(fib))
> type de fib : <class
> 'sage.combinat.words.word.FiniteWord_iter_with_caching'>
>
> sage: print "type de fib[5] : " + str(type(fib[5]))
> type de fib[5] : <type 'int'>
>
> sage: print "type de words.KolakoskiWord()[2] : " +
> str(type(words.KolakoskiWord()[2]))
> type de words.KolakoskiWord()[2] : <type 'int'>
>
> sage: print "type de words.ThueMorseWord()[2] : " +
> str(type(words.ThueMorseWord()[2]))
> type de words.ThueMorseWord()[2] : <type 'int'>
> }}}
New description:
I've written a faster algorithm to get the set of all the distinct
palindromic factors of a given word. There's also a new function called
long_lpc (for long method to get the longest palindome centered at a given
position) that is added.
You can see here that the algorithm is factor, especially for long words.
Here is the old version :
{{{
sage: f=WordMorphism({'a':['a','b'], 'b':['a']}).fixed_point('a')[:1000]
sage: %time M=f.palindromes()
CPU times: user 1.83 s, sys: 22.8 ms, total: 1.85 s
Wall time: 1.83 s
sage: t=WordMorphism({'a':['a','b'],
'b':['b','a']}).fixed_point('a')[:1000]
sage: %time N=t.palindromes()
CPU times: user 1.02 s, sys: 8.4 ms, total: 1.03 s
Wall time: 1.02 s
sage: tm=WordMorphism({'a':['a','b'],
'b':['b','a']}).fixed_point('a')[:5000]
sage: %time N=tm.palindromes()
CPU times: user 28.5 s, sys: 117 ms, total: 28.7 s
Wall time: 28.6 s
}}}
And the new version :
{{{
sage: f=WordMorphism({'a':['a','b'], 'b':['a']}).fixed_point('a')[:1000]
sage: %time M=f.palindromes()
CPU times: user 436 ms, sys: 20 ms, total: 456 ms
Wall time: 450 ms
sage: t=WordMorphism({'a':['a','b'],
'b':['b','a']}).fixed_point('a')[:1000]
sage: %time N=t.palindromes()
CPU times: user 362 ms, sys: 19.7 ms, total: 382 ms
Wall time: 371 ms
sage: tm=WordMorphism({'a':['a','b'],
'b':['b','a']}).fixed_point('a')[:5000]
sage: %time N=tm.palindromes()
CPU times: user 2.94 s, sys: 45.3 ms, total: 2.99 s
Wall time: 2.96 s
}}}
The principle of the algorithm is to calculate, for every letter and every
space between two letters, the longest palindrome centered in that
position. Once a longer palindrome is found, the information leftwise to
the center is copied to the right because the proper subpalindromes are
present at least twice.
At the same time, there is a list created recording the length of the
longest palindromic suffix of each prefix of the word. Since the number of
prefixes is bounded by one plus the length of the word, the set of
palindromes to be compared is not too big (the comparison is necessary to
get the unicity).
There is still a problem with the palindromes() function already in SAGE
that has not been fixed with the new algorithm : For `f`-palindromes (or
pseudo-palindromes), the word morphism `f` cannot apply to words generated
with the WordGenerator class on the default alphabet (0,1). The main
problem is that the type of the letters of these words are `int` (not
words, nor any type that can be changed into words). I did not find any
way to fix it, except to change the alphabet when generating these words.
{{{
sage: fib=words.FibonacciWord()[:1000]
sage: type(fib)
<class 'sage.combinat.words.word.FiniteWord_iter_with_caching'>
sage: type(fib[5])
<type 'int'>
sage: type(words.KolakoskiWord()[2])
<type 'int'>
sage: type(words.ThueMorseWord()[2])
<type 'int'>
}}}
--
--
Ticket URL: <http://trac.sagemath.org/ticket/16935#comment:4>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/d/optout.