#15481: Words.__contains__ returns wrong answers
---------------------------------+-------------------------
Reporter: ncohen | Owner:
Type: defect | Status: new
Priority: major | Milestone: sage-5.13
Component: combinatorics | Resolution:
Keywords: | Merged in:
Authors: | Reviewers:
Report Upstream: N/A | Work issues:
Branch: | Commit:
Dependencies: | Stopgaps:
---------------------------------+-------------------------
Comment (by slabbe):
Replying to [ticket:15481 ncohen]:
First, I want to say that I think all these problems could be solved or
get more clean by first finishing the ticket at #12224. It is a big patch,
but I am ready to put time on it to continue the review. Vincent : what's
the status of it? Do you think we will manage to get your great
improvement in Sage anytime soon?
Now, I am answering to each of the problem mention in the ticket according
to the choices that we made some years ago. If any of the choice we made
is bad, then please tell!
> {{{
> sage: Word('12') in Words('12')
> False
The default alphabet when using just `Word('12')` is the set of Python
object, because we wanted any list to be turned into a word:
{{{
sage: Word('12').parent().alphabet()
Set of Python objects of type 'object'
}}}
Of course, by default, we could have computed the alphabet from the
letters appearing in the word, but this is not efficient when the word is
very long or when the word is infinite. So, we decided it is to the user
to specify the alphabet. Also, sometimes the word lives in a larger
alphabet, so the alphabet can not be computed from the word.
In short, the user may specify the alphabet. And this fixes the above
problem:
{{{
sage: Word('12', alphabet=['1','2']) in Words('12')
True
sage: Word('12', alphabet=['1','2', '3']) in Words('123')
True
}}}
> {{{
> sage: Words(2,5).random_element() in Words(2, finite=False)
> True
I agree, this is a bug: a finite word must not be in a set of infinite
words.
{{{
sage: Words(2,5)
Words of length 5 over {1, 2}
sage: Words(2, finite=False)
Infinite Words over {1, 2}
}}}
> {{{
> sage: words.FibonacciWord() in Words([0,1], infinite=False)
> True
I agree this is a bug: it seems the `infinite` argument and sometimes the
`finite` argument is not taken into account:
{{{
sage: Words(2,infinite=True)
Words over {1, 2}
sage: Words(2,infinite=False)
Words over {1, 2}
sage: Words(2,finite=True)
Words over {1, 2}
sage: Words(2,finite=False)
Infinite Words over {1, 2}
}}}
> {{{
> sage: Words('123')('121212') in Words(2)
> False
This is OK. The alphabet of integers and of str are considered different.
{{{
sage: Words('12')
Words over {'1', '2'}
sage: Words(2)
Words over {1, 2}
sage: Words(2) == Words('12')
False
}}}
> {{{
> sage: Words('123')('121212') in Words('1234')('121212')
> False
> sage: Words('12B')('121212') in Words('1234')('121212')
> False
I don't know if the above test what you meant. The above `in` tests
whether the left is a letter of the right:
{{{
sage: w = Word('123')
sage: w.__contains__?
...
Definition: w.__contains__(self, a)
Docstring:
Test whether "a" is a letter of "self".
...
}}}
Maybe you meant this :
{{{
sage: Words('12B')('121212') in Words('1234')
False
sage: Words('123')('121212') in Words('1234')
False
sage: Words('1234')('121212') in Words('1234')
True
}}}
Yes, this is a choice we made. Maybe it is wrong. We decided `w in W` if
and only if `W is the parent of w`. It makes the code more efficient but
also less flexible... Testing if some words is in some set of words
appears so often in the code that we decided to make it like that. I think
the `__contains__` could be changed to be more flexible because we don't
have to use the `__contains__` to test for the parent of some words.
> {{{
> sage: words.FibonacciWord() in Words([0,1,2])
> False
> }}}
Same comment as above: `w in W` if and only if `W is the parent of w`.
Should this be fixed?
{{{
sage: words.FibonacciWord() in Words([0,1])
True
sage: words.FibonacciWord() in Words([0,1,2])
False
}}}
>
> I have no idea how this can be fixed, so if a Word guy can look at
this.. `O_o`
== SUMMARY ==
So, in summary, things must be fixed according to
- the behavior of the input `finite=False`, etc. and
- the behavior of `__contains__`.
Do I miss something?
== About the behavior of contains ==
So, what should `__contains__` be?
1. It seems you would like:
{{{
#!python
def __contains__(self, word):
A = self.alphabet()
return all(letter in A for letter in word)
}}}
This is much too long computations for words we know the alphabet. Also,
it doesn't work for infinite words. But if it is what the user expect, why
not?
2. Maybe something more wise could be:
{{{
#!python
def __contains__(self, word):
A = self.alphabet()
B = word.parent().alphabet()
return B is subset of A
}}}
This will make code more efficient, but the user must provide an alphabet
to the word. Thus, this won't work (if the default alphabet of Word
remains the set of all Python objects):
{{{
sage: Word('111223') in Words('123')
}}}
Also, I remember we had this idea of using subset because of the fact that
alphabet are ordered and in some cases (I don't remember exactly which
case), testing `is subset` was not enough. But, I think we can forget
about this.
3. For reference, the actual code is something like this :
{{{
#!python
def __contains__(self, word):
A = self.alphabet()
B = word.parent().alphabet()
return B is A
}}}
So are you happy with option 2? Any other options?
--
Ticket URL: <http://trac.sagemath.org/ticket/15481#comment:2>
Sage <http://www.sagemath.org>
Sage: Creating a Viable Open Source Alternative to Magma, Maple, Mathematica,
and MATLAB
--
You received this message because you are subscribed to the Google Groups
"sage-trac" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/sage-trac.
For more options, visit https://groups.google.com/groups/opt_out.