Oh, OK, my bad, I did not read enough...
Maybe just put that note at the class level? If you only read the class
description, it is not clear that there is a difference between the two concepts.
Thanks for the clarification!
Sent from my Samsung Galaxy smartphone.
Original message
From: Rob Tompkins
Date: 31.03.17 13:34 (GMT+01:00)
To: Commons Users List
Subject: Re: [text] Longest common subsequence wrong result?
Hello Sébastien,
From what I can tell, this is expected behaviour. I think it hinges on
the definition of “subsequence” differing from the definition of “substring.”
By this I mean that a subsequence is an ordered list of elements derived
by deleting some (possibly zero) elements from the original list, with the
remaining elements kept in their original order.
A substring, by contrast, is a subsequence whose characters were also
adjacent to one another in the original character list.
So, in your example, “Gandalf” and “Sauron” share the subsequence {a, n}.
But if we were to restrict ourselves to substrings, then the longest common
substring would simply be {a} (or, equally, {n}), of length 1.
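To make the distinction concrete, here is a minimal sketch in plain Java. It is
not the Commons Text implementation; the class and method names
(SubsequenceVsSubstring, lcsLength, longestCommonSubstringLength) are
hypothetical and use a textbook dynamic-programming approach purely for
illustration.

```java
public class SubsequenceVsSubstring {

    // Length of the longest common subsequence:
    // matching characters must keep their order but need not be adjacent.
    static int lcsLength(CharSequence a, CharSequence b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    // Extend the best subsequence ending before both characters.
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    // Skip one character from either input.
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.length()][b.length()];
    }

    // Length of the longest common substring:
    // matching characters must be adjacent in both inputs.
    static int longestCommonSubstringLength(CharSequence a, CharSequence b) {
        int best = 0;
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    // A run only grows if the previous characters matched too.
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                    best = Math.max(best, dp[i][j]);
                }
                // On a mismatch dp[i][j] stays 0, so the run is broken.
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("Gandalf", "Sauron"));                    // 2 ("an")
        System.out.println(longestCommonSubstringLength("Gandalf", "Sauron")); // 1 ("a" or "n")
        System.out.println(lcsLength("xxx", "yyy"));                           // 0
    }
}
```

The only difference between the two methods is what happens on a mismatch: the
subsequence version carries the best length forward, while the substring
version resets the run to zero.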
I’ve tried to spell this out in the javadoc here
(http://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/similarity/LongestCommonSubsequence.html#logestCommonSubsequence-java.lang.CharSequence-java.lang.CharSequence-),
but I suppose I should have been clearer in the documentation.
Do let me know if you think there’s a way to better present these details.
Many thanks and all the best,
-Rob
> On Mar 31, 2017, at 7:16 AM, Sébastien Piller wrote:
>
> Hi all,
> If I call
> new LongestCommonSubsequence().apply("xxx", "yyy")
> I get 0 (correct).
> If I call
> new LongestCommonSubsequence().apply("Gandalf", "Sauron")
> I get 2, which looks incorrect to me (I should have got 1, since there is no
> sequence of 2 chars common to both strings). Is it a bug or expected behavior?
> Thanks
>
> Sent from my Samsung Galaxy smartphone.