[ 
https://issues.apache.org/jira/browse/COLLECTIONS-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644082#comment-13644082
 ] 

Thomas Neidhart commented on COLLECTIONS-404:
---------------------------------------------

Yes exactly.

Another thing I have found when comparing the implementation with the paper:

(SequencesComparator#getMiddleSnake: line 162 ff)
{noformat}
                // Second step
                if (delta % 2 != 0 && delta - d <= k && k <= delta + d) {
                    if (vUp[i-delta] <= vDown[i]) {
                        return buildSnake(vUp[i-delta], k + start1 - start2, 
end1, end2);
                    }
                }
{noformat}

according to the paper it should be:

{noformat}
                // Second step
                if (delta % 2 != 0 && delta - (d - 1) <= k && k <= delta + (d - 
1)) {
                    if (vUp[i-delta] <= vDown[i]) {
                        return buildSnake(vUp[i-delta], k + start1 - start2, 
end1, end2);
                    }
                }
{noformat}

can you confirm this with the author of the code?

Thanks

Thomas
                
> Adding an implementation of Eugene Myers difference algorithm
> -------------------------------------------------------------
>
>                 Key: COLLECTIONS-404
>                 URL: https://issues.apache.org/jira/browse/COLLECTIONS-404
>             Project: Commons Collections
>          Issue Type: Improvement
>          Components: Collection
>    Affects Versions: 3.2.1
>         Environment: all
>            Reporter: Luc Maisonobe
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: commons-collections-difference.patch, 
> commons-collections-difference-v2.patch, comparator.zip
>
>
> The difference algorithm aims at comparing two sequences of objects and 
> return an "edit script" which represents how one can transform the first 
> sequence into the second sequence. The script describes the various insert 
> object, delete object and keep object commands. The script is guaranteed to 
> be the shortest possible in terms of number of commands.
> From the script, one can either extract longest common sub-sequences (i.e. 
> how similar the sequences are) or on the contrary the needed changes (i.e. 
> how different the sequences are).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to