[jira] [Commented] (TEXT-103) Add provision to change the cost for insert, delete and replace operation in levenshtein distance
[ https://issues.apache.org/jira/browse/TEXT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065957#comment-18065957 ]

Ron Ladin commented on TEXT-103:
--------------------------------

Hi [~ggregory],
I've just submitted the PR for this issue. Since this is my first contribution, the GitHub Actions are waiting for a maintainer's approval to run.
Looking forward to your feedback!

> Add provision to change the cost for insert, delete and replace operation in levenshtein distance
> -------------------------------------------------------------------------------------------------
>
>                 Key: TEXT-103
>                 URL: https://issues.apache.org/jira/browse/TEXT-103
>             Project: Commons Text
>          Issue Type: Improvement
>            Reporter: Rohit Agarwal
>            Priority: Minor
>              Labels: newbie, patch
>             Fix For: 1.x
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> There are two implementations of Levenshtein distance, unlimitedCompare and limitedCompare.
> I propose to generalise the Levenshtein distance by adding an option to change the cost of:
> 1) Addition of Character.
> 2) Deletion of Character.
> 3) Substitution of Character.
> Currently they are all set to 1. For backward compatibility this will be the default case.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (TEXT-103) Add provision to change the cost for insert, delete and replace operation in levenshtein distance
[ https://issues.apache.org/jira/browse/TEXT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065914#comment-18065914 ]

Gary D. Gregory commented on TEXT-103:
--------------------------------------

[~rladin]
Thank you for your explanation.
[jira] [Commented] (TEXT-103) Add provision to change the cost for insert, delete and replace operation in levenshtein distance
[ https://issues.apache.org/jira/browse/TEXT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065910#comment-18065910 ]

Ron Ladin commented on TEXT-103:
--------------------------------

Hi [~ggregory],
The idea is to allow custom costs for the insert, delete, and substitute operations, which are currently hardcoded to 1. This enables Weighted Levenshtein Distance.

In practice, this is useful when some typos are more likely than others. For example, in OCR, confusing '0' with 'O' should cost less than a random change. The same goes for keyboard proximity: hitting an adjacent key is a common error that shouldn't always carry the same weight as substituting a completely different character.

The implementation is backward compatible and maintains the original O(min(n, m)) memory efficiency.
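To make the weighted distance described in the comment above concrete, here is a minimal Java sketch. It is not the code from the PR and not part of the Commons Text API: the class name WeightedLevenshteinSketch, the method name and the parameter names are all hypothetical. It assumes one fixed, non-negative cost per operation type, as in the issue description, so it does not model per-character weights such as a cheaper '0'/'O' substitution. It keeps a single row sized by the shorter input, which is one way to stay within O(min(n, m)) memory.

```java
/** Illustrative only: a hypothetical helper, not the Commons Text implementation. */
public final class WeightedLevenshteinSketch {

    private WeightedLevenshteinSketch() { }

    /**
     * Returns the cheapest cost of turning {@code left} into {@code right}
     * when each insertion, deletion and substitution has its own fixed cost.
     * Costs are assumed to be non-negative.
     */
    public static int distance(CharSequence left, CharSequence right,
                               int insertCost, int deleteCost, int substituteCost) {
        // Keep the shorter sequence along the row so the buffer is O(min(n, m)).
        // Swapping the inputs turns every insertion into a deletion and vice
        // versa, so the two costs have to be swapped as well.
        if (right.length() > left.length()) {
            return distance(right, left, deleteCost, insertCost, substituteCost);
        }

        int m = right.length();
        int[] row = new int[m + 1];

        // Turning the empty prefix of left into right[0..j) takes j insertions.
        for (int j = 0; j <= m; j++) {
            row[j] = j * insertCost;
        }

        for (int i = 1; i <= left.length(); i++) {
            int diagonal = row[0];      // value for (i - 1, 0)
            row[0] = i * deleteCost;    // left[0..i) to empty: i deletions
            char c = left.charAt(i - 1);

            for (int j = 1; j <= m; j++) {
                int above = row[j];     // value for (i - 1, j)
                int viaSubstitute = diagonal
                        + (c == right.charAt(j - 1) ? 0 : substituteCost);
                int viaDelete = above + deleteCost;
                int viaInsert = row[j - 1] + insertCost;
                row[j] = Math.min(viaSubstitute, Math.min(viaDelete, viaInsert));
                diagonal = above;
            }
        }
        return row[m];
    }
}
```

With all three costs set to 1 this reduces to the classic distance, which matches the backward-compatible default proposed in the issue. Because insertions and deletions may be priced differently, the result is in general not symmetric in its two string arguments.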
[jira] [Commented] (TEXT-103) Add provision to change the cost for insert, delete and replace operation in levenshtein distance
[ https://issues.apache.org/jira/browse/TEXT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065909#comment-18065909 ]

Gary D. Gregory commented on TEXT-103:
--------------------------------------

Hello [~rohitag2100], [~rladin]
Can you explain what this request would do and why it's needed?
[jira] [Commented] (TEXT-103) Add provision to change the cost for insert, delete and replace operation in levenshtein distance
[ https://issues.apache.org/jira/browse/TEXT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065901#comment-18065901 ]

Ron Ladin commented on TEXT-103:
--------------------------------

*Hi, I would like to pick this up and work on a patch. I will submit a PR shortly.*
[jira] [Commented] (TEXT-103) Add provision to change the cost for insert, delete and replace operation in levenshtein distance
[ https://issues.apache.org/jira/browse/TEXT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16284052#comment-16284052 ]

Rob Tompkins commented on TEXT-103:
-----------------------------------

Go for it...it's all yours. :-)
[jira] [Commented] (TEXT-103) Add provision to change the cost for insert, delete and replace operation in levenshtein distance
[ https://issues.apache.org/jira/browse/TEXT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212824#comment-16212824 ]

Pascal Schumacher commented on TEXT-103:
----------------------------------------

[~arohit] Consider this issue assigned to you anyway. Looking forward to the pull request/patch. Thanks in advance!

(Sadly it is not possible to assign issues to people who are not part of the commons developer group in jira, so it has to stay unassigned in jira.)
[jira] [Commented] (TEXT-103) Add provision to change the cost for insert, delete and replace operation in levenshtein distance
[ https://issues.apache.org/jira/browse/TEXT-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207120#comment-16207120 ]

Rohit Agarwal commented on TEXT-103:
------------------------------------

I would like to do the same, please assign this to me.
