[ 
https://issues.apache.org/jira/browse/IMPALA-8752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896106#comment-16896106
 ] 

Norbert Luksa commented on IMPALA-8752:
---------------------------------------

https://gerrit.cloudera.org/#/c/13870/

> Add Jaro-winkler edit distance and similarity built-in function
> ---------------------------------------------------------------
>
>                 Key: IMPALA-8752
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8752
>             Project: IMPALA
>          Issue Type: New Feature
>            Reporter: Norbert Luksa
>            Assignee: Norbert Luksa
>            Priority: Major
>              Labels: built-in-function
>
> References:
>  * [Apache commons - JaroWinklerDistance|https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/similarity/JaroWinklerDistance.html]
>  * [Apache commons - JaroWinklerSimilarity|https://commons.apache.org/proper/commons-text/apidocs/org/apache/commons/text/similarity/JaroWinklerSimilarity.html]
>  * [Oracle - JARO_WINKLER[_SIMILARITY]|https://oracle-base.com/articles/11g/utl_match-string-matching-in-oracle]
> Notable differences:
>  * With similarity, the Oracle version returns a normalized result ranging 
> from 0 to 100.
>  * In the Apache version, null values result in exceptions.
>  * Apache rounds the values to two digits.
> The scaling factor of the algorithm can be added as an extra/default argument.
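
For illustration only, here is a minimal Python sketch of Jaro-Winkler similarity and distance with the scaling factor exposed as an optional argument, as the description suggests. It is not the Impala implementation from the Gerrit change above. The function names, the `max_prefix` parameter, and returning None for a null input (rather than raising, as the Apache version does) are all assumptions made for this example. The sketch also omits the boost threshold that some implementations apply before adding the prefix bonus, and it does not round or rescale the result.

```python
from typing import Optional


def jaro_similarity(s1: str, s2: str) -> float:
    """Plain Jaro similarity in the range [0.0, 1.0]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are equal and no further
    # apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Walk both strings' matched characters in order; each out-of-order
    # pair contributes half a transposition.
    half_transpositions = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                half_transpositions += 1
            k += 1
    t = half_transpositions // 2

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3.0


def jaro_winkler_similarity(
    s1: Optional[str],
    s2: Optional[str],
    scaling_factor: float = 0.1,  # assumed default; commonly cited value
    max_prefix: int = 4,          # hypothetical parameter for this sketch
) -> Optional[float]:
    """Jaro similarity boosted by the length of the common prefix."""
    # Assumption: propagate null inputs instead of raising an exception.
    if s1 is None or s2 is None:
        return None
    # Larger factors could push the result above 1.0.
    if not 0.0 <= scaling_factor <= 0.25:
        raise ValueError("scaling_factor must be between 0.0 and 0.25")

    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return jaro + prefix * scaling_factor * (1.0 - jaro)


def jaro_winkler_distance(
    s1: Optional[str], s2: Optional[str], scaling_factor: float = 0.1
) -> Optional[float]:
    """Distance defined here as 1 - similarity."""
    sim = jaro_winkler_similarity(s1, s2, scaling_factor)
    return None if sim is None else 1.0 - sim
```

A result on Oracle's 0 to 100 scale could be derived from this by multiplying the similarity by 100 and rounding, though the exact rounding Oracle applies is not covered here.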



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
