[ 
https://issues.apache.org/jira/browse/DRILL-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548584#comment-16548584
 ] 

Bridget Bevens commented on DRILL-6519:
---------------------------------------

Hi [~cgivre]

I've added the content for the phonetic functions 
[here|https://drill.apache.org/docs/phonetic-functions/].

I've added the content for the string distance functions 
[here|https://drill.apache.org/docs/string-distance-functions/].

I did not see the sounds_like(<string1>,<string2>) function in the changed 
files of the pull request, so I did not add it to the doc. Please let me 
know if this function is supported so I can add it to the doc. 

I'm setting the doc label to doc-complete, but will make any changes you 
suggest if you have feedback for me. 

Thanks,
Bridget

> Add String Distance and Phonetic Functions
> ------------------------------------------
>
>                 Key: DRILL-6519
>                 URL: https://issues.apache.org/jira/browse/DRILL-6519
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Charles Givre
>            Assignee: Charles Givre
>            Priority: Major
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.14.0
>
>
> From a recent project, this collection of functions makes it possible to do 
> fuzzy string matching as well as phonetic matching on strings. 
>  
> The following functions are all phonetic functions and map text to a number 
> or string based on how the word sounds.  For instance, "Jayme" and "Jaime" 
> have the same soundex values, and hence these functions can be used to match 
> similar-sounding words.
>  * caverphone1(<string>)
>  * caverphone2(<string>)
>  * cologne_phonetic(<string>)
>  * dm_soundex(<string>)
>  * double_metaphone(<string>)
>  * match_rating_encoder(<string>)
>  * metaphone(<string>)
>  * nysiis(<string>)
>  * refined_soundex(<string>)
>  * soundex(<string>)
> Additionally, the sounds_like(<string1>,<string2>) function can be used to 
> find strings that sound similar.  For instance:
>  
> {code:sql}
> SELECT *
> FROM <data>
> WHERE sounds_like( last_name, 'Gretsky' )
> {code}
> h2. String Distance Functions
> In addition to the phonetic functions, there is a series of distance 
> functions that measure the difference between two strings.  The functions 
> include:
>  * cosine_distance(<string1>,<string2>)
>  * fuzzy_score(<string1>,<string2>)
>  * hamming_distance(<string1>,<string2>)
>  * jaccard_distance(<string1>,<string2>)
>  * jaro_distance(<string1>,<string2>)
>  * levenshtein_distance(<string1>,<string2>)
>  * longest_common_substring_distance(<string1>,<string2>)
>  
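To make the two function families in the issue description more concrete, here is a minimal Python sketch of one phonetic encoding (classic American Soundex) and one distance measure (Levenshtein edit distance). This is not Drill's implementation, and the Drill functions of the same name may differ in edge cases. The helper names `soundex` and `levenshtein` are hypothetical and chosen only for illustration.

```python
def soundex(word: str) -> str:
    """Classic American Soundex: first letter plus three digits (sketch only)."""
    groups = (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
              ("L", "4"), ("MN", "5"), ("R", "6"))
    codes = {ch: digit for letters, digit in groups for ch in letters}

    # Keep ASCII letters only; anything else is ignored in this sketch.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = codes.get(ch, "")  # vowels and Y map to "" and reset prev
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]  # pad or truncate to four characters


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


print(soundex("Jayme"), soundex("Jaime"))  # J500 J500
print(levenshtein("Jayme", "Jaime"))       # 1
```

The phonetic code answers "do these sound alike?" with an equality check, while the distance gives a number that can be thresholded, which is why the two families are useful together for fuzzy matching.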



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
