[
https://issues.apache.org/jira/browse/DRILL-6519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548584#comment-16548584
]
Bridget Bevens edited comment on DRILL-6519 at 7/18/18 11:51 PM:
-----------------------------------------------------------------
Hi [~cgivre]
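I've added the content for the phonetic functions here:
https://drill.apache.org/docs/phonetic-functions/
I've added the content for the string distance functions here:
https://drill.apache.org/docs/string-distance-functions/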
I did not see the sounds_like(<string1>,<string2>) function in the changed
files of the pull request, so I did not add it to the doc. Please
let me know if this function is supported so I can add it to the doc.
I'm setting the doc label to doc-complete, but will make any changes you
suggest if you have feedback for me.
Thanks,
Bridget
was (Author: bbevens):
Hi [~cgivre]
I've added the content for the phonetic functions
[here|https://drill.apache.org/docs/phonetic-functions/].
I've added the content for the string distance functions
[here|https://drill.apache.org/docs/string-distance-functions/].
I did not see the sounds_like(<string1>,<string2>) function in the changed
files of the pull request, so I did not add it to the doc. Please
let me know if this function is supported so I can add it to the doc.
I'm setting the doc label to doc-complete, but will make any changes you
suggest if you have feedback for me.
Thanks,
Bridget
> Add String Distance and Phonetic Functions
> ------------------------------------------
>
> Key: DRILL-6519
> URL: https://issues.apache.org/jira/browse/DRILL-6519
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Major
> Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>
> This collection of functions, which comes from a recent project, makes it
> possible to do fuzzy string matching as well as phonetic matching on strings.
>
> The following phonetic functions map text to a number or string based on how
> the word sounds. For instance, "Jayme" and "Jaime" have the same soundex
> value, so these functions can be used to match similar-sounding words.
> * caverphone1(<string>)
> * caverphone2(<string>)
> * cologne_phonetic(<string>)
> * dm_soundex(<string>)
> * double_metaphone(<string>)
> * match_rating_encoder(<string>)
> * metaphone(<string>)
> * nysiis(<string>)
> * refined_soundex(<string>)
> * soundex(<string>)
> Additionally, there is the
> {code:java}
> sounds_like(<string1>,<string2>){code}
> function which can be used to find strings that sound similar. For instance:
>
> {code:sql}
> SELECT *
> FROM <data>
> WHERE sounds_like( last_name, 'Gretsky' )
> {code}
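
To make the "Jayme"/"Jaime" example above concrete, here is a minimal sketch of the classic American Soundex rules in plain Java. It is an illustration only and not Drill's implementation: the class name SoundexSketch and the method soundex are hypothetical, and the Drill function may treat edge cases (non-letters, empty input) differently.

```java
// Illustrative sketch of classic American Soundex; NOT the code Drill uses.
// The class and method names are hypothetical.
final class SoundexSketch {
    // Digit for each letter A..Z; '0' marks letters that carry no code
    // (vowels, H, W and Y).
    private static final String CODES = "01230120022455012623010202";

    static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char previous = '0';
        for (char raw : word.toCharArray()) {
            char c = Character.toUpperCase(raw);
            if (c < 'A' || c > 'Z') {
                continue; // assumption: silently skip anything that is not A-Z
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);   // the first letter is kept as-is
                previous = code; // but its code still suppresses a repeat
            } else if (c == 'H' || c == 'W') {
                continue;        // H and W do not separate equal codes
            } else {
                if (code != '0' && code != previous) {
                    out.append(code);
                }
                previous = code; // a vowel resets the "same code" check
            }
            if (out.length() == 4) {
                break;
            }
        }
        if (out.length() == 0) {
            return "";           // assumption: no letters gives an empty code
        }
        while (out.length() < 4) {
            out.append('0');     // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Jayme")); // J500
        System.out.println(soundex("Jaime")); // J500
        System.out.println(soundex("Gretsky").equals(soundex("Gretzky"))); // true
    }
}
```

Two names match phonetically under this scheme when their four-character codes are equal, which is the kind of comparison a function such as sounds_like would presumably need to make.
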
> h2. String Distance Functions
> In addition to the phonetic functions, there is a series of distance
> functions that measure the difference between two strings. The functions
> include:
> * cosine_distance(<string1>,<string2>)
> * fuzzy_score(<string1>,<string2>)
> * hamming_distance(<string1>,<string2>)
> * jaccard_distance(<string1>,<string2>)
> * jaro_distance(<string1>,<string2>)
> * levenshtein_distance(<string1>,<string2>)
> * longest_common_substring_distance(<string1>,<string2>)
>
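
As a rough illustration of what two of the distance functions listed above measure, here is a minimal Java sketch of the textbook Levenshtein and Hamming distances. The class name DistanceSketch and both method names are hypothetical, and the code is not taken from Drill; the Drill functions may differ in return type and in how they handle nulls or unequal lengths.

```java
// Illustrative sketch of two textbook string distances; NOT the code Drill uses.
// The class and method names are hypothetical.
final class DistanceSketch {
    // Levenshtein distance: the minimum number of single-character insertions,
    // deletions and substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insert, delete
                        prev[j - 1] + cost);                    // substitute or match
            }
            int[] tmp = prev; // reuse the two rows instead of a full matrix
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hamming distance: the number of positions at which two strings of equal
    // length differ. Assumption: unequal lengths are rejected.
    static int hamming(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("strings must have equal length");
        }
        int diff = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                diff++;
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // 3
        System.out.println(levenshtein("Gretsky", "Gretzky")); // 1
        System.out.println(hamming("karolin", "kathrin"));     // 3
    }
}
```

A small distance means the strings are close, so a fuzzy match can be expressed as a threshold on the result, for example treating last names within a Levenshtein distance of 1 as candidates for the same person.
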
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)