The spell correction functionality in MarkLogic employs the Double
Metaphone algorithm: 

http://en.wikipedia.org/wiki/Double_Metaphone

This is a more modern and more sophisticated approach to phonetic
matching than Soundex.
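
For comparison, here is a minimal Python sketch of classic American
Soundex. It is not MarkLogic code, and the function name `soundex` is
only illustrative. It assumes plain ASCII input and treats any other
letter as a vowel. It shows one well-known limitation: Soundex always
keeps the first letter, so two words that sound alike but start with
different letters get different codes.

```python
# Classic American Soundex, for illustration only (not a MarkLogic API).
CODES = {}
for digit, letters in {
    "1": "bfpv",
    "2": "cgjkqsxz",
    "3": "dt",
    "4": "l",
    "5": "mn",
    "6": "r",
}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return the 4-character Soundex code, or "" if word has no letters."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    result = first.upper()
    prev = CODES.get(first, "")  # code of the previous significant letter
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels (and y) have no code
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]  # pad with zeros, truncate to 4 characters


print(soundex("Robert"), soundex("Rupert"))    # R163 R163
print(soundex("fiziks"), soundex("physics"))   # F220 P220
```

The last line uses the "fiziks" / "physics" pair from the question
below: the digits agree, but the retained first letters do not, so a
plain Soundex comparison misses the match.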

You can load one of the sample dictionaries from the developer site,
load your own, or use the word lexicon of your database to generate a
list of terms that exist across your documents.

Kelly

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: Wednesday, May 14, 2008 3:00 PM
To: [email protected]
Subject: General Digest, Vol 47, Issue 12

Today's Topics:

   1. what is marklogic (Vikash Ranjan)
   2. Fuzzy and/or phonetic searching (Steve Mallen)


----------------------------------------------------------------------

Message: 1
Date: Wed, 14 May 2008 15:05:28 +0530
From: "Vikash Ranjan" <[EMAIL PROTECTED]>
Subject: [MarkLogic Dev General] what is marklogic
To: [email protected]
Message-ID:
        <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="iso-8859-1"

Hi, I am new to MarkLogic and want to know some more information
about it. Could anyone please let me know?

Thanks.

------------------------------

Message: 2
Date: Wed, 14 May 2008 15:53:49 +0100
From: Steve Mallen <[EMAIL PROTECTED]>
Subject: [MarkLogic Dev General] Fuzzy and/or phonetic searching
To: [email protected]
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi folks,

I've been looking through the developer docs to try to find out if I can
do fuzzy searching or any type of phonetic searching in XQuery with Mark
Logic.

Does anyone know if there are any functions to determine similarities and 
distance between strings - e.g. Soundex, Levenshtein, Metaphone?

Specifically, I'd like to be able to do Lucene-style fuzzy searches 
based on Levenshtein distance (for example, in Lucene, a search for 
"roam~" will find words like "foam" and "roams").  The spellcheck module
looks like it does something similar, but I'm not sure what the 
implementation is based on.  How does it find words from a dictionary 
that are spelt similarly to the search term?  Is there any developer 
control over this?
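
To make the "roam~" example concrete, here is a minimal Python sketch
of Levenshtein distance with a simple vocabulary filter. It is neither
Lucene nor MarkLogic code: the names `levenshtein` and `fuzzy_matches`
and the fixed `max_distance` cutoff are assumptions for illustration
(Lucene's own fuzzy queries use a similarity threshold, not this exact
rule).

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def fuzzy_matches(term, vocabulary, max_distance=1):
    """Return vocabulary words within max_distance edits of term."""
    return [w for w in vocabulary if levenshtein(term, w) <= max_distance]


print(fuzzy_matches("roam", ["foam", "roams", "physics"]))
# ['foam', 'roams']
```

Scanning the whole vocabulary like this is linear in its size, so it
only suits small word lists; it is meant to show the distance measure,
not an indexing strategy.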

I'd also like to be able to do phonetic searches, so that, for example, 
a search for "fiziks" would match "physics" since they are phonetically 
similar.  A few relational databases support Soundex searches, and 
Solr supports the use of various phonetic transcription algorithms.  I 
guess that I could create an index of phonetic transcriptions during 
content load, and do lookups based on that, but it would be good if 
there were something I could use 'out-of-the-box'.

Could anyone shed any light on this?

Many thanks,
-Steve



------------------------------

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general


End of General Digest, Vol 47, Issue 12
***************************************