On Sunday 28 October 2007 01:28, Nikola Smolenski wrote:
> Would it be possible to add some user defined functions to MySQL servers?
> I'm having in mind Levenshtein distance:
> http://empyrean.lib.ndsu.nodak.edu/~nem/mysql/udf/dludf.cgi?ckey=28

Given that there was no response to this, perhaps I should be a bit less 
terse.

A user-defined function (UDF) is a function added to MySQL by an external 
library. It is used like any other MySQL function. You can read more at 
http://dev.mysql.com/doc/refman/5.0/en/adding-functions.html and 
http://dev.mysql.com/doc/refman/5.0/en/create-function.html . I can 
(hopefully) compile the library, but root access is needed to add the 
function to MySQL.

A function which I would like to have is Levenshtein distance. It counts the 
minimum number of single-character insertions, deletions and substitutions 
needed to turn one string into another, so it can tell how similar two 
strings are. This is comparable to the SOUNDEX() function, but SOUNDEX() is 
limited to English, and even for English, Levenshtein distance could perform 
better. You can read more at 
http://en.wikipedia.org/wiki/Levenshtein_distance . I see two very 
interesting applications for it: finding articles with similar titles, and 
measuring the size of a contribution. The first is obviously useful for 
locating missing redirects, duplicate articles and similar problems, and 
could later even be included in MediaWiki to assist searching. The second 
could be used to highlight significant edits (an edit which changes a lot of 
text in an article may still make only a small difference in text size), 
although this probably wouldn't be useful on the Toolserver.
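
To make the idea concrete, here is a minimal sketch of the distance itself in 
Python. It is only an illustration, not the UDF from the page linked above 
(which would have to be a compiled library), and the function name 
levenshtein is my own choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the first i-1 characters
    # of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

A small result means the two strings are close, so titles within a distance 
of one or two of each other would be candidates for a missing redirect.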

To my knowledge, there should be no performance issues (MySQL shouldn't run 
slower with UDFs installed), and I have not found reports of any security 
issues.

If someone thinks that another UDF might be useful, see a list of them at
http://empyrean.lib.ndsu.nodak.edu/~nem/mysql/udf/

_______________________________________________
Toolserver-l mailing list
[email protected]
http://lists.wikimedia.org/mailman/listinfo/toolserver-l
