On 22 Dec 2009, at 10:55am, Jean-Denis Muys wrote:

> On 12/22/09 11:31 , "Sylvain Pointeau" <sylvain.point...@gmail.com> wrote:
> 
>> Imagine that I want to query my database for a certain type of word,
>> the user could enter ü or ue and I will display the corresponding items...
>> 
>> It cannot be done in the application layer...
>> 
> 
> Or maybe it can... You could for example maintain in your application,
> possibly in a SQLite table for that purpose, all equivalence classes of all
> characters.
> 
> The obvious optimisation is to omit an equivalence class when it's a
> singleton.
> 
> So you are left with a few equivalence classes that may look like this:
> 
> ü, ue
> ä, ae, æ
> e, é, è, ê, ë
> 
> (or whatever).

But at that distance you might as well get rid of vowels entirely.  That is 
what SOUNDEX and Metaphone are for:

<http://en.wikipedia.org/wiki/Soundex>

Not only do they make up for minor spelling errors, but they also compensate 
for different character sets.  You can see some details of German equivalents 
here:

<http://www.sggee.org/misc/soundex.html>
<http://de.wikipedia.org/wiki/Kölner_Phonetik>
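To make the idea concrete, here is a minimal sketch of classic American
Soundex in Python.  It is only an illustration, not the Kölner Phonetik
variant linked above, and the function name soundex() is my own.  One
assumption to note: non-ASCII letters are simply dropped before encoding,
which works for a vowel like ü inside a word but would lose a leading Ü.

```python
def soundex(word):
    """American Soundex: first letter followed by three digits."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    # Assumption: keep only A-Z; accented letters are discarded.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with equal codes
        digit = codes.get(ch, "")  # vowels (and Y) have no code
        if digit and digit != prev:
            result += digit
        prev = digit
    return (result + "000")[:4]  # pad or truncate to four characters


print(soundex("Müller"), soundex("Mueller"), soundex("Muller"))
```

All three spellings collapse to the same code because the vowels never
reach the output.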

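As you can see, these things should definitely be done in the application 
layer.  Store both forms of your text: the original text and the encoded form.
Encode the user's search term the same way, match it against the encoded
form, and display the original.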
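A rough sketch of the two-column approach, using Python's sqlite3 module.
The table name, column names and the encode() helper are hypothetical; the
folding table just reuses the equivalence classes quoted earlier in this
thread, and any other encoder (Soundex, Metaphone, ...) could be swapped in.

```python
import sqlite3
import unicodedata

# Hypothetical folding table taken from the equivalence classes quoted above.
FOLD = {"ü": "ue", "ä": "ae", "æ": "ae",
        "é": "e", "è": "e", "ê": "e", "ë": "e"}


def encode(text):
    """Map each character to its class representative (illustrative only)."""
    text = unicodedata.normalize("NFC", text).lower()
    return "".join(FOLD.get(ch, ch) for ch in text)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (original TEXT, encoded TEXT)")
conn.execute("CREATE INDEX words_encoded ON words (encoded)")

# The application computes the encoded form once, at insert time.
for word in ("Müller", "Mueller", "Meier"):
    conn.execute("INSERT INTO words VALUES (?, ?)", (word, encode(word)))


def search(term):
    """Encode the search term the same way and compare encoded forms."""
    rows = conn.execute(
        "SELECT original FROM words WHERE encoded = ?", (encode(term),))
    return [row[0] for row in rows]


print(search("müller"))
print(search("mueller"))
```

Both searches should return the Müller and Mueller rows, since the lookup
only ever sees the encoded column and the index on it can be used.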

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
