On 16 Jun 2009, at 4:46pm, Swithun Crowe wrote:

> How about having an extra column for each column that you want to
> search in? In the extra column, have a plain lowercase ASCII version
> of the word. So, for 'São Paulo', have 'sao paulo'. You would need to
> write a small program to convert the characters. When you want to
> search for something, convert your search query into something
> without accents, and search in the extra column.


That would be a good solution, but it would require some intelligence
in the users. Instead of writing a small conversion program, you could
write your own encoding function as an SQLite extension. That is more
puzzling to start with, but far more convenient to use in the long run.

<http://www.sqlite.org/c3ref/create_function.html>
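To make the idea concrete, here is a minimal sketch using Python's sqlite3 module, whose Connection.create_function() wraps the C interface linked above. The function name unaccent, the table places and the sample rows are all made up for illustration, and the folding rule (Unicode NFKD decomposition, then dropping combining marks) is only one possible choice.

```python
import sqlite3
import unicodedata


def unaccent(text):
    """Lowercase text and drop combining marks (hypothetical helper)."""
    if text is None:
        return None
    # NFKD splits an accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


conn = sqlite3.connect(":memory:")
# Register the Python function as a one-argument SQL function named unaccent.
conn.create_function("unaccent", 1, unaccent)

# Illustrative table and data, not taken from the original thread.
conn.execute("CREATE TABLE places (name TEXT)")
conn.executemany(
    "INSERT INTO places VALUES (?)",
    [("São Paulo",), ("Zürich",), ("Oslo",)],
)

# Both the stored value and the search term are folded before comparing.
rows = conn.execute(
    "SELECT name FROM places WHERE unaccent(name) = unaccent(?)",
    ("sao paulo",),
).fetchall()
print(rows)
```

Calling the function in the WHERE clause means it runs for every row, so on a large table it may still be worth storing the folded value in an extra column and indexing that.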

Might I suggest that, if you do either of those, you look into using
soundex as part of your handling of accented characters? Even
countries that do not use accented characters find soundex encoding
very useful.

<http://en.wikipedia.org/wiki/Soundex>

Including soundex as part of your hashing function gets rid of the
'Zürich' problem: 'Zürich', 'Zuerich', and 'Zurich' all render the
same value using soundex, so searching for any one of them would
return all records that contain any of the three versions.
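As a rough check of that claim, here is a small sketch of the classic American Soundex rules in Python. It assumes accented letters are first folded to their base letters (the fold helper is hypothetical), and it is not meant to match SQLite's own implementation byte for byte.

```python
import unicodedata

# Standard Soundex digit groups; vowels, H, W and Y carry no code.
CODES = {}
for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                     ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in group:
        CODES[letter] = digit


def fold(text):
    """Strip accents and keep only ASCII letters, uppercased (assumed rule)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if ch.isascii() and ch.isalpha()).upper()


def soundex(text):
    letters = fold(text)
    if not letters:
        return ""
    result = letters[0]
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (result + "000")[:4]  # pad with zeros, keep letter plus three digits


for name in ("Zürich", "Zuerich", "Zurich"):
    print(name, soundex(name))
```

All three spellings come out as Z620 here, because soundex ignores vowels, so the extra 'e' in 'Zuerich' makes no difference once the 'ü' has been folded to 'u'.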

I do not know if the SOUNDEX() function handles accented characters in
this way. SQLite apparently supports its built-in SOUNDEX() function
only if it is compiled with the SQLITE_SOUNDEX option.
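One way to find out whether a given build has the function is simply to call it and see whether SQLite complains. The sketch below does that from Python; has_soundex is a hypothetical helper name, and the result depends entirely on how the SQLite library behind your Python was compiled.

```python
import sqlite3


def has_soundex(conn):
    """Return True if a soundex() SQL function is callable on this connection."""
    try:
        conn.execute("SELECT soundex('Zurich')").fetchone()
    except sqlite3.OperationalError:
        # Typically "no such function: soundex" when the build lacks it.
        return False
    return True


conn = sqlite3.connect(":memory:")
print(has_soundex(conn))
```

If this prints False, registering your own function under the same name with create_function() gives the same SQL interface.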

Simon.
-- 
  http://www.hearsay.demon.co.uk | I'd expect if a computer was involved
                                 | it all would have been much worse.
    No Buffy for you.            |                -- John "West" McKenna
    Leave quickly now. -- Anya   |          THE FRENCH WAS THERE
