Hi There!
I fear I've not expressed myself clearly (Pardon! Even my Klingon is better than my English... ;-)
Don't know about your Klingon, but your English is very good. And maybe, sometimes I should read e-mails with more care ;-)
[... good explanation about SOUNDEX ...]
I believe I have fully understood it now.
SOUNDEX('GINO PINO') -> G515
SOUNDEX('GINO FINO') -> G515
# they sound similar.
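For reference, the two codes can be reproduced with a small sketch of the classic (American) Soundex rules in Python. This is only an illustration: the function name soundex() is my own, I assume that spaces and other non-letters are simply ignored, and the SOUNDEX() of a given database may differ in details (some return codes longer than four characters).

```python
# Sketch of classic American Soundex; soundex() is a hypothetical helper,
# not the function used by imdbpysql.py or by any particular database.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(text):
    """Return the 4-character Soundex code of text, or '' if it has no letters."""
    # Assumption: anything that is not an ASCII letter (spaces, digits,
    # punctuation) is dropped before encoding.
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = _CODES.get(first, "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        # Emit a digit only when it differs from the previous letter's code.
        if code and code != prev:
            digits.append(code)
        # H and W do not separate two letters with the same code; vowels do.
        if ch not in "HW":
            prev = code
    # Pad with zeros and cut to one letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


print(soundex("GINO PINO"))
print(soundex("GINO FINO"))
```

Both calls give the same code because P and F fall into the same group of consonants.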
[...]
On the other hand, I think that at _usage_ time, performance won't be greatly affected (maybe no more than 1.2x or 1.5x)
Are all SOUNDEX values in this style/syntax? I mean, if every Soundex code is one letter followed by three digits ([a-zA-Z][0-9]{3}), then we could "translate" them into integers (int or long). I could imagine that databases have a more optimized search algorithm for int32 or int64 than for short strings. So maybe we would end up closer to a factor of 1.2x? :-)
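To make the idea concrete, here is a rough sketch of such a translation in Python. It assumes that every code really is one letter plus three digits; the names soundex_to_int() and int_to_soundex() are made up for this example. Whether an integer column is actually faster to search than a short string column is something that would have to be measured on the real database.

```python
import re

# Assumption: a code is exactly one ASCII letter followed by three digits.
_SOUNDEX_RE = re.compile(r"[A-Z][0-9]{3}")


def soundex_to_int(code):
    """Pack a code like 'G515' into a small integer (hypothetical helper)."""
    code = code.upper()
    if not _SOUNDEX_RE.fullmatch(code):
        raise ValueError("not a letter+3 digits Soundex code: %r" % (code,))
    # Letter index (A=0 ... Z=25) in the thousands, the digits below it.
    return (ord(code[0]) - ord("A")) * 1000 + int(code[1:])


def int_to_soundex(value):
    """Inverse of soundex_to_int()."""
    if not 0 <= value <= 25999:
        raise ValueError("value out of range: %r" % (value,))
    letter, digits = divmod(value, 1000)
    return "%s%03d" % (chr(ord("A") + letter), digits)


print(soundex_to_int("G515"))
print(int_to_soundex(soundex_to_int("G515")))
```

With this packing the largest possible value is 25999 (for 'Z999'), so it fits easily into an int32, or even a 16-bit signed column.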
Oh, by the way we have 950.000 titles and 2.000.000 names in the database. And counting... :-)
Oops, I had no idea that there are so many :-) A time factor of 3x, 4x or 5x for imdbpysql.py scares me. If I find some spare time, I will rack my brain and hopefully come up with some good ideas :-)
Greetings
Martin
_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel