Re: [Imdbpy-devel] SOUNDEX and imdbpy

Martin Kirst Fri, 07 Apr 2006 12:56:01 -0700

Hi There!


I fear I've not expressed myself clearly (Pardon!  Even my Klingon is
better than my English... ;-)

Don't know about your Klingon, but your English is very good.
And maybe, sometimes I should read e-mails with more care ;-)

[... good explanation about SOUNDEX ...]

I believe that I have fully understood it now.

  SOUNDEX('GINO PINO') -> G515
  SOUNDEX('GINO FINO') -> G515  # they sound similar.
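
To make the two results above reproducible, here is a minimal sketch of the classic four-character Soundex coding in Python. The function name `soundex` is only illustrative and is not taken from imdbpy; the SOUNDEX function built into a given database may differ in details (for example, in the length of the code it returns).

```python
def soundex(name):
    """Classic 4-character Soundex code (one letter + three digits).

    Illustrative sketch only; database SOUNDEX functions may differ.
    """
    # Map each consonant group to its Soundex digit.
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    # Keep only the letters A-Z; spaces and punctuation are ignored.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W do not separate two letters with the same code.
            continue
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        # Vowels reset prev, so a repeated code after a vowel counts again.
        prev = digit

    # Pad with zeros and cut to four characters.
    return (result + "000")[:4]


print(soundex("GINO PINO"))  # G515
print(soundex("GINO FINO"))  # G515
```

P and F fall into the same consonant group, which is why both names get the same code.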

[...]

On the other hand, I think that at _usage_ time, performance
won't be greatly affected (maybe no more than 1.2x or 1.5x)


Are all SOUNDEX values in this style/syntax?
I mean, if all SOUNDEX codes match [a-zA-Z][0-9]{3} (one letter
followed by three digits), then we could "translate" them into
integers or longs.
I could imagine that databases have a more optimized
search algorithm for int32 or int64 than for short strings.
So maybe we would end up closer to a factor of 1.2x? :-)
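
A rough sketch of that "translation", assuming every code really is one letter followed by exactly three digits. The helper names `soundex_to_int` and `int_to_soundex` are hypothetical and not part of imdbpy, and whether an integer column is actually faster to search than a short string column would still have to be measured on the real database.

```python
def soundex_to_int(code):
    """Pack a letter + three digits code (e.g. 'G515') into a small integer."""
    code = code.upper()
    if (len(code) != 4
            or not ("A" <= code[0] <= "Z")
            or not all(c in "0123456789" for c in code[1:])):
        raise ValueError("not a letter followed by three digits: %r" % code)
    # Letter index (A=0 .. Z=25) in the thousands, the digits below it.
    return (ord(code[0]) - ord("A")) * 1000 + int(code[1:])


def int_to_soundex(value):
    """Inverse of soundex_to_int."""
    letter, digits = divmod(value, 1000)
    if not 0 <= letter <= 25:
        raise ValueError("value out of range: %r" % value)
    return chr(ord("A") + letter) + "%03d" % digits


print(soundex_to_int("G515"))  # 6515
print(int_to_soundex(6515))    # G515
```

The largest possible value is 25999 (for 'Z999'), so the result fits easily into an int32, or even a 16-bit integer.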

Oh, by the way, we have 950.000 titles and 2.000.000 names in the database.
And counting... :-)

Oops - I had not realized that there are so many :-)

A time factor of 3x, 4x or 5x for imdbpysql.py scares me.
If I find some spare time, I will rack my brain and
hopefully come up with some good ideas :-)

Greetings
 Martin


