On Apr 07, Martin Kirst <[EMAIL PROTECTED]> wrote:

> >On the other side, I think that at _usage_ time, performances
> >won't be greatly affected (maybe no more than 1.2x or 1.5x)
> 
> Are all SOUNDEX - values in this style/syntax?

Yes, they are.  Some algorithms output an uppercase letter
followed by 5 digits; others produce codes of variable length,
depending on the length of the input string.  Among the implementations
I've looked at, the majority use a letter and 3 digits (fixed length).

I plan to do some analysis on the database to find the ideal
length, but I think a capital letter and 3 or 4 digits will be
enough.
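
To make the "capital letter plus N digits" format concrete, here is a
minimal sketch of the classic American Soundex rules with a configurable
number of digits.  The function name and the `digits` parameter are
assumptions made for illustration; this is not the code used in IMDbPY.

```python
# Classic American Soundex digit groups (letters not listed get no digit).
_CODES = {}
for _digit, _letters in enumerate(
        ("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for _ch in _letters:
        _CODES[_ch] = str(_digit)


def soundex(name, digits=3):
    """Return an uppercase letter followed by exactly `digits` digits.

    Hypothetical helper: `digits` is the knob discussed above (3 or 4).
    Returns an empty string if the input contains no ASCII letters.
    """
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    out = []
    prev = _CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            # H and W are skipped and do not separate equal codes.
            continue
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        # Vowels reset `prev`, so a repeated code after a vowel is kept.
        prev = code
    # Truncate or zero-pad to the requested fixed length.
    return first + "".join(out)[:digits].ljust(digits, "0")
```

With `digits=3` this yields the usual fixed-length form (a letter and 3
digits); raising `digits` to 4 only changes the truncation and padding step.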

> I mean, if all soundexs are [a-z|A-Z][0-9]*3 than we could
> "translate" them into integers or long.
> I could imagine, that databases do have a more optimized
> search algorithm for int32 or int64 than for short strings.

That's probably true; my bet is that the difference will be
measurable, but not noticeable by the user (e.g.: using integers
may be 2 times faster, but that only halves an already very, very
short amount of time [1]).
Anyway... talking about performance, opinions are worthless: only
a benchmark can speak. :-)
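
For reference, one possible way to "translate" a fixed-length code into an
integer is sketched below.  The function names and the packing scheme
(letter index times a power of ten, plus the digits) are assumptions chosen
for illustration; other encodings would work just as well.

```python
def soundex_to_int(code):
    """Pack a code such as 'R163' into a small non-negative integer.

    Hypothetical scheme: (letter index, A=0) * 10**ndigits + digits.
    """
    letter, digits = code[:1].upper(), code[1:]
    if not ("A" <= letter <= "Z"):
        raise ValueError("code must start with an ASCII letter: %r" % code)
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("code must end with decimal digits: %r" % code)
    return (ord(letter) - ord("A")) * 10 ** len(digits) + int(digits)


def int_to_soundex(value, digits=3):
    """Inverse of soundex_to_int for codes with `digits` digits."""
    letter_index, number = divmod(value, 10 ** digits)
    if not (0 <= letter_index < 26):
        raise ValueError("value out of range: %r" % value)
    return chr(ord("A") + letter_index) + str(number).zfill(digits)
```

Even with 5 digits the largest value is 2599999, so the result fits
comfortably in a 32-bit integer column.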

> A time factor of 3x 4x or 5x for imdbpysql.py is scaring me.

Me too, me too.
Maybe Giuseppe was right, and some speedup can be achieved
by using the SQLObject cache and removing my implementations of
MoviesCache and PersonsCache.

So far the priority is to obtain a working script; after that,
fine tuning and profiling [2] can start.

> If I found some spare time, I will squeeze out my brain, for
> hopefully getting some good ideas :-)

Any hint is welcome!


+++
[1] conversion to integers will require a bit more CPU time at
    insert time, and we have to be sure not to trade a noticeable
    slowdown of the imdbpy2sql.py script for a minor
    improvement at SELECT time.
[2] I have already profiled previous releases of the script... it
    was a veeery long wait, for the results. :-)
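
The trade-off in note [1] can be measured rather than guessed.  Below is a
rough benchmark sketch that compares insert and SELECT times for a text
column and an integer column.  It uses an in-memory SQLite database and
randomly generated codes purely as stand-ins; the table names, row counts
and the `to_int` packing are all assumptions, and the numbers it prints say
nothing definitive about other database engines or the real data set.

```python
import random
import sqlite3
import timeit


def to_int(code):
    # Hypothetical packing: letter index (A=0) * 1000 + the 3 digits.
    return (ord(code[0]) - ord("A")) * 1000 + int(code[1:])


def build(rows=20000, seed=0):
    """Create two empty indexed tables and a list of random codes."""
    rng = random.Random(seed)
    codes = [chr(rng.randrange(65, 91)) + "%03d" % rng.randrange(1000)
             for _ in range(rows)]
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE by_text (code TEXT)")
    db.execute("CREATE TABLE by_int (code INTEGER)")
    db.execute("CREATE INDEX idx_text ON by_text (code)")
    db.execute("CREATE INDEX idx_int ON by_int (code)")
    return db, codes


def insert_text(db, codes):
    db.executemany("INSERT INTO by_text VALUES (?)", ((c,) for c in codes))


def insert_int(db, codes):
    # The conversion cost is deliberately included in the insert timing.
    db.executemany("INSERT INTO by_int VALUES (?)",
                   ((to_int(c),) for c in codes))


def count_text(db, code):
    sql = "SELECT COUNT(*) FROM by_text WHERE code = ?"
    return db.execute(sql, (code,)).fetchone()[0]


def count_int(db, code):
    sql = "SELECT COUNT(*) FROM by_int WHERE code = ?"
    return db.execute(sql, (to_int(code),)).fetchone()[0]


def run(rows=20000, probes=200, repeat=5):
    """Return a dict of elapsed seconds for each of the four operations."""
    db, codes = build(rows)
    sample = codes[:probes]
    timings = {
        "insert_text": timeit.timeit(lambda: insert_text(db, codes), number=1),
        "insert_int": timeit.timeit(lambda: insert_int(db, codes), number=1),
        "select_text": timeit.timeit(
            lambda: [count_text(db, c) for c in sample], number=repeat),
        "select_int": timeit.timeit(
            lambda: [count_int(db, c) for c in sample], number=repeat),
    }
    db.close()
    return timings


if __name__ == "__main__":
    for name, seconds in sorted(run().items()):
        print("%-12s %.4f s" % (name, seconds))
```

Running it against the real database and the real script would be the
meaningful test; this only shows the shape such a comparison could take.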

-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/


_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
