On Apr 08, Giuseppe Corbelli <[EMAIL PROTECTED]> wrote:
> The usage performance is more important, isn't it?
Absolutely! And database independence is a benefit that compensates
for even a noticeable slowdown at insert time.
Having said that, 5 or 6 hours to complete is a very long time, and
I hope it can be reduced to 2 or 3, but here the mantra is "make
it work, _then_ make it fast". :-)
I committed some changes to CVS:
- renamed the ratober.c C module to cutils.c, because it no
longer contains only ratcliff-obershelp related functions.
- implemented a C soundex function, in the cutils module.
It's similar to MySQL's SOUNDEX() function, but without
the "0" padding if the returned string is too short and with a
maximum length of 5 (one uppercase char and four digits in the
[1-6] range).
I've (more or less) analyzed the list of titles and 5 seems a good
length. I still have to look at the list of names.
Without padding, the returned string is variable in length, ranging from
1 to 5 chars; if the input string is empty, "0" (a zero as a string)
is returned.
- implemented the pure-Python equivalent of the C function (it's essentially
the same one you put in the utils module, with some improvements for speed).
At the moment it's in the imdb.parser.sql package; after all,
it will be used only by the sql package and by the imdbpy2sql script.
It's a lot slower than the C version but still usable (20 seconds to
process 1 million strings...)
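
To make the rules above concrete, here is a rough pure-Python sketch of a
soundex with the same shape: one uppercase letter plus up to four digits in
the [1-6] range, no "0" padding, and "0" for empty input. It is only an
illustration, not the code committed to CVS. The letter-to-digit table is the
classic Soundex one, and treating every uncoded letter (vowels, H, W, Y) as a
separator between repeated digits is an assumption of this sketch.

```python
# Classic Soundex letter groups; assumed here, not copied from cutils.c.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit

MAX_LEN = 5  # one uppercase char plus at most four digits


def soundex(text):
    """Return an unpadded soundex-like code of 1 to MAX_LEN chars."""
    # Keep only ASCII letters, uppercased.
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        # Empty (or letter-free, by assumption) input: a zero as a string.
        return "0"
    code = letters[0]
    last = _CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _CODES.get(c, "")
        # Append a digit only if it differs from the previous letter's digit.
        if digit and digit != last:
            code += digit
            if len(code) == MAX_LEN:
                break
        last = digit
    return code
```

With this sketch, soundex("Robert") gives "R163", while a short input such as
"Lee" gives just "L", since nothing is padded.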
<Mr. T mode>
Da Ci functia is helluva fast, but I pity the fool messin' with da
Pytha version too!
</Mr. T mode>
The correct usage (even for the imdbpy2sql script) is:
    from imdb.parser.sql import soundex
It will import the C version if available, or fall back to the pure
Python one otherwise.
/me going out to vote,
--
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/
_______________________________________________
Imdbpy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel