On Apr 08, Giuseppe Corbelli <[EMAIL PROTECTED]> wrote:

> The usage performance is more important, isn't it?
Absolutely! And database independence is a benefit that compensates for even a noticeable slowdown at insert time.

Having said that, 5 or 6 hours to complete is a very long time, and I hope it can be reduced to 2 or 3, but here the mantra is "make it work, _then_ make it fast". :-)

I committed some changes to the CVS:

- renamed the ratober.c C module to cutils.c, because it no longer contains only ratcliff-obershelp related functions.

- implemented a C soundex function, in the cutils module. It's similar to MySQL's SOUNDEX() function, but without the "0" padding if the returned string is too short, and with a maximum length of 5 (one uppercase char and four digits in the [1-6] range). I've (more or less) analyzed the list of titles and that seems a good length; I still have to look at the list of names. Without padding, the returned string is variable in length, ranging from 1 to 5 chars; if the input string is empty, "0" (a zero as a string) is returned.

- implemented the pure-Python equivalent of the C function (it's essentially the one you put in the utils module, with some improvements for speed). At the moment it's in the imdb.parser.sql package; after all, it will be used only by the sql package and by the imdbpy2sql script. It's a lot slower than the C version but still usable (20 seconds to process 1 million strings...)

<Mr. T mode>
Da Ci functia is helluva fast, but I pity the fool messin' with da Pytha version too!
</Mr. T mode>

The correct usage (even for the imdbpy2sql script) is:

    from imdb.parser.sql import soundex

It will import the C version if available, or fall back to the pure Python one otherwise.

/me going out to vote,
-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/
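P.S. To make the soundex rules described above concrete, here is a rough pure-Python sketch. It is not the code committed to CVS: the name soundex_sketch is made up for illustration, and it assumes the standard Soundex consonant groups, that non-letters are simply ignored, and that a digit is dropped when it repeats the previous code (vowels do not reset this). The real C and Python versions may differ in those details.

```python
SOUNDEX_LEN = 5  # one letter plus up to four digits, as described above

# Standard Soundex consonant groups (assumed); vowels, H, W and Y have no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex_sketch(s):
    """Return a variable-length soundex code (1 to 5 chars), or "0" if empty."""
    # Assumption: anything that is not an ASCII letter is ignored.
    letters = [c for c in s.upper() if "A" <= c <= "Z"]
    if not letters:
        return "0"  # empty input gives a zero as a string, with no padding
    code = letters[0]
    last = _CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _CODES.get(c, "")
        # Skip uncoded letters and digits that repeat the previous code.
        if digit and digit != last:
            code += digit
            last = digit
            if len(code) == SOUNDEX_LEN:
                break
    return code
```

With these assumptions, a short title such as "A" stays one char long instead of being padded to "A000", which is the main difference from MySQL's SOUNDEX() mentioned above.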
_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel