On Apr 08, Giuseppe Corbelli <[EMAIL PROTECTED]> wrote:

> The usage performance is more important, isn't it?

Absolutely!  And database independence is a benefit that compensates
for even a noticeable slowdown at insert time.
Having said that, 5 or 6 hours to complete is a very long time, and
I hope it can be reduced to 2 or 3, but here the mantra is "make
it work, _then_ make it fast". :-)

I committed some changes to CVS:
- renamed the ratober.c C module to cutils.c, because it no
  longer contains only Ratcliff-Obershelp related functions.

- implemented a C soundex function, in the cutils module.
  It's similar to MySQL's SOUNDEX() function, but without
  the "0" padding when the returned string is too short, and with a
  maximum length of 5 (one uppercase char and up to four digits in the
  [1-6] range).
  I've (more or less) analyzed the list of titles and it seems a good
  length.  I still have to look at the list of names.
  Without padding, the returned string varies in length from
  1 to 5 chars; if the input string is empty, "0" (a zero as a string)
  is returned.

- implemented the pure-Python equivalent of the C function (it's
  essentially the one you put in the utils module, with some
  improvements for speed).
  At the moment it's in the imdb.parser.sql package; after all,
  it will be used only by the sql package and by the imdbpy2sql script.
  It's a lot slower than the C version but still usable (20 seconds to
  process 1 million strings...)
  <Mr. T mode>
    Da Ci functia is helluva fast, but I pity the fool messin' with da
    Pytha version too!
  </Mr. T mode>

  The correct usage (even for the imdbpy2sql script) is:
     from imdb.parser.sql import soundex

  it will import the C version if available, or fall back to the pure
  Python one otherwise.

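For reference, the soundex variant described in the list above (no "0"
padding, at most 5 chars, "0" for empty input) could look roughly like
the sketch below.  It is not the code in CVS: the letter groups are the
standard Soundex ones, and two simplifications are assumed here, namely
that non-ASCII characters are dropped and that any uncoded letter
(vowels, H, W, Y) separates repeated codes.

```python
# Standard Soundex letter groups; vowels and H, W, Y carry no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit

MAX_LEN = 5  # one uppercase letter plus up to four digits


def soundex(s):
    """Return an unpadded soundex code of 1 to 5 chars, or "0" if empty."""
    # Assumption: only ASCII letters contribute to the code.
    letters = [c for c in s.upper() if "A" <= c <= "Z"]
    if not letters:
        return "0"
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = _CODES.get(c, "")
        # Skip uncoded letters and immediate repeats of the same code.
        if digit and digit != prev:
            code += digit
            if len(code) == MAX_LEN:
                break
        prev = digit
    return code
```

With this sketch, soundex("Robert") gives "R163" (no trailing "0"),
soundex("Lee") gives "L", and soundex("") gives "0".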

/me going out to vote,
-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/


_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
