On Sep 01, James Rubino <[EMAIL PROTECTED]> wrote:

> I thought I had uniform character settings but my initial settings are utf8
> while the table rows eventually read latin1-bin

First of all, please be sure to use the latest IMDbPY package (3.6 or
the one from CVS); you may also want to upgrade the MySQLdb
package and SQLObject.

Cut and paste from the README.sqldb file:
=================================================================
[data truncated]
If you get an insane amount (hundreds or thousands, on various text
columns) of warnings like these lines:

  imdbpy2sql.py:727: Warning: Data truncated for column 'person_role' at row 4979
  CURS.executemany(self.sqlString, self.converter(self.values()))

you probably have a problem with the configuration of your database.
The error comes from strings that get cut at the first non-ASCII char (and
so you're losing a lot of information).
To work around this problem, you must be sure that your database
server is set up properly, with the library/client you use configured
to communicate with the server in a consistent way.
E.g., for MySQL you can set:
  character-set-server   = utf8
  default-collation      = utf8_unicode_ci
  default-character-set  = utf8

or even:
  character-set-server   = latin1
  default-collation      = latin1_bin
  default-character-set  = latin1
=================================================================
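As a quick way to verify that the settings really are consistent, here is a
minimal sketch (not part of IMDbPY) that checks the character-set variables
reported by the server.  It assumes you have already collected the output of
SHOW VARIABLES LIKE 'character_set%' into a dictionary; the function name
find_charset_mismatches and the choice of variables to compare are my own.

```python
# Hypothetical helper, not part of IMDbPY: it only compares values that
# you have already read from the MySQL server.

# Server variables that should all agree for a consistent setup.
CHARSET_VARIABLES = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
    "character_set_database",
    "character_set_server",
)


def find_charset_mismatches(variables, expected="utf8"):
    """Return {name: value} for each checked variable that differs from
    `expected`.  A variable missing from `variables` is reported as None."""
    mismatches = {}
    for name in CHARSET_VARIABLES:
        value = variables.get(name)
        if value != expected:
            mismatches[name] = value
    return mismatches


if __name__ == "__main__":
    # Example values, made up for illustration: the connection is utf8
    # but the database was created as latin1.
    sample = {
        "character_set_client": "utf8",
        "character_set_connection": "utf8",
        "character_set_results": "utf8",
        "character_set_database": "latin1",
        "character_set_server": "utf8",
    }
    print(find_charset_mismatches(sample))
```

An empty result means the checked variables all match the expected
character set; anything else points at the setting to fix.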


> SQLdb\cursors.py", line 218, in executemany
>     r = self._query('\n'.join([query[:p], ',\n'.join(q), query[e:]]))
> MemoryError

I hope the above instructions can solve the problem, even if - to
tell the truth - this MemoryError scares me a lot... :-)

Let me know if something changes.

By the way, it's possible to work on only a subset of the plain text
data files, which can be useful for debugging your problem.  From what
I see, it's related to the data in the "MinusHashFiles" set of files;
you can copy the 'alternate versions', 'goofs', 'crazy credits',
'quotes' and 'trivia' files into a separate directory, and run
imdbpy2sql.py only on these files (so that it doesn't have to handle
'movies', 'actors', 'actresses', ... first).
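
The copy step could be scripted roughly like this.  It's only a sketch: the
file names (alternate-versions.list.gz and so on) are what I'd expect the
plain text data files to be called, so adjust them to what you actually have
on disk; copy_subset is a made-up helper, not something shipped with IMDbPY.

```python
import os
import shutil

# Assumed base names of the plain text data files to isolate; check them
# against the files in your own download directory.
SUBSET = ("alternate-versions", "goofs", "crazy-credits", "quotes", "trivia")


def copy_subset(src_dir, dst_dir, basenames=SUBSET, suffix=".list.gz"):
    """Copy the selected data files from src_dir to dst_dir.

    Files that do not exist in src_dir are skipped.  Returns the list of
    file names that were actually copied."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for base in basenames:
        filename = base + suffix
        src = os.path.join(src_dir, filename)
        if os.path.isfile(src):
            shutil.copy2(src, dst_dir)
            copied.append(filename)
    return copied
```

After that, point imdbpy2sql.py at the new directory instead of the full
one, keeping the rest of your usual command line unchanged.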


-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
