Hi,

I'm having some trouble with importing the text files with imdbpy2sql.
I'm running Debian with python 2.6.6-8+b1, postgresql 9.0.3-1 and
imdbpy 4.7.0-1.

I created a database called imdb in the usual way. Debian puts
imdbpy2sql in /usr/share/doc/python-imdbpy/examples/imdbpy2sql.py.gz.
I usually extract it to /tmp/imdbpy2sql.py. I ran this:
$ /tmp/imdbpy2sql.py -d ~/imdb/lists -u postgres:///var/run/postgresql/imdb

It starts processing as normal. However, at some point in the middle of
the actors list, psycopg2 throws a DataError:

 * FLUSHING SQLData...
SCANNING actor: Hartley, Jalaal
SCANNING actor: Harwood, Anthony (II)
 * FLUSHING PersonsCache...
 * FLUSHING SQLData...
SCANNING actor: Hatcher, Steve
SCANNING actor: Havers, Nigel
 * FLUSHING SQLData...
SCANNING actor: Hayden, Luke
 * FLUSHING CharactersCache...
Traceback (most recent call last):
  File "/tmp/imdbpy2sql.py", line 2950, in <module>
    run()
  File "/tmp/imdbpy2sql.py", line 2811, in run
    castLists(_charIDsList=characters_imdbIDs)
  File "/tmp/imdbpy2sql.py", line 1575, in castLists
    doCast(f, roleid, rolename)
  File "/tmp/imdbpy2sql.py", line 1534, in doCast
    cid = CACHE_CID.addUnique(role)
  File "/tmp/imdbpy2sql.py", line 957, in addUnique
    else: return self.add(key, miscData)
  File "/tmp/imdbpy2sql.py", line 950, in add
    self[key] = c
  File "/tmp/imdbpy2sql.py", line 860, in __setitem__
    self.flush()
  File "/tmp/imdbpy2sql.py", line 883, in flush
    self._toDB(quiet)
  File "/tmp/imdbpy2sql.py", line 1185, in _toDB
    CURS.executemany(self.sqlstr, self.converter(l))
psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0xc320

When I first run /usr/share/doc/python-imdbpy/goodies/reduce.sh to get
the data size down a little, the whole import works fine. So I'm guessing
there are some stray characters somewhere in the text that are not
being decoded properly to unicode, but I have no idea where to start
looking in order to fix it.
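
For what it's worth, 0xc3 is the lead byte of a two-byte UTF-8 sequence
and 0x20 (a space) is not a valid continuation byte, so the error looks
like a non-UTF-8 byte followed by a space. Below is a rough sketch of how
the offending lines might be located. It is only an illustration: the
function name find_invalid_utf8 is made up for this example, and it assumes
the data can be checked line by line as raw bytes, which is not necessarily
how imdbpy2sql itself handles the text.

```python
def find_invalid_utf8(lines, encoding="utf-8"):
    """Return (line_number, byte_offset, raw_line) for lines that do not decode.

    `lines` is any iterable of byte strings, for example a file object
    opened in binary mode. This helper is hypothetical and not part of
    IMDbPY or psycopg2.
    """
    bad = []
    for lineno, line in enumerate(lines, 1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte.
            bad.append((lineno, exc.start, line.rstrip(b"\r\n")))
    return bad


if __name__ == "__main__":
    # Made-up sample data: the second line contains 0xc3 followed by a space.
    sample = [b"Hayden, Luke\n", b"bad \xc3 x\n"]
    for lineno, offset, raw in find_invalid_utf8(sample):
        print("line %d, offset %d: %r" % (lineno, offset, raw))
```

To check a real list file, the same function could be passed a file
opened in binary mode, or a gzip.open(...) object for a compressed list.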

Regards
--
Tom

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
