Hi, I'm having some trouble with importing the text files with imdbpy2sql. I'm running Debian with python 2.6.6-8+b1, postgresql 9.0.3-1 and imdbpy 4.7.0-1.
I created a database called imdb in the usual way. Debian puts imdbpy2sql in /usr/share/doc/python-imdbpy/examples/imdbpy2sql.py.gz. I usually extract it to /tmp/imdbpy2sql.py. I ran this:

    $ /tmp/imdbpy2sql.py -d ~/imdb/lists -u postgres:///var/run/postgresql/imdb

It starts processing as normal. However, at some point in the middle of the actors, psycopg2 throws a DataError:

     * FLUSHING SQLData...
    SCANNING actor: Hartley, Jalaal
    SCANNING actor: Harwood, Anthony (II)
     * FLUSHING PersonsCache...
     * FLUSHING SQLData...
    SCANNING actor: Hatcher, Steve
    SCANNING actor: Havers, Nigel
     * FLUSHING SQLData...
    SCANNING actor: Hayden, Luke
     * FLUSHING CharactersCache...
    Traceback (most recent call last):
      File "/tmp/imdbpy2sql.py", line 2950, in <module>
        run()
      File "/tmp/imdbpy2sql.py", line 2811, in run
        castLists(_charIDsList=characters_imdbIDs)
      File "/tmp/imdbpy2sql.py", line 1575, in castLists
        doCast(f, roleid, rolename)
      File "/tmp/imdbpy2sql.py", line 1534, in doCast
        cid = CACHE_CID.addUnique(role)
      File "/tmp/imdbpy2sql.py", line 957, in addUnique
        else: return self.add(key, miscData)
      File "/tmp/imdbpy2sql.py", line 950, in add
        self[key] = c
      File "/tmp/imdbpy2sql.py", line 860, in __setitem__
        self.flush()
      File "/tmp/imdbpy2sql.py", line 883, in flush
        self._toDB(quiet)
      File "/tmp/imdbpy2sql.py", line 1185, in _toDB
        CURS.executemany(self.sqlstr, self.converter(l))
    psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0xc320

When I run /usr/share/doc/python-imdbpy/goodies/reduce.sh to get the data size down a little, the whole import works fine. So I'm guessing there are some stray characters somewhere in the text that are not being decoded properly to unicode, but I have no idea where to start fixing it.

Regards
--
Tom
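For reference, the two bytes in the error message (0xc3 followed by 0x20) cannot be valid UTF-8: 0xc3 opens a two-byte sequence, and 0x20 (a space) is not a continuation byte. Below is a minimal sketch, written for Python 3, of how one might check that and look for the same byte pair in a list file. The helper names (`is_valid_utf8`, `find_byte_sequence`, `scan_file`) and the example file name are made up for illustration and are not part of imdbpy. It is also only an assumption that the raw files contain this exact pair; the script may transform the text before it reaches PostgreSQL, so a hit would be a hint rather than a diagnosis.

```python
import gzip

# Byte pair reported by PostgreSQL in the DataError above.
BAD_BYTES = b"\xc3\x20"


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_byte_sequence(lines, needle=BAD_BYTES):
    """Yield (line_number, line) for every bytes line containing needle."""
    for number, line in enumerate(lines, 1):
        if needle in line:
            yield number, line


def scan_file(path, needle=BAD_BYTES):
    """Scan a plain or gzip-compressed file, reading it as raw bytes."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as handle:
        return list(find_byte_sequence(handle, needle))


# Hypothetical usage (the file name is an assumption):
#   for number, line in scan_file("/home/tom/imdb/lists/actors.list.gz"):
#       print(number, line)
```

Reading the files in binary mode avoids any decoding step, so the search sees exactly the bytes that are on disk.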
_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel