On Mon, Apr 11, 2011 at 18:35, darklow <[email protected]> wrote:
>
>   File "./imdbpy2sql.py", line 1194, in _toDB
>     CURS.executemany(self.sqlstr, self.converter(l))
> psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0xc320
> HINT: This error can also happen if the byte sequence does not match the
> encoding expected by the server, which is controlled by "client_encoding".
Hi all,
I'm writing about the recent "0xc320" problem with IMDbPY.
The error quoted above is interesting and deserves investigation:
how can it be that 0xc320 cannot be encoded in UTF-8?
Taken as a code point, it encodes without trouble; from the Python prompt:
>>> unichr(0xc320).encode('utf8')
'\xec\x8c\xa0'
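
A note on the hex value, offered as an assumption and not checked against
the data files: PostgreSQL prints the offending input as raw bytes, so
"0xc320" may stand for the two bytes 0xc3 0x20 and not for the code point
U+C320. The sketch below contrasts the two readings. 0xc3 opens a two-byte
UTF-8 sequence, and 0x20 (a space) is not a valid continuation byte.

```python
# Reading 1: 0xc320 as a Unicode code point (U+C320), as in the prompt above.
as_code_point = u'\uc320'.encode('utf8')

# Reading 2 (assumption): 0xc320 as the two raw bytes 0xc3 0x20.
as_raw_bytes = b'\xc3\x20'


def is_valid_utf8(raw):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode('utf8')
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(as_code_point))  # True
print(is_valid_utf8(as_raw_bytes))   # False
```

Under the second reading, the data would hold a stray 0xc3 byte followed by
a space, which matches the "invalid byte sequence" wording of the error.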
Anyway, as a quick and dirty fix (the main problem is probably some
garbage in the data files), try this: after line 1181 of imdbpy2sql.py, add:
    k = k.replace('\xec\x8c\xa0', '')
so that the nearby lines become:
    try:
        k = k.replace('\xec\x8c\xa0', '')
        t = analyze_name(k)
    except IMDbParserError:
Please be aware that this fix is completely untested, but I'm
almost sure that, at that point, 'k' is a byte string encoded in UTF-8.
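
If the bad data turns out to be an invalid byte sequence and not one
specific character (an assumption, not confirmed here), a more general
variant is to drop whatever does not decode as UTF-8. A minimal sketch;
drop_invalid_utf8 is a hypothetical helper, not part of IMDbPY, and the
sample input is made up:

```python
def drop_invalid_utf8(raw):
    """Return raw with any bytes that are not valid UTF-8 removed."""
    # 'ignore' skips undecodable bytes; re-encoding yields clean UTF-8.
    return raw.decode('utf8', 'ignore').encode('utf8')


# Hypothetical sample: a stray 0xc3 lead byte followed by a space.
cleaned = drop_invalid_utf8(b'Ren\xc3 Foo')
print(cleaned == b'Ren Foo')

# Valid UTF-8 passes through unchanged.
print(drop_invalid_utf8(b'Ren\xc3\xa9') == b'Ren\xc3\xa9')
```

In the lines quoted above it would be called as k = drop_invalid_utf8(k)
in place of the replace() call. Note that it silently discards bytes, so
it hides the bad records instead of identifying them.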
Besides the "garbage theory", I have another idea about the source
of the error, but I have to verify it later...
Bye, and let me know if it works!
--
Davide Alberani <[email protected]> [PGP KeyID: 0x465BFD47]
http://www.mimante.net/
_______________________________________________
Imdbpy-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel