There have never been any issues with our PostgreSQL database; we have
always used UTF-8 and are using it this time as well.
I have tried plenty of scripts and workarounds so far, including many
decode().encode() attempts, but nothing helps; each one just produces a
different error.
I also tried adding the following lines, to be sure everything is fine with
the connection to the database:

import psycopg2
import psycopg2.extensions
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
psycopg2.extensions.register_type(psycopg2.extensions.UNICODEARRAY)

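import sys
import codecs
# Python 2 only: site.py removes setdefaultencoding at startup,
# so this call fails unless reload(sys) is run first.
sys.setdefaultencoding('utf-8')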

CURS.execute("SET NAMES 'utf8'")
CURS.execute("SET CLIENT_ENCODING TO 'utf8'")


But still nothing helps.
I tried reinstalling all the dependencies and running from clean sources,
but no luck.
I also tried running the scripts with SQLAlchemy instead of SQLObject, but I
get the same error, so the problem is not there.

I would like to ask you one thing.
Every test takes about an hour, because the error occurs while processing
the actors' cast list.
Can you please tell me the exact sequence of calls that converts a line from
the file into SQL?
Then I could write a new script that takes a small version of actors.list
containing only the problematic lines, runs a few unicode() and decode()
calls in the correct order, and tries to insert these lines into a test
table in the database. That way I could test much faster, without waiting an
hour for every try...
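
To make the idea concrete, here is a minimal sketch of the kind of test
script I have in mind. It is not IMDbPY's actual conversion code. It assumes
Python 3, assumes the list file is Latin-1 encoded, and uses an in-memory
SQLite table as a stand-in for the PostgreSQL test table; the names
non_ascii_lines, insert_into_test_table and test_names are made up for
illustration.

```python
import sqlite3


def non_ascii_lines(raw_lines, encoding="latin-1"):
    """Yield decoded lines that contain at least one non-ASCII byte.

    The Latin-1 default is an assumption about the list file.
    """
    for raw in raw_lines:
        if any(byte > 127 for byte in raw):
            yield raw.decode(encoding).rstrip("\r\n")


def insert_into_test_table(conn, lines):
    """Insert each line into a scratch table and return the row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS test_names (name TEXT)")
    conn.executemany(
        "INSERT INTO test_names (name) VALUES (?)",
        [(line,) for line in lines],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM test_names").fetchone()[0]


# Two sample byte lines standing in for a trimmed-down actors.list.
sample = [b"Smith, John\n", b"Bj\xf6rk\n"]

# SQLite in memory is only a stand-in for the real PostgreSQL test table.
conn = sqlite3.connect(":memory:")
problem_lines = list(non_ascii_lines(sample))
row_count = insert_into_test_table(conn, problem_lines)
```

With the real file, the sample list would be replaced by the lines read in
binary mode, and the SQLite connection by a psycopg2 connection to a scratch
database (psycopg2 uses %s rather than ? as the parameter placeholder).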

What I have already tried is opening the actors.list file with PHP, reading
every line, converting each string to UTF-8 with iconv, and inserting it
into the PostgreSQL database; everything worked fine. It makes me think the
problem might be somewhere in the code that cuts a line into pieces: maybe
it cuts a valid multi-byte character in two, and so an invalid byte sequence
appears. If I had the correct list of functions for Python, I could run more
tests.
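
The suspicion above can be illustrated with a small sketch (Python 3
assumed; split_bytes and is_valid_utf8 are hypothetical helpers, not
functions from IMDbPY). Cutting an already UTF-8 encoded byte string in the
middle of a two-byte character leaves both halves invalid, while cutting the
decoded text and encoding afterwards is safe.

```python
def split_bytes(data, index):
    """Cut a byte string at a byte offset, ignoring character boundaries."""
    return data[:index], data[index:]


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


name = "Bj\u00f6rk"
encoded = name.encode("utf-8")  # the o-umlaut takes two bytes: 0xC3 0xB6

# Offset 3 falls between the two bytes of the o-umlaut.
head, tail = split_bytes(encoded, 3)

# Slicing the decoded text first and encoding afterwards keeps it intact.
safe_head = name[:3].encode("utf-8")
```

If the real code slices byte strings by fixed offsets before decoding them,
this is the kind of failure I would expect to see.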

PS. I am just running a test with version 4.6, to see whether it still works
there; if it does, we could diagnose the problem more easily by looking at
the file changes.
I'll post the results.

Thank you.

On Sat, Apr 23, 2011 at 3:23 PM, Davide Alberani
<davide.alber...@gmail.com> wrote:

> On Wed, Apr 20, 2011 at 14:08, darklow <dark...@gmail.com> wrote:
> > Still no luck :/ maybe the problem is in some environmental variables or
> > settings, which on installed version are present, but running from source
> > are missing or incorrect?
>
> Seems unlikely to me.
>
> > What about this, i printed out some variables:
> > print sys.stdout.encoding -> UTF-8
> > print sys.stdin.encoding   -> UTF-8
> > print sys.getdefaultencoding(); -> ascii
> > Is it ok that  sys.getdefaultencoding(); == ascii ?
>
> These are fine.
>
> I've reproduced - to the best of my abilities - your environment:
> - no IMDbPY installed in the system.
> - IMDbPY from source (the latest version in the Mercurial repository),
>  setting the PYTHONPATH environment variable to point to the
>  source directory.
> - the cutils C module was not compiled.
> - the last actors.list.gz file.
> - postgres 8.4; my database was created with these settings:
>  CREATE DATABASE imdb
>    WITH OWNER = postgres
>       ENCODING = 'UTF8'
>       TABLESPACE = pg_default
>       LC_COLLATE = 'it_IT.utf8'
>       LC_CTYPE = 'it_IT.utf8'
>       CONNECTION LIMIT = -1;
>
> I've run it with your and other portions of the actors.list.gz file, and
> everything went fine.
>
> Now... if I were you, I'd:
> - create a virtualenv environment with:
>    virtualenv --no-site-packages
> - install in it IMDbPY, using easy_install or pip (the executable in
>  your virtualenv, I mean) so that you'll have all the correct dependencies
>  available.
> - run the imdbpy2sql.py within your virtualenv.
>
> If it still fails:
> - check your postgres settings.
> - try using SQLite (just for a test) - see notes in README.sqldb
>
>
> HTH,
> --
> Davide Alberani <davide.alber...@gmail.com>  [PGP KeyID: 0x465BFD47]
> http://www.mimante.net/
>
_______________________________________________
Imdbpy-help mailing list
Imdbpy-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-help
