#5171: postgresql_psycopg2 backend registers psycopg2's UNICODE extension, which
can cause interference if Django is not the only component using psycopg2
---------------------------------------------+------------------------------
Reporter: Chris Wagner <[EMAIL PROTECTED]>   | Owner: adrian
Status: new                                  | Component: Database wrapper
Version: SVN                                 | Keywords:
Stage: Unreviewed                            | Has_patch: 0
---------------------------------------------+------------------------------
This one led me on a wild goose chase. I am trying to use SQLAlchemy
alongside Django.
Here's the problem: The default encoding for a psycopg2 connection is
"SQL_ASCII". And, by default, psycopg2 ''accepts'' and ''passes back''
non-Unicode strings (i.e., Python {{{str}}} objects, not {{{unicode}}}
objects). SQLAlchemy works okay using this setup, as it does conversion
between {{{unicode}}} objects and utf-8-encoded {{{str}}} objects as data
passes to and from the database.
Django, however, seems to rely on psycopg2 to do the conversions; so, it
registers psycopg2's "UNICODE" extension:
{{{
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
}}}
This is done in {{{django/db/backends/postgresql_psycopg2/base.py}}}, when
that module is loaded. Once the extension is registered, psycopg2 tries to
convert all results to {{{unicode}}} objects. If the default encoding,
"SQL_ASCII", is in use, a {{{UnicodeDecodeError}}} is raised whenever
non-ASCII text is pulled out of the database.
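The failure can be reproduced without a database. This is a minimal sketch, assuming the driver picks a Python codec from the client encoding and that "SQL_ASCII" maps to Python's {{{ascii}}} codec. The {{{CODECS}}} table and {{{decode_result}}} helper are hypothetical, and Python 3's {{{bytes}}} stands in for the Python 2 {{{str}}} discussed here.

```python
# Hypothetical mapping from PostgreSQL client encodings to Python codecs
# (an assumption for illustration, not psycopg2's actual table).
CODECS = {"SQL_ASCII": "ascii", "UTF8": "utf-8"}


def decode_result(raw, client_encoding):
    """Decode raw bytes from the server the way a Unicode typecaster might."""
    return raw.decode(CODECS[client_encoding])


# UTF-8 bytes for a string containing one non-ASCII character.
raw = "caf\u00e9".encode("utf-8")

# With the client encoding set to UTF8, decoding succeeds.
text = decode_result(raw, "UTF8")

# With SQL_ASCII, the same bytes cannot be decoded.
try:
    decode_result(raw, "SQL_ASCII")
    failed = False
except UnicodeDecodeError:
    failed = True
```

The same bytes decode cleanly or fail depending only on the encoding assumed for the connection, which is why the error appears on connections that never set one.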
However, this is okay for Django's own needs, because it also sets
the ''client encoding'' for its psycopg2 database connection:
{{{
self.connection.set_client_encoding('UTF8')
}}}
This change, however, only affects the given {{{connection}}} object,
which is local to Django. Unfortunately, SQLAlchemy does not set the
''client encoding'' for its connections.
So, by registering psycopg2's UNICODE extension, Django places a
restriction on all psycopg2 connections that wish to deal with Unicode:
every connection must call {{{set_client_encoding}}} with UTF8 (or perhaps
another Unicode encoding). This doesn't sound like a big deal, but:
* it would take some serious hack-arounds to make sure SQLAlchemy's
psycopg2 connections all use the right encoding (i.e., call
{{{connection.set_client_encoding('utf8')}}}), and
* this can lead to problems that are ''very difficult to track down''.
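One possible way to limit the side effect, not proposed in this ticket, is to register the typecaster for a single connection. psycopg2's {{{register_type}}} is documented as taking an optional scope argument (a connection or cursor). The sketch below assumes that behaviour; the {{{configure_connection}}} helper is a hypothetical name, and the code has not been checked against Django's backend.

```python
def configure_connection(connection):
    """Enable Unicode results for one psycopg2 connection only (sketch)."""
    # Imported inside the function so the sketch can be read and loaded
    # without psycopg2 installed.
    import psycopg2.extensions as ext

    # Set the client encoding on this connection, as Django already does.
    connection.set_client_encoding("UTF8")
    # Passing the connection as the scope is assumed to keep the typecaster
    # local to it, instead of registering it for every psycopg2 user.
    ext.register_type(ext.UNICODE, connection)
    return connection
```

With this arrangement, connections opened by other libraries in the same process would keep psycopg2's default behaviour.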
This "bug" led to some especially odd behavior, in my case. I was finding
that, early on in my test script, there were no problems inserting and
selecting non-ASCII text into/from the database. It took me a long time
to realize that errors only started flying after certain parts of Django
had been loaded. It took a whole lot of trial-and-error
(commenting out bits of Django, loading various modules, etc.) to get to
the bottom of things.
The only foolproof fix that I can think of is to program Django to behave
as SQLAlchemy does: it should manually convert to/from
{{{unicode}}} objects.
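A rough sketch of what manual conversion could look like, written in Python 3 terms ({{{str}}} for text, {{{bytes}}} for what Python 2 calls {{{str}}}). The helpers {{{to_db}}} and {{{from_db}}} are hypothetical names for illustration, not SQLAlchemy's or Django's actual code, and UTF-8 is an assumed encoding.

```python
def to_db(value, encoding="utf-8"):
    """Encode text to bytes before handing it to the driver."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value  # numbers, None, and bytes pass through unchanged


def from_db(value, encoding="utf-8"):
    """Decode bytes returned by the driver back to text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value  # non-bytes columns pass through unchanged


# A result row as the driver might return it with no Unicode typecaster.
row = (1, "caf\u00e9".encode("utf-8"), None)
decoded = tuple(from_db(col) for col in row)
```

Because the conversion happens in the caller's own code, it does not depend on any process-wide psycopg2 setting.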
--
Ticket URL: <http://code.djangoproject.com/ticket/5171>
Django Code <http://code.djangoproject.com/>
The web framework for perfectionists with deadlines