Oracle support on Unicode branch

Malcolm Tredinnick Mon, 02 Jul 2007 01:03:37 -0700

I have committed some changes to fix the Oracle support on the Unicode
branch (in [5584]). Would appreciate any review and feedback from the
Oracle developers, since it seemed much more fiddly than I would have
liked, although much of that was because it took me ages to work out how
to manage client-side encoding.


All the tests pass (and the Unicode test suite stresses things quite
well), so that's promising. I haven't used it at all beyond that,
though.

A couple of questions I would ask somebody who knew stuff about
cx_Oracle:

(1) Is there any way to register custom type converters for column types
like we can do in every other database backend?

It felt very awkward to have to override fetchone(), fetchall() and
fetchmany() in order to convert strings to Unicode (cx_Oracle has no
builtin Unicode support, so some hackery was required).

(2) I am forcing NLS_LANG to ".UTF8" in the Oracle backend so that I
know the client encoding will be something that can tolerate arbitrary
Unicode characters (more or less; the surrogate pairs range is
probably a bit wrong).

The drawback of this approach is that it will override existing country
and territory settings and thus might mess up the expected sorting
order for results. Any better way to handle that? Looks like I have to
set the environment variable to make this work (took long enough to
discover there was no workaround for that!). I'm a bit wary of trying to
read the current value and just alter the charset portion, but would
that be better?
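
For reference, the alternative raised in (2), reading the current value and altering only the charset portion, might look like the sketch below. It assumes NLS_LANG follows the usual LANGUAGE_TERRITORY.CHARSET layout. The function name nls_lang_with_charset is hypothetical and this is not what the backend currently does.

```python
import os


def nls_lang_with_charset(current, charset="UTF8"):
    """Keep the language/territory part of NLS_LANG, replace the charset.

    Assumes the LANGUAGE_TERRITORY.CHARSET layout; anything after the
    first "." is treated as the charset and discarded.
    """
    if not current:
        # Nothing set: same result as forcing ".UTF8".
        return "." + charset
    prefix = current.split(".", 1)[0]
    return prefix + "." + charset


# Example with an explicit value rather than the real environment.
example = nls_lang_with_charset("AMERICAN_AMERICA.WE8ISO8859P1")

# Applying it to the process environment would look like this:
os.environ["NLS_LANG"] = nls_lang_with_charset(os.environ.get("NLS_LANG"))
```

This keeps any existing language and territory settings, so sorting and formatting conventions tied to them would be left alone. The assignment presumably still has to happen before the Oracle client library reads the variable.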

(3) For future reference, is there any minimum version of cx_Oracle that
is currently required by the Oracle code? It looks like there might be a
buffer overflow issue for Unicode that was only fixed in recent
releases, but I couldn't see any functionality reasons why earlier
versions couldn't be used.

(4) I changed the types of a couple of fields (Django's text and char
fields) to NCHAR and NVARCHAR2 (from CHAR and VARCHAR2, respectively).
This seemed like the most pragmatic solution to the problem that we
don't know what the database encoding will be and the Oracle docs seemed
to point to using the N-variants as something that would work in all
cases. Have I screwed up anything by doing that?

Regards,
Malcolm

-- 
Honk if you love peace and quiet. 
http://www.pointy-stick.com/blog/


