Hello Vasily,

I'm also one of the unfortunate people who have to use encodings other than ASCII. I'm sure that your patch helps, but I'm not sure that it is the "right way".

The thing I learned from dealing with unicode and string encodings is: always use unicode. What I mean is, when you write your source:
* make all your data (variables, literals) unicode
* put the -*- coding: -*- directive at the top of the file so that the interpreter knows how to decode your u"" strings

Those two rules lead to the following:

# -*- coding: cp1251 -*-

import sqlalchemy

# note that there is no convert_unicode flag, but there is an encoding flag
db = sqlalchemy.create_engine('sqlite://', encoding='cp1251')

# note a change in type of "name" column from String to Unicode
companies = sqlalchemy.Table('companies', db,
   sqlalchemy.Column('company_id', sqlalchemy.Integer, primary_key=True),
   sqlalchemy.Column('name', sqlalchemy.Unicode(50)))

# ....

# OK, unicode
Company(name=u'Какой-то текст в кодировке cp1251')

# Avoid plain strings
Company(name='Some text in ascii')
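
The reason to avoid plain strings can be seen without SQLAlchemy at all. Here is a minimal sketch, written so it runs on a current Python where every str is already unicode. The sample text and the variable names are only mine, for illustration. The point is that a byte string does not carry its encoding with it, so whoever decodes it has to guess:

```python
# -*- coding: utf-8 -*-
# Sketch only: a byte string is ambiguous unless you know its encoding.
text = u'Текст'

# This is what a "plain" cp1251 string really holds: raw bytes.
raw = text.encode('cp1251')

# Decoding with the right codec gives the original text back.
assert raw.decode('cp1251') == text

# Decoding with the wrong codec does not fail, it silently yields garbage.
garbled = raw.decode('latin-1')

# Decoding as ascii fails outright, which is the usual error you see
# when byte strings and unicode get mixed.
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```

With unicode objects everywhere, the decoding happens exactly once, at the point where the encoding is actually known.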


This becomes a necessity if you have, for example, more than one database driver, each using a different encoding. You get back unicode strings, which you can combine and copy from one database to another without worrying.

db1 = sqlalchemy.create_engine('mysql://', encoding='latin2')
db2 = sqlalchemy.create_engine('oracle://', encoding='windows-1250')

ob1 = db1_mapper.select(...)
ob2 = db2_mapper.select(...)

ob1.name = ob1.name + ob2.name # All unicode, no problems
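
The same idea can be sketched without any database: decode at the boundary, work in unicode, and encode again only on the way out. The helpers from_db and to_db below are hypothetical stand-ins for what a driver layer would do. They are not SQLAlchemy functions, and the sample names are made up:

```python
def from_db(raw, encoding):
    """Hypothetical helper: decode bytes as a driver would hand them over."""
    return raw.decode(encoding)


def to_db(text, encoding):
    """Hypothetical helper: encode unicode text for a given database."""
    return text.encode(encoding)


# Two values stored in two databases with different encodings.
name1_raw = u'\u0141\xf3d\u017a'.encode('latin2')        # "Lodz" with Polish letters
name2_raw = u'\u017bywiec'.encode('windows-1250')        # "Zywiec" with a dotted Z

# Decode each value with the encoding of the database it came from.
name1 = from_db(name1_raw, 'latin2')
name2 = from_db(name2_raw, 'windows-1250')

# All unicode now, so combining them is safe.
combined = name1 + u' ' + name2

# Encode once, for whichever database the result is written to.
out = to_db(combined, 'windows-1250')
```

Note that the raw bytes of the first name are different in latin2 and in windows-1250, so concatenating the two byte strings directly would have corrupted one of them.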


On 4/17/06, Vasily Sulatskov <[EMAIL PROTECTED]> wrote:
Hello Michael,

I know there's a database engine parameter "encoding". It tells
sqlalchemy in which encoding Unicode objects should be saved to the
database.

I suggest adding another encoding, let's say "client_encoding", which
will be used when convert_unicode is True and the user assigns a string
object to an object attribute. Currently, even if convert_unicode is set
to True, strings go to the database as-is, bypassing conversion to
unicode.

This option will allow assigning strings in national/platform-specific
encodings, like cp1251, straight to object attributes, and they will be
properly converted to the database encoding (engine.encoding).


See, the encoding on the client machine may be different from the
encoding in the database. You can see the changes that I suggest in the
attached diff.

The suggested changes can make the life of users of
multilingual/multi-encoding environments a little easier while not
affecting all other users of SQLAlchemy.

MB> On Apr 17, 2006, at 5:47 AM, Vasily Sulatskov wrote:

>> In my opinion that's a bug and that behaviour should be changed to
>> something like this:
>> 1. If the object is unicode, then convert it to the engine-specified
>> encoding (like utf8), as happens now.
>> 2. If it's a string, then convert it to unicode using another
>> specified encoding (it should be added to the engine parameters).
>> This encoding specifies the client-side encoding. It's often handy
>> to have different encodings in the database and on client machines
>> (at least for people with "alternate languages" :-)


MB> there already is an encoding parameter for the engine.

MB> http://www.sqlalchemy.org/docs/dbengine.myt#database_options

MB> does that solve your problem ?

--
Best regards,
Vasily                            mailto:[EMAIL PROTECTED]

